weavejester / compojure

A concise routing library for Ring/Clojure
Eclipse Public License 1.0
4.09k stars 256 forks source link

Middleware at one point in the route tree can break middleware later in the route tree by reading the request body #165

Closed glittershark closed 7 years ago

glittershark commented 7 years ago

So just to start out, I'm not sure if this is

Essentially the gist of the issue is that while pretty much all of the Ring request structure is immutable - making it safe for handlers in various parts of a compojure.core/routes stack to do whatever they want with/to a request - that immutability doesn't apply to the InputStream of the body; the seek position in the stream is stored as mutable state inside the request. This means that a badly-behaved (or even innocuous) middleware outside a route matcher can slurp up the request body, setting the position of the input stream to the end, making it appear to all successive handlers in the route stack as if the request body was empty. I just spent the last couple hours or so hammering against a bug that turned out to have been caused by this.

I'm curious if compojure.core/routes can work some magic on the body such that every handler argument gets its own, immutable seek position -- it would certainly have saved me some headaches.

glittershark commented 7 years ago

If you'd like I'd be willing to put together a patch for what I'm suggesting.

weavejester commented 7 years ago

This isn't a bug, but rather a compromise between immutability and the reality of writing efficient network code. Requests can potentially be very large, so we don't necessarily want to consume them immediately. However, this also means that reading them is both side-effectful and destructive.

Any solution to this problem has trade-offs, so in general you just need to be careful and bear in mind that reading the InputStream is a destructive operation.

glittershark commented 7 years ago

I understand the impetus for not consuming the whole request body right away - but would it be possible to store the request body once something consumes it, then provide a cursor over that body to other parts of the route tree?

There'd be a memory tradeoff, so maybe it should be opt-in.

weavejester commented 7 years ago

It's certainly possible to store the body, but that opens up a new set of questions.

For instance, are you going to store the request body in memory, or on disk? Do you want to store small request bodies in memory, and large request bodies on disk? What's the maximum size that you'll allow before dropping the request?

Once you have the body stored, how are you going to represent it? As an InputStream that can be reset? That would be backward compatible, but would require middleware and routes to call .markSupported and .reset on the stream before consumption.

These are not insurmountable problems, but they do increase the complexity of the system. Rather than a developer asking the question, "Is the stream consumed or not?", they have potentially more questions to ask, like: "Is the cache full?" and "Is this middleware resetting the stream?"

Most of the time storing the request body is also unnecessary, as middleware that consumes the body tends to be at the top level. If the client sends us JSON, for example, we can detect it and then parse it into a data structure before our routes access it.

However, it's certainly possible to write some middleware that does store the request body, and in certain situations this could be very useful. However, it's also something that's unrelated to Compojure, which is just a routing library. What you're talking about is probably best solved through third-party Ring middleware.