twitter / finagle

A fault tolerant, protocol-agnostic RPC system
https://twitter.github.io/finagle
Apache License 2.0

Streaming requests with a fixed content-length are received in one large chunk #538

Closed: rklancer closed this issue 5 years ago

rklancer commented 8 years ago

I'm trying to stream requests for certain routes in a Finatra server, in order to pipeline the contents to a downstream http service. But the streaming option has no practical effect because Finagle constructs a Netty HttpMessageDecoder that accumulates up to maxRequestSize bytes into a buffer before firing any events.

Expected behavior

In a Finatra route callback, with Finagle's streaming option configured (by specifying Finatra's streamRequest option), I expect to be able to obtain a com.twitter.io.Reader that fills its buffer with bytes as they are received.
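Roughly, this is the shape of what I expected to work (a sketch; the route path and names are mine, and the Scala may be imperfect):

```scala
import com.twitter.finagle.http.Request
import com.twitter.finatra.http.Controller
import com.twitter.util.Future

class UploadController extends Controller {
  post("/upload") { request: Request =>
    // Pull bytes off the Reader as they arrive; with streaming working as
    // expected, each read should complete as soon as a chunk is received,
    // not after the whole body has been buffered.
    def drain(total: Long): Future[Long] =
      request.reader.read(Int.MaxValue).flatMap {
        case Some(buf) => drain(total + buf.length)
        case None      => Future.value(total)
      }

    drain(0L).map(n => response.ok(s"received $n bytes"))
  }
}
```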

Actual behavior

When the Finatra route callback is called, the entire contents of the request have always been read into the reader beforehand. A debugger session shows that Netty's HttpMessageDecoder does receive bytes asynchronously, but accumulates them into a buffer rather than immediately passing them up the pipeline.

Steps to Reproduce

I am developing a Finatra server (2.1.6, Finagle version 6.35.0). I set streaming to true by configuring the streamRequest option in the Finatra server. One particular route callback in the Finatra server is supposed to accept POSTed files from our client app and stream them to a downstream service, preferably without buffering the entire file in memory first.
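The server setup looks something like this (a sketch, wiring in the controller sketched above; class names are mine):

```scala
import com.twitter.finatra.http.HttpServer
import com.twitter.finatra.http.routing.HttpRouter

class UploadServer extends HttpServer {
  // Finatra's streamRequest option maps to Finagle's `streaming` codec setting.
  override val streamRequest: Boolean = true

  override def configureHttp(router: HttpRouter): Unit = {
    router.add[UploadController]
  }
}
```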

I can reproduce by uploading a largish (<5MB) file via curl to the Finatra route mentioned above. (So, a fixed-length request, like our client app would send, with a content-length header and no transfer-encoding: chunked header.) The buffer is always filled before the route callback is invoked.

Discussion

As far as I can tell, this is because when com.twitter.finagle.http.Http constructs a Netty pipeline, it passes maxRequestSize as the maxChunkSize argument of SafeHttpServerCodec. This amounts to inserting a Netty HttpMessageDecoder that does not detect a frame and emit a messageReceived event until the entire message has been received.

See https://github.com/twitter/finagle/blob/3bd9661da354a30bbf7cf22a10bd854759beddd1/finagle-http/src/main/scala/com/twitter/finagle/http/Codec.scala#L291
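Reconstructed for illustration (plain Netty 3 types standing in for Finagle's SafeHttpServerCodec wrapper; the initial-line and header sizes are made up):

```scala
import org.jboss.netty.channel.ChannelPipeline
import org.jboss.netty.handler.codec.http.{HttpChunkAggregator, HttpServerCodec}

def configurePipeline(pipeline: ChannelPipeline, maxRequestSize: Int, streaming: Boolean): Unit = {
  // maxRequestSize is reused as maxChunkSize (the third argument), so the
  // decoder buffers an entire fixed-length body before emitting any message.
  pipeline.addLast("httpCodec", new HttpServerCodec(4096, 8192, maxRequestSize))
  if (!streaming) {
    // Reassembles HttpChunks into a complete message; fixed-length bodies
    // never hit this path, since the decoder emits them already whole.
    pipeline.addLast("httpDechunker", new HttpChunkAggregator(maxRequestSize))
  }
}
```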

Is it possible to decouple the chunk size and request size concepts? It seems the HttpChunkAggregator, which is added to the pipeline when streaming is false, would handle aggregating smaller chunks into the completed message for the non-streaming case.

Note: I'm mostly using Finagle via Finatra, and then at one remove via a Java wrapper library created at my company. I'm learning to read Scala, but not prepared to write any yet.

mosesn commented 8 years ago

@rklancer oh, interesting. so what you're wondering is if we could write bytes into the Reader as they're received, rather than buffer them? I think the long and short of it is that we use netty as our underlying http implementation, and netty takes http messages with a content-length header and fully buffers them, so it's simply not possible while using the common netty types. Part of the reason why we use netty's implementation is because it's battle-hardened, so we're pretty confident in it. We could consider reimplementing http in finagle, but it's an enormous undertaking, and seems to have limited benefits.

HttpChunkAggregator is a no-op in the non-streaming case because it only acts on chunked http.

In general, I think content-length is intended to be used for fully-buffered http requests, and chunked transfer encoding is intended to be used for streaming, and that's how finagle uses them. So I'm not sure there's a ton we can do on our end.

rklancer commented 8 years ago

Thanks for the response @mosesn!

So, I'm sold on Netty's battle-hardened qualities and I don't expect Finagle to try to work around anything it doesn't support.

But ... I guess I'm not convinced that Netty 3 actually requires buffering the whole http message. If you specify a maxChunkSize in the HttpMessageDecoder that is smaller than the Content-Length, it emits an HttpMessage followed by HttpChunks, just as it does for chunked transfer encoding (look for the READ_FIXED_LENGTH_CONTENT_AS_CHUNKS state).
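Concretely (Netty 3; the sizes are arbitrary):

```scala
import org.jboss.netty.handler.codec.http.HttpRequestDecoder

// With maxChunkSize (the third argument) smaller than the Content-Length,
// the decoder takes the READ_FIXED_LENGTH_CONTENT_AS_CHUNKS path and emits
// an HttpMessage followed by HttpChunks, rather than one buffered message.
val decoder = new HttpRequestDecoder(4096, 8192, 8192)
```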

I assume that Finagle is fundamentally capable of accepting this sequence of Netty events (HttpMessage + HttpChunks) and incrementally filling the Reader's underlying buffer, or am I mistaken that this is what it does when it serves a chunked-transfer-encoding request with the streaming option enabled?

I'd say here that it looks like a simple matter of exposing an option to set a chunk size independently of the request size limit. But I find that, if I manually set the chunk size to less than the uploaded file size (in a debugger), the following hangs:

Readers.readAll(request.reader()).get();  // pardon the Java

Taking a closer look, I find that the Netty pipeline emits the HttpMessage and several chunks, but gets closed before the final chunk is emitted. Thus, presumably, the hang. I'm not sure why that would be, but I'm in danger of going down a rathole.

My bigger question is, should I continue going down this route? If it turns out to be a simple matter of tweaking the pipeline setup, would Finagle be interested in supporting a use case where streaming + a modest chunk size is used in order to allow proxying of large, but fixed-size, uploads?

Of course, I know a Netty 4 transition is in the works, so we can't get too deep here.

My specific use case is a Finatra server, as mentioned, which on the one hand interacts with other Finagle services, and on the other exposes an HTTP API to our mobile apps. They would send multipart requests with a small application/json part and a potentially much larger "attachment" part, which we really just want to forward downstream. So we do have to process the client's request in app code, but buffering the whole client request before sending it downstream seems like it could become a scaling pain point, especially since these are mobile clients, so we expect the downstream service to drain bytes much faster than clients can send them.
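In sketch form, what we'd like to be able to write (hypothetical names and addresses; assumes streaming is enabled on both the server and the downstream client):

```scala
import com.twitter.finagle.Http
import com.twitter.finagle.http.{Method, Request, Response, Version}
import com.twitter.util.Future

object AttachmentForwarder {
  private val downstream =
    Http.client.withStreaming(true).newService("storage-backend:8080")

  def forward(incoming: Request): Future[Response] = {
    // Hand the incoming request's Reader directly to the outgoing request,
    // so bytes flow downstream as the client sends them, never fully buffered.
    val out = Request(Version.Http11, Method.Post, "/attachments", incoming.reader)
    downstream(out)
  }
}
```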

Thanks!

mosesn commented 8 years ago

Oh, I see! That's a good idea. Yeah, as long as netty4 supports it, and you're down to implement it in both of them, I think we'd be OK with adding it.

rklancer commented 8 years ago

Thanks! I'm pretty sure Netty 4 supports that (one old comment from the netty mailing lists suggests HttpChunk was renamed to HttpContent in Netty 4 exactly because the name seemed to imply chunked encoding was necessarily being used).

Should we leave this ticket open? I'm not sure whether or when we would submit a PR. We're discussing a number of different options for the underlying problem and haven't established for sure that it will be a scaling problem.

mosesn commented 8 years ago

Works for me. I'll mark it as a community / starter ticket in the meantime in case someone wants to pick it up.

koiuo commented 6 years ago

I believe we can close this now.

yufangong commented 5 years ago

Thanks, @rklancer. I believe @edio's fix https://github.com/twitter/finagle/commit/a3094f37478bde2ae599d4f45f8ce3631cce14a5 resolved this issue, and later https://github.com/twitter/finagle/commit/842e5e1a2b5613307add41fd064ebb589cc22bef updated withStreaming. I'm closing this now; feel free to open a new issue if this doesn't work for you.
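For anyone finding this later, the resulting server configuration looks roughly like this (a sketch against the current API; the threshold value is illustrative):

```scala
import com.twitter.conversions.StorageUnitOps._
import com.twitter.finagle.Http

// Fixed-length request bodies larger than the threshold are now exposed
// through the streaming Reader instead of being aggregated in memory.
val server = Http.server.withStreaming(5.megabytes)
```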