yutakahirano / fetch-with-streams

Fetch API integrated with Streams.

chunked uploads are not possible with http 1.x protocols #57

Closed wanderview closed 7 years ago

wanderview commented 8 years ago

While trying to lay some groundwork for chunked stream uploads in Gecko, I ran across an issue. According to @mcmanus we cannot reasonably provide chunked uploads for HTTP 1.x servers. While HTTP 1.1 does support chunked, we can't determine that 1.1 is supported until after the request is sent. That means we have to assume HTTP 1.0 for the request upload, which does not permit chunked.

Based on this, we need some way for a fetch to indicate whether buffering the upload before sending the request is acceptable. Maybe by explicitly setting a chunked encoding header. Or maybe, if the body is set as a ReadableStream, it's automatically marked as chunked. We would then fast-fail chunked uploads on HTTP 1.x servers. A rough sketch of that second option from the page's side, assuming the stream-as-body API this repo proposes (the URL and chunk contents are placeholders; none of this is shipped behavior):
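```ts
// Hypothetical usage under this proposal: passing a ReadableStream as the
// request body would implicitly opt the fetch into chunked transfer coding.
const body = new ReadableStream({
  start(controller) {
    const encoder = new TextEncoder();
    controller.enqueue(encoder.encode("first chunk"));
    controller.enqueue(encoder.encode("second chunk"));
    controller.close();
  },
});

// Under the automatic-chunked idea, the UA would send
// Transfer-Encoding: chunked here, and could fast-fail the request
// if the negotiated protocol can't carry it.
const response = await fetch("https://example.com/upload", {
  method: "POST",
  body,
});
```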

Thoughts?

@domenic @yutakahirano @annevk

domenic commented 8 years ago

Wait a minute. I don't think the conclusion follows here. We can just always send Transfer-Encoding: chunked. If the server is HTTP 1.0, it can't deal, and oh well. But if it's HTTP 1.1 onward, it's fine.

domenic commented 8 years ago

Oh, I guess that is what your second paragraph is about. The "we cannot reasonably provide chunked uploads for HTTP 1.x servers" in the first paragraph threw me off.

Yes, automatic chunked sounds good.

wanderview commented 8 years ago

I guess I worry about HTTP 1.0 servers that don't send any error code until the entire body is uploaded. This could be a problem for JS code that expects to send an "infinite" upload stream.

Not sure if that's a real concern for servers, though. We need to investigate.

wanderview commented 8 years ago

Also, there is probably a high failure rate for chunked uploads in HTTP 1.1 servers. It's probably not exercised much at all right now. For example, Gecko currently never sends chunked uploads.

annevk commented 8 years ago

So assuming buggy HTTP/1.0 servers, when would this be a problem?

Note that we'll only do chunked uploads for streams, and I think we should do so automatically; everything else must continue to be Content-Length based. As a sketch, that UA-side decision might look like this (the function and its shape are invented for illustration, not spec text):
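```ts
// Illustrative UA logic: a ReadableStream body gets chunked framing
// automatically; every other body type keeps an explicit Content-Length.
function chooseFraming(
  body: ReadableStream<Uint8Array> | ArrayBuffer | string,
): Headers {
  const headers = new Headers();
  if (body instanceof ReadableStream) {
    // The total length is unknown up front, so chunked is the only option.
    headers.set("Transfer-Encoding", "chunked");
  } else if (typeof body === "string") {
    headers.set("Content-Length", String(new TextEncoder().encode(body).byteLength));
  } else {
    headers.set("Content-Length", String(body.byteLength));
  }
  return headers;
}
```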

wanderview commented 8 years ago

You might have to invent some kind of protocol for your service that first checks that the server is capable.

I believe this is called HTTP 2. If we negotiate H2 during the TLS handshake, then we can send the chunked upload stream.

Personally I think we should fast fail the request for chunked upload if HTTP 2 is not negotiated.

annevk commented 8 years ago

Why would H2 have chunked implemented though? And what is wrong with simply trying?

domenic commented 8 years ago

That would be a real shame. HTTP/1.0 is a tiny minority (trying to find stats...) and should not deny authors the basic capability of chunked uploads. Authors would then have to work around this restriction by proxying back to their HTTP/2 server and having that server do the chunked uploading.

Also, there is probably a high failure rate for chunked uploads in HTTP 1.1 servers. It's probably not exercised much at all right now. For example, Gecko currently never sends chunked uploads.

It's worth investigating other browsers, but even leaving browsers aside, there is plenty of server-to-server software that does chunked uploading. (All Node.js clients, by default, for example.) A minimal sketch of that default, with placeholder host and path:
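```ts
import * as http from "node:http";

// Node sends Transfer-Encoding: chunked by default when the request
// body is streamed and no Content-Length header is set.
const req = http.request({ method: "POST", host: "example.com", path: "/upload" });
req.write("first chunk");
req.write("second chunk");
req.end(); // terminates the body with the zero-length chunk
```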

Cross-origin with unknown origins: this sucks.

How much does it suck? Does it suck worse than trying to do a request over a one-bar intermittent 2G connection? Authors have to be prepared to deal with requests failing or timing out. It's not a big deal.

mcmanus commented 8 years ago

Why would H2 have chunked implemented though?

all h2 flows are composed of flexibly sized frames (aka chunks) and contain an explicit close bit (aka the zero chunk). content-length is strictly advisory for consumers of the stream and isn't part of the protocol's delimiting. (there is a separate flow in each direction for each transaction.)

wanderview commented 8 years ago

Why would H2 have chunked implemented though?

Because H2 is designed for all uploads to effectively be chunked. Every upload in H2 requires framing, etc. It's a core part of the protocol. A simplified model of that point (the types are illustrative, not the actual RFC 7540 wire layout):
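```ts
// Every H2 request body is a sequence of DATA frames; the END_STREAM
// flag on the last one plays the role of HTTP/1.1's zero-length chunk.
interface DataFrame {
  streamId: number;    // one flow per transaction, in each direction
  payload: Uint8Array; // flexibly sized, like a chunk
  endStream: boolean;  // the explicit close bit
}

function frameBody(streamId: number, chunks: Uint8Array[]): DataFrame[] {
  return chunks.map((payload, i) => ({
    streamId,
    payload,
    endStream: i === chunks.length - 1,
  }));
}
```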

That would be a real shame. HTTP/1.0 is a tiny minority (trying to find stats...) and should not deny the basic capability of chunked uploads to authors.

HTTP/1.1 is not a small minority, though. I think we need to investigate how well 1.1 servers handle chunked uploads. Given that it's not used today, there are probably a lot of bugs.

annevk commented 8 years ago

How much does it suck?

It's just that debugging would be hard and fixing it would involve talking to (potentially unknown) third parties. I don't think it's a blocker personally.

annevk commented 8 years ago

@wanderview I still don't understand the concern. If the server doesn't work with chunked, your application will simply not work until you update the server. What is the big deal?

wanderview commented 8 years ago

I still don't understand the concern. If the server doesn't work with chunked, your application will simply not work until you update the server. What is the big deal?

Shouldn't we have consistent failure modes? I mean, do you want some servers to fail the request, others to fail to strip the chunked framing bits, and others to succeed? How would people use this feature in a library?

domenic commented 8 years ago

HTTP/1.1 is not a small minority, though. I think we need to investigate how well 1.1 servers handle chunked uploads. Given that it's not used today, there are probably a lot of bugs.

I think it is used a lot more than you anticipate from the perspective of working on a browser that does not support it. But, I don't have data, just experience with frameworks and tools that support it.

wanderview commented 8 years ago

I think it is used a lot more than you anticipate from the perspective of working on a browser that does not support it. But, I don't have data, just experience with frameworks and tools that support it.

I don't disagree. I'm just asking for us to investigate that before baking in something that's not widely supported. If it is widely supported, great.

annevk commented 8 years ago

Shouldn't we have consistent failure modes?

I don't think so. The feature is opt-in; if your server is broken, tough luck. (We should of course handle whatever the server throws back at us consistently.)

domenic commented 8 years ago

I think the failure modes from the JS developer's perspective will always be one of two things: (1) your stream stays locked forever, or at least until timeout; (2) your stream gets canceled, either immediately or after some time. With a non-infinite timeout, this reduces to one thing.

The idea being that on failed uploads we cancel the stream. I guess that wasn't explicit anywhere yet.
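From the page's side, that would surface through the stream's own cancel hook; a sketch, assuming the stream-as-body API proposed here (the URL is a placeholder, and the cancel-on-failure behavior is the proposal above, not shipped behavior):

```ts
const body = new ReadableStream<Uint8Array>({
  pull(controller) {
    controller.enqueue(new Uint8Array(1024)); // keep producing "infinite" data
  },
  cancel(reason) {
    // Called if the UA aborts the upload (e.g. the server can't take
    // chunked framing), so the producer can stop doing work.
    console.log("upload canceled:", reason);
  },
});

try {
  await fetch("https://example.com/upload", { method: "POST", body });
} catch (err) {
  // The fetch rejects with a network error; by this point cancel()
  // above has already released the stream.
  console.log("fetch failed:", err);
}
```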

wanderview commented 8 years ago

And there is no concern an HTTP/1.0 server would just slurp the entire chunked post body without removing the framing? There would be no indication back to the JS code in that case.

domenic commented 8 years ago

We can make that case a network error for JS, and potentially terminate the upload the moment we see the HTTP/1.0 in the header. The server still gets crazy data (or can we cut it off early enough?), but JS at least gets a consistent experience.

domenic commented 8 years ago

I guess we could only cut it off early if the server supported 100-continue.
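Node's client shows what that handshake looks like in practice: with an Expect: 100-continue header it holds the body back until the server's interim response arrives (host and path are placeholders):

```ts
import * as http from "node:http";

const req = http.request({
  method: "POST",
  host: "example.com",
  path: "/upload",
  headers: { Expect: "100-continue" },
});

req.on("continue", () => {
  // The server answered 100 Continue: safe to start streaming chunks.
  req.write("first chunk");
  req.end();
});

req.on("response", (res) => {
  // A final status arriving without 100 Continue (e.g. 417) means the
  // body was never sent, cutting the upload off early.
  console.log(res.statusCode);
});
```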

wanderview commented 8 years ago

We can make that case a network error for JS, and potentially terminate the upload the moment we see the HTTP/1.0 in the header. The server still gets crazy data (or can we cut it off early enough?), but JS at least gets a consistent experience.

We can return an error to JS, but the server may have still mutated state.

mcmanus commented 8 years ago

And there is no concern an HTTP/1.0 server would just slurp the entire chunked post body without removing the framing?

a server that needs a content-length to parse the request will have to fail (or plausibly hang).

domenic commented 8 years ago

Good point. So by now we've narrowed it down to servers that (a) only support HTTP/1.0; (b) consume request bodies without first looking for Content-Length; (c) are not under your control; and (d) send appropriate CORS headers to allow you to talk to them.

I am hopeful that is a tiny portion of the web, especially the modern web that people will be trying to communicate with via fetch().

wanderview commented 8 years ago

I guess we can slap a warning box on the MDN page about using a ReadableStream for the Request.body.

annevk commented 8 years ago

Yeah, warning developers seems sufficient.

reschke commented 7 years ago

FWIW, you may want to have a look at https://greenbytes.de/tech/webdav/rfc7694.html

annevk commented 7 years ago

This can be closed.