Open kentonv opened 7 years ago
Hmm, it turns out that, to my surprise, browsers (Chrome at least) actually decompress files that have `Content-Encoding` when saving to disk (e.g. due to `Content-Disposition`). I always thought of `Content-Encoding` as being something that's not supposed to be handled until the data is to be consumed, unlike `Transfer-Encoding`, which is clearly meant to be handled at the HTTP layer. It seems instead that browsers will automatically decode `Content-Encoding` in all cases I can think of.
I guess that this argues that, for the case of ServiceWorkers and `Response`s, my concern is moot. ServiceWorkers do not specify any means by which a `Response` can be sent back out on the network. It only exists within the browser, and within cache which is managed by the browser. So the `Content-Encoding` header from this point is just saying what encoding was used by the network transport (if the response ever crossed a network).
I'm working with a use case, though, where I'm trying to implement a proxy server which runs ServiceWorker-like code, so I actually need to figure out how to push a `Response` out over the network.
I suppose even for ServiceWorkers, the same question comes up with the `Request` object. What happens if I want to send a `Request` with a `Content-Encoding: gzip` body? Do I compress the content before giving it to `fetch()`? Or does it do it for me?
In my tests in Chrome, as expected, `Request` does not automatically encode the body. If you specify `Content-Encoding: gzip`, you must pass already-compressed data as the `body` option to `new Request()` (or to `fetch()`).
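A minimal sketch of what that means in practice, assuming Node.js 18+ (global `Request`, built-in `zlib`) rather than a browser. The helper name `buildGzipRequest` and the URL are made up for illustration; the point is only that the caller compresses the bytes and the header merely labels them.

```javascript
const zlib = require('node:zlib');

// Hypothetical helper: compress the text ourselves, then label it with
// Content-Encoding. The Request constructor does not compress for us.
function buildGzipRequest(url, text) {
  const compressed = zlib.gzipSync(Buffer.from(text, 'utf8'));
  const request = new Request(url, {
    method: 'POST',
    headers: {
      'Content-Type': 'text/plain; charset=utf-8',
      'Content-Encoding': 'gzip',
    },
    // Already-compressed bytes are passed as the body.
    body: compressed,
  });
  return { request, compressed };
}

const built = buildGzipRequest('https://example.com/logs', 'hello '.repeat(100));
console.log(built.request.headers.get('content-encoding'), built.compressed.length);
```

In a browser the compression step would have to come from somewhere else (for example a `CompressionStream` or a JS gzip library), since `zlib` is a Node module.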
This seems to be what the spec calls for, but it's unfortunate that it means `Request` and `Response` are inconsistent with each other.
Thoughts?
So your main problem is with step 16 of https://fetch.spec.whatwg.org/#concept-http-network-fetch I suppose. That makes some of the things inconsistent here. We could provide dedicated support for gzip somehow, similarly to how it's tightly coupled in browsers, but that wouldn't necessarily address future schemes, if any.
One thing we could do is provide a request flag to disable step 16, but last I heard that would be fairly involved. And it's also fairly low-level and still makes you responsible for any compressing work.
I think for now I will specify that Response bodies are expected to be decompressed, and if you specify a Content-Encoding on an outgoing Response, then the system is expected to apply the encoding for you when writing out to network.
Meanwhile, I can do clever optimizations where I can detect if a Response body passes through verbatim, and avoid the compression round trip in that case.
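The policy described above could be sketched roughly like this for the write path of such a proxy. This is an illustration under assumptions, not the actual implementation: `encodeForWire` and the `wire` record (the still-compressed upstream bytes, kept only while the body is untouched) are hypothetical names, and Node's `zlib` stands in for whatever codec the proxy uses.

```javascript
const zlib = require('node:zlib');

// Hypothetical write-path helper.
//   contentEncoding: value of the outgoing Content-Encoding header (or undefined)
//   plainBody:       decoded bytes held by the Response
//   wire:            optional { encoding, bytes } captured from upstream, present
//                    only if the script passed the body through unmodified
function encodeForWire(contentEncoding, plainBody, wire) {
  const encoding = (contentEncoding || 'identity').trim().toLowerCase();

  // Optimization: the body passed through verbatim, so reuse the original
  // compressed bytes instead of doing a decompress/compress round trip.
  if (wire && wire.encoding === encoding) return wire.bytes;

  switch (encoding) {
    case 'identity':
      return plainBody;
    case 'gzip':
      return zlib.gzipSync(plainBody);
    case 'br':
      return zlib.brotliCompressSync(plainBody);
    default:
      throw new Error(`unsupported Content-Encoding: ${encoding}`);
  }
}

const body = Buffer.from('hello world');
const upstream = { encoding: 'gzip', bytes: zlib.gzipSync(body) };
console.log(encodeForWire('gzip', body, upstream) === upstream.bytes); // pass-through reuses the buffer
```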
There are two annoyances:
`new Request()` could take an additional init option called `encodingAlreadyApplied` or something.
What is an "outgoing Response"? Do you mean Request?
@wanderview No, I mean Response. I have a use case involving an HTTP proxy that applies ServiceWorker-like scripts, so Responses need to be serialized back out to the network.
Ok, but that is a bit different from fetch API as specified for the browser. This seems like something you could fix in your proxy implementation.
Yes, certainly in my use case I could deviate from the standard if needed. Just hoping to avoid it if at all possible. Less documentation to write that way. :)
And the Request / Response inconsistency strikes me as odd even in the browser case, though I suppose there's not much that can be done there at this point.
Request.body could probably be changed to honor content-encoding in a best-effort way if it made sense.
Hi, all.
I have a question on the request encoding side. At the moment, there is no way to ask the browser/runtime to encode the request content when calling the `fetch` API. Is that right?
When handling logging, browser/runtime native gzip encoding is very useful.
@lijunle hey, that is correct. I think we best track that as a distinct request. Would you mind filing a new issue? Please also mention whether you just want gzip or also other formats, such as Brotli.
@annevk Sure! Here it goes: https://github.com/whatwg/fetch/issues/653
I bet all that Content-Encoding parsing should be left to implementors and kept external to the fetch APIs: no auto-encoding/decoding stuff. Because now you have "broti", tomorrow it will be "shmroti", or maybe some "aes128gsm", and also chains like "gzip, smzip, bunzip, gzip_again"...
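For what it's worth, handling such chains outside of fetch is not much code. The sketch below assumes Node.js and its built-in `zlib`; `decodeContent` and the `decoders` table are illustrative names, not part of any fetch API. `Content-Encoding` lists codings in the order they were applied, so decoding walks the list in reverse, and an unknown coding is an error rather than something to guess at.

```javascript
const zlib = require('node:zlib');

// Codings this sketch knows how to undo. Anything else is rejected.
const decoders = new Map([
  ['identity', (bytes) => bytes],
  ['gzip', (bytes) => zlib.gunzipSync(bytes)],
  ['x-gzip', (bytes) => zlib.gunzipSync(bytes)],
  ['deflate', (bytes) => zlib.inflateSync(bytes)],
  ['br', (bytes) => zlib.brotliDecompressSync(bytes)],
]);

// Hypothetical helper: undo every coding named in a Content-Encoding value.
function decodeContent(headerValue, bytes) {
  const codings = headerValue
    .split(',')
    .map((coding) => coding.trim().toLowerCase())
    .filter(Boolean);

  let out = bytes;
  // The last coding applied is the first one to remove.
  for (const coding of codings.reverse()) {
    const decode = decoders.get(coding);
    if (!decode) throw new Error(`unknown content coding: ${coding}`);
    out = decode(out);
  }
  return out;
}

const twice = zlib.brotliCompressSync(zlib.gzipSync(Buffer.from('hello')));
console.log(decodeContent('gzip, br', twice).toString('utf8')); // prints "hello"
```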
Section 4.6 step 16.1.2 specifies that `Content-Encoding` is handled (responses are decoded) before `fetch()` completes, so the returned `Response` already has a decompressed body (but `response.headers.get('content-encoding')` still returns the original encoding).

This seems to create an inconsistency. What happens if I want to construct a `Response` manually that contains gzipped content? Say, for example, that I wish to construct a `Response` for the purpose of returning from a `FetchEvent` handler in a ServiceWorker. If I say:

What happens? I can imagine a few possibilities:

- `data` should be uncompressed. The implementation will automatically compress it according to the `Content-Encoding` header, and the file downloaded will thus be compressed. (This is consistent, but weird, and it seemingly forces me to do a redundant decompress-compress round trip.)
- `data` should be compressed, to be consistent with the header. The implementation will not modify the bytes when downloading. (But this is inconsistent with `Response`s that came from calling `fetch()`!)
- `Content-Encoding` header here is incorrect. Consider `Content-Type: application/gzip` instead. (This seems generally unfortunate.)

(I'm posting this question against the fetch spec since it is where the `Response` class is specified, but the problem only seems to come up in the context of ServiceWorkers.)
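To make the possibilities concrete, here is a hypothetical stand-in for the kind of `Response` in question (not necessarily the snippet from the original post). It assumes Node.js 18+ for the global `Response` and `zlib`; the helper name, the filename, and the exact headers are all illustrative. Both constructions are accepted, because the constructor does not look at the bytes: the header is just a label, and which reading is "right" is exactly the open question.

```javascript
const zlib = require('node:zlib');

// Hypothetical helper: build the same download response under the first two
// interpretations discussed above.
function buildDownloadResponses(text) {
  const plain = Buffer.from(text, 'utf8');
  const gzipped = zlib.gzipSync(plain);
  const headers = {
    'Content-Encoding': 'gzip',
    'Content-Disposition': 'attachment; filename="data.txt"',
  };
  return {
    // Reading 1: body is uncompressed, header describes the eventual wire format.
    plainBody: new Response(plain, { headers }),
    // Reading 2: body is already compressed, matching the header.
    compressedBody: new Response(gzipped, { headers }),
    plain,
    gzipped,
  };
}

const built = buildDownloadResponses('hello '.repeat(100));
console.log(built.plainBody.headers.get('content-encoding'));      // prints "gzip"
console.log(built.compressedBody.headers.get('content-encoding')); // prints "gzip"
console.log(built.plain.length, built.gzipped.length);
```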