whatwg / fetch

Fetch Standard
https://fetch.spec.whatwg.org/

Option to fetch raw byte as is. (without decompressing) #1524

Open jimmywarting opened 2 years ago

jimmywarting commented 2 years ago

Proxy servers (a CORS proxy, for instance) need to simply pass data forward from A -> B without decompressing it, because decompressing would invalidate the content-length and content-encoding headers.

So we need an option to disable transformation.
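
To illustrate why a forwarded content-length stops matching once the body has been decoded, here is a minimal sketch that gzips a payload locally and compares the two byte counts. It assumes a runtime that exposes `CompressionStream`, `DecompressionStream`, `Blob`, and `Response` as globals (recent browsers, Node.js 18+). The helper names `transform` and `compareLengths` are made up for this example.

```javascript
// Run a byte array through a (de)compression stream and collect the result.
async function transform(bytes, stream) {
  const piped = new Blob([bytes]).stream().pipeThrough(stream);
  return new Uint8Array(await new Response(piped).arrayBuffer());
}

// Returns the body size as sent on the wire (gzip) and as seen after decoding.
async function compareLengths(text) {
  const original = new TextEncoder().encode(text);
  const encoded = await transform(original, new CompressionStream('gzip'));
  const decoded = await transform(encoded, new DecompressionStream('gzip'));
  return { wireLength: encoded.byteLength, decodedLength: decoded.byteLength };
}

compareLengths('hello '.repeat(1000)).then(({ wireLength, decodedLength }) => {
  // The upstream content-length describes wireLength, so a proxy that only
  // has the decoded body cannot reuse that header (or content-encoding).
  console.log({ wireLength, decodedLength });
});
```

A proxy that receives only the decoded body therefore has to either re-compress it or rewrite both headers before forwarding.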

annevk commented 2 years ago

Is this a request for server implementations of fetch()? Seems better suited for the Winter CG.

jimmywarting commented 2 years ago

Yeah, pretty much... I thought I would bring it up here first, though.

annevk commented 2 years ago

I think we'd need some pretty compelling use cases to consider this as it would be somewhat non-trivial to do as I understand it.

donaddon commented 1 year ago

Here is a compelling case for a Chrome extension:

  1. We have a Chrome extension that emulates other browsers using a remote host to do the rendering.
  2. Because the rendering host might not have access to the local client's VPN, we offer an option in which the extension in the local client implements a tunnel, so the HTTP requests from the rendering host can be sent back through the client and can thus retrieve resources from within the VPN.
  3. We then need to be able to send the responses back to the rendering host as received, that is, without decompressing them.
  4. We have done this successfully with XMLHttpRequest and this works for many scenarios, but now service workers are required, thus Fetch is required.
  5. The alternatives are not good. One is re-compressing those responses using a gzip compression library. Another is mucking with the headers to send a decompressed response, but this caused other issues that we did not pursue, because it is clearly not desirable. A third is to send the request over to a tab page (which we would have to open if there isn't one), have the tab page make the request using good old XHR, then send the response back to the service worker, and then back to the rendering host. Again, yuck.

jimmywarting commented 1 year ago

Another use case could be a partial download of something that's encoded and also supports range requests.

Say that I want to download something really large. The response provides content-encoding and content-length along with an accept-ranges header.

I initiate a call:

const response = await fetch(url, {
  method: 'GET',
  raw: true, // proposed option, not something fetch() supports today
  headers: {
    'accept-encoding': 'gzip, deflate'
  }
})

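From now on I will know the content-encoding, the content-length, and whether range requests are accepted, but I will not know what the actual data is unless I pipe it to a new DecompressionStream('gzip') or new DecompressionStream('deflate').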

const progress = document.querySelector('progress')
const chunks = [] // ultra simple store

// with the raw option, content-length matches the bytes we are about to count
progress.max = Number(response.headers.get('content-length'))

for await (const rawChunk of response.body) {
  // show how much has been downloaded (not how much has been decompressed)
  progress.value += rawChunk.byteLength

  // store the chunks somewhere
  chunks.push(rawChunk)
}

With this in place I can provide a good solution for failed downloads by calculating exactly how much I have downloaded. That way I can make a range request and continue from where I left off or where the connection failed. This would also be a good solution for pausing / resuming a download.
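
The resume step could be sketched roughly as follows. This is only an illustration: it presumes the server honors range requests against the encoded bytes, and the helper names `receivedBytes` and `resumeHeaders` as well as the ETag value are hypothetical.

```javascript
// Total number of raw (still-encoded) bytes received so far.
function receivedBytes(chunks) {
  return chunks.reduce((sum, chunk) => sum + chunk.byteLength, 0);
}

// Build headers for a request that resumes after the bytes already stored.
// `etag` is optional: If-Range asks the server to send the whole resource
// again if it changed since the first attempt.
function resumeHeaders(chunks, etag) {
  const headers = { range: `bytes=${receivedBytes(chunks)}-` };
  if (etag) headers['if-range'] = etag;
  return headers;
}

// Example: two stored chunks of 1024 and 512 bytes.
const stored = [new Uint8Array(1024), new Uint8Array(512)];
console.log(resumeHeaders(stored, '"abc123"'));
```

With those two chunks the range header comes out as `bytes=1536-`. The count is only meaningful if the chunks are the raw bytes, which is the point of the request above.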

Now that I have all the chunks, I can go ahead and decompress them using a DecompressionStream.
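
A minimal sketch of that last step, assuming the stored chunks together form one complete gzip or deflate body and that `DecompressionStream`, `Blob`, and `Response` are available as globals. The function name `decodeChunks` is hypothetical.

```javascript
// Decode an array of raw chunks that together form one compressed body.
// `format` should match the response's content-encoding ('gzip' or 'deflate').
async function decodeChunks(chunks, format = 'gzip') {
  const decoded = new Blob(chunks)
    .stream()
    .pipeThrough(new DecompressionStream(format));
  return new Response(decoded).arrayBuffer();
}

// Usage, once every chunk has been downloaded:
//   const buffer = await decodeChunks(chunks, 'gzip')
//   const text = new TextDecoder().decode(buffer)
```

Wrapping the result in a `Response` is just a convenient way to drain the stream; the same wrapper also offers `text()` and `json()`.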

Unfortunately, we lose some very useful stuff with this raw option. We can't use brotli decoding (due to lack of support in DecompressionStream), and text(), json(), arrayBuffer() and response.body are not so useful anymore because they require more work afterwards.

Another option would be to hook in and inspect the data somehow before it's decompressed. So an alternative solution could be to do something like:

const response = await fetch(url, {
  onRawData (chunk) {
    // ...
  }
})

// alternative considerations

const response = await fetch(url)
const clone = response.clone()

response.json().then(done, fail)
// `rawBody` is a proposed property; consuming it makes `clone.body` locked and unusable.
clone.rawBody.pipeThrough(monitor).pipeTo(storage)
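
The `monitor` in the snippet above is not defined anywhere in this thread. One way it might look is a pass-through `TransformStream` that counts bytes, sketched here with the hypothetical name `createMonitor`. Since `rawBody` does not exist, the demo feeds it from a `Blob` instead.

```javascript
// Create a pass-through TransformStream that reports the running byte count.
function createMonitor(onProgress) {
  let loaded = 0;
  return new TransformStream({
    transform(chunk, controller) {
      loaded += chunk.byteLength;
      onProgress(loaded);
      controller.enqueue(chunk); // forward the chunk unchanged
    },
  });
}

// Demo with a local 2048-byte stream standing in for a response body.
const demo = new Blob([new Uint8Array(2048)])
  .stream()
  .pipeThrough(createMonitor((n) => console.log(`${n} bytes so far`)));
new Response(demo).arrayBuffer(); // drain the stream so the monitor runs
```

Piping `response.body` through such a monitor works today, but it only ever sees decoded bytes, so the count cannot be compared against content-length.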

So I found two additional use cases besides a server proxy: 1) progress monitoring, and 2) pausable / resumable downloads.

Enet4 commented 10 months ago

One other use case: If I wish for an application to cache the compressed response data with a custom storage layer, it would have been convenient if the application could take the data as encoded by the server and push it into the cache directly. At the moment, one can only grab the data in its decompressed form, which would either waste space in the cache or require the application to re-compress it.
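
As a rough sketch of that idea, the class below keeps bodies in their encoded form and decodes only on read. Everything here is hypothetical: `EncodedCache` is an in-memory stand-in for a custom storage layer, and because fetch() cannot currently hand over the encoded bytes, the caller would have to obtain them some other way.

```javascript
// In-memory stand-in for a custom storage layer that keeps bodies encoded.
class EncodedCache {
  #entries = new Map();

  // Store the bytes exactly as the server encoded them, plus the coding name.
  put(key, encodedBytes, encoding) {
    this.#entries.set(key, { encodedBytes, encoding });
  }

  // Number of bytes this entry actually occupies in the cache.
  storedSize(key) {
    return this.#entries.get(key)?.encodedBytes.byteLength ?? 0;
  }

  // Decode lazily, only when a consumer asks for the content.
  async readText(key) {
    const entry = this.#entries.get(key);
    if (!entry) return undefined;
    const decoded = new Blob([entry.encodedBytes])
      .stream()
      .pipeThrough(new DecompressionStream(entry.encoding));
    return new Response(decoded).text();
  }
}
```

The encoding name is stored next to the bytes because the cache needs it to pick the matching `DecompressionStream` format on read.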

ricea commented 9 months ago

> We have done this successfully with XMLHttpRequest and this works for many scenarios, but now service workers are required, thus Fetch is required.

Out of curiosity, how did you do it with XMLHttpRequest? I didn't think that was possible.