ipfs / specs

Technical specifications for the IPFS protocol stack
https://specs.ipfs.tech

IPFS Gateway not exposing content-length header when receiving Accept-Encoding #281

Open vasco-santos opened 2 years ago

vasco-santos commented 2 years ago

At the moment, the IPFS Gateway does not send a content-length header in its response when the request includes an Accept-Encoding header. This is a problem for caches that can only store content when content-length is known, because browser fetch sends Accept-Encoding by default.

Taking https://ipfs.io/ipfs/bafybeib4zuumguq4cgkt7caddeukb4ysijnehvntekrtu5afmet5ujvlka/package.json as an example, we get content-length with curl, but not in the browser.


@lidel found that the Accept-Encoding request header was the root cause.

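# Without Accept-Encoding: the response includes content-length
curl -sD - https://ipfs.io/ipfs/bafybeib4zuumguq4cgkt7caddeukb4ysijnehvntekrtu5afmet5ujvlka/package.json | grep -i content-length
# With Accept-Encoding: gzip, content-length is missing, so grep prints nothing
curl -sD - -H 'Accept-Encoding: gzip' https://ipfs.io/ipfs/bafybeib4zuumguq4cgkt7caddeukb4ysijnehvntekrtu5afmet5ujvlka/package.json | grep -i content-length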

References:

cc @lidel @thattommyhall @gmasgras

lidel commented 2 years ago

@vasco-santos Mind elaborating on "caching that can only be performed when we know content-length"?

I think that statement invites leaky abstractions. Just to be sure we are using the same terminology, a quick reminder:

My naive understanding is that:

vasco-santos commented 2 years ago

caching that can only be performed when we know content-length

Yeah, the object storage cache we use (Cloudflare R2) requires a content length to put data. These related messages from the CF Discord explain the reason; it is also a limitation on their end:

This means that requesting content from the gateway and proxying the response to R2 does not work for the mentioned content types, unless we read all the content and create a new Response to put to R2. This is not feasible in some environments, such as CF Workers, where we are limited to 128MB.
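To make the workaround above concrete, here is a minimal sketch in TypeScript. It assumes the standard `Headers` and `ReadableStream` globals (available in browsers, Workers, and Node 18+). The names `declaredLength` and `bufferWithLimit` are hypothetical helpers, not part of any R2 or gateway API, and the cap passed in is whatever memory budget the environment allows.

```typescript
// Hypothetical helpers for illustration; not an R2 or gateway API.

/** Returns the declared body length, or null when content-length is absent or malformed. */
function declaredLength(headers: Headers): number | null {
  const raw = headers.get("content-length");
  if (raw === null) return null;
  const trimmed = raw.trim();
  return /^\d+$/.test(trimmed) ? Number(trimmed) : null;
}

/** Reads a whole body into memory, failing as soon as it grows past maxBytes. */
async function bufferWithLimit(
  body: ReadableStream<Uint8Array>,
  maxBytes: number,
): Promise<Uint8Array> {
  const reader = body.getReader();
  const chunks: Uint8Array[] = [];
  let total = 0;
  for (;;) {
    const result = await reader.read();
    if (result.done) break;
    const chunk = result.value;
    total += chunk.byteLength;
    if (total > maxBytes) {
      // Stop pulling data we cannot hold in memory.
      await reader.cancel();
      throw new RangeError(`body exceeds ${maxBytes} bytes`);
    }
    chunks.push(chunk);
  }
  // Concatenate the chunks; the final length is now known exactly.
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.byteLength;
  }
  return out;
}
```

When `declaredLength` returns a number, the response could be streamed straight through. When it returns null, `bufferWithLimit` is the fallback, and the size cap is exactly why this does not scale to large files.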

Per https://fetch.spec.whatwg.org/#forbidden-header-name , I think we won't be able to overwrite content-length in the client.

thattommyhall commented 2 years ago

Sounds like you need to request it uncompressed then?

vasco-santos commented 2 years ago

@thattommyhall Accept-Encoding is a forbidden header name (https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Accept-Encoding). I tried setting it to double-check, but sadly the browser ignores it.
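As a rough illustration of why the override is dropped, the sketch below checks header names against a partial list of forbidden request-header names from the Fetch spec. The list is deliberately incomplete, and `isForbiddenRequestHeader` and `partitionHeaders` are hypothetical helpers. Browsers enforce this internally, and the code only mirrors that behavior for inspection.

```typescript
// Partial, illustrative list of forbidden request-header names (lowercased).
// See the Fetch spec for the authoritative list.
const FORBIDDEN_REQUEST_HEADERS = new Set<string>([
  "accept-charset",
  "accept-encoding",
  "access-control-request-headers",
  "access-control-request-method",
  "connection",
  "content-length",
  "cookie",
  "date",
  "dnt",
  "expect",
  "host",
  "keep-alive",
  "origin",
  "referer",
  "te",
  "trailer",
  "transfer-encoding",
  "upgrade",
  "via",
]);

/** True when a script-supplied request header with this name would be ignored. */
function isForbiddenRequestHeader(name: string): boolean {
  const lower = name.toLowerCase();
  return (
    FORBIDDEN_REQUEST_HEADERS.has(lower) ||
    lower.startsWith("proxy-") ||
    lower.startsWith("sec-")
  );
}

/** Splits requested headers into those that would be kept and those that would be dropped. */
function partitionHeaders(wanted: Record<string, string>): {
  allowed: Record<string, string>;
  dropped: string[];
} {
  const allowed: Record<string, string> = {};
  const dropped: string[] = [];
  for (const [name, value] of Object.entries(wanted)) {
    if (isForbiddenRequestHeader(name)) {
      dropped.push(name);
    } else {
      allowed[name] = value;
    }
  }
  return { allowed, dropped };
}
```

For example, `partitionHeaders({ "Accept-Encoding": "identity", "Accept": "application/json" })` reports `Accept-Encoding` as dropped and keeps only `Accept`, matching what was observed in the browser.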

thattommyhall commented 2 years ago

Can you not do it in the CF worker though? Isn't that your runtime?

vasco-santos commented 2 years ago

Can you not do it in the CF worker though? Isn't that your runtime?

Just tried it, and it is also impossible to mutate that header there: it always becomes "accept-encoding":"gzip". This should follow the Fetch spec.