Open vasco-santos opened 2 years ago
@vasco-santos Mind elaborating on "caching that can only be performed when we know content-length"?
I think that statement is inviting leaky abstractions. Just to be sure we operate on the same terminology, a quick reminder:

- ipfs files stat /ipfs/{cid}:
  - Size – the size of raw data, e.g. byte size of a raw block, or sum of Data fields in UnixFS File's DAG concatenated together
  - CumulativeSize – the total size of the entire DAG (raw data + metadata envelopes)
    - Size if the File was small enough to be inlined as a raw leaf (CIDv1 with raw codec)
- Content-Length HTTP header informs HTTP clients about the length of the payload
  - Size of raw data when a UnixFS file is sent without chunking, but that is not a solid foundation one should build on (content-length will change based on compression, transfer encoding etc)

My naive understanding is that:

- ipfs.files.stat call, to learn the Size without fetching the entire thing.
- X-Ipfs-DataSize and X-Ipfs-DagSize headers, so you don't need to do any additional ipfs.files.stat call to read the size

> caching that can only be performed when we know content-length
Yeah, the object storage cache we use (Cloudflare R2) requires a known content length to put data. There were related messages on the CF Discord that explain the reason. It is also a limitation on their end:
This means that making a request to the gateway and proxying the response to R2 does not work for the mentioned content types, unless we read all the content and create a new Response to put to R2. That is not feasible in some environments, like CF Workers, where we are limited to 128MB.
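To make the "read all the content" workaround concrete, here is a minimal sketch of buffering a response body with a hard size cap, so the total length is known before writing to the cache. The names `MAX_BUFFER_BYTES`, `joinChunks` and `bufferBody` are hypothetical, the 64MB cap is an arbitrary assumption picked to stay under the 128MB limit mentioned above, and the body is assumed to be consumable as an async iterable of byte chunks.

```typescript
// Hypothetical cap (assumption): stay well below the 128MB Worker limit.
const MAX_BUFFER_BYTES = 64 * 1024 * 1024;

// Joins chunks into a single buffer, or returns null if the combined
// size would exceed maxBytes.
function joinChunks(chunks: Uint8Array[], maxBytes: number): Uint8Array | null {
  let total = 0;
  for (const chunk of chunks) {
    total += chunk.byteLength;
    if (total > maxBytes) return null;
  }
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.byteLength;
  }
  return out;
}

// Reads a body fully into memory, giving up early (null) once it grows
// past maxBytes. Returning from inside for-await closes the iterator.
async function bufferBody(
  body: AsyncIterable<Uint8Array>,
  maxBytes: number = MAX_BUFFER_BYTES
): Promise<Uint8Array | null> {
  const chunks: Uint8Array[] = [];
  let total = 0;
  for await (const chunk of body) {
    total += chunk.byteLength;
    if (total > maxBytes) return null; // too large to buffer safely
    chunks.push(chunk);
  }
  return joinChunks(chunks, maxBytes);
}
```

When `bufferBody` returns a buffer, its `byteLength` is the exact payload size, so the put can be done with a known length. When it returns null, the content is too large for this approach, which is the case the comment above describes as not feasible.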
Per https://fetch.spec.whatwg.org/#forbidden-header-name , I think we won't be able to overwrite content-length in the client.
Sounds like you need to request it uncompressed then?
@thattommyhall Accept-Encoding is a forbidden header name https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Accept-Encoding , I tried to set it to double check, but the browser sadly does not pick it up.
Can you not do it in the CF worker though? Isn't that your runtime?
> Can you not do it in the CF worker though? Isn't that your runtime?
Just tried it, and it is also impossible to mutate that header: it will always become "accept-encoding":"gzip". This should follow the Fetch spec.
At the moment the IPFS Gateway does not send a content-length header in the response when it receives an Accept-Encoding header. This is problematic for caching that can only be performed when we know content-length, and browser fetch sends such headers by default.
Taking https://ipfs.io/ipfs/bafybeib4zuumguq4cgkt7caddeukb4ysijnehvntekrtu5afmet5ujvlka/package.json as an example, we get content-length with curl, but not from the browser.
@lidel found out that the Accept-Encoding request header was the root problem.
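A minimal sketch of how a caching layer might branch on whether the gateway response carried a usable content-length. `parseContentLength`, `planCacheWrite` and the `CachePlan` type are hypothetical names for illustration, not part of any gateway or Cloudflare API.

```typescript
// Parses a Content-Length header value. Returns null when the header is
// absent or is not a plain non-negative integer.
function parseContentLength(value: string | null): number | null {
  if (value === null) return null;
  const trimmed = value.trim();
  if (!/^\d+$/.test(trimmed)) return null;
  const n = Number(trimmed);
  return Number.isSafeInteger(n) ? n : null;
}

// Hypothetical decision type: either the length is known and the body can
// be streamed to the cache, or it must be buffered first (or skipped).
type CachePlan =
  | { kind: "stream"; length: number }
  | { kind: "buffer-or-skip" };

function planCacheWrite(contentLengthHeader: string | null): CachePlan {
  const length = parseContentLength(contentLengthHeader);
  return length === null
    ? { kind: "buffer-or-skip" }
    : { kind: "stream", length };
}
```

In a fetch-based runtime the input would typically come from `response.headers.get("content-length")`, which returns null when the header is missing, as happens in the browser case described above.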
References:
cc @lidel @thattommyhall @gmasgras