Open lidel opened 2 years ago
The only thing I would add is that a strategy of expiring at "top of the hour" or "end of the day" for the DirIndex pages makes it a bit nicer to deal with the moment the etag does eventually change (you won't see different things on different servers for as long). That nudges me to prefer Expires over Cache-Control (though it's not much extra work to calculate the equivalent max-age anyway).
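A minimal sketch of the "top of the hour" idea, in Python for illustration only (the function name `top_of_hour_expires` is hypothetical and not part of go-ipfs). It rounds the current time up to the next full hour and formats it as an HTTP-date suitable for an Expires header, so every server computes the same expiry instant:

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def top_of_hour_expires(now: datetime) -> str:
    """Return an HTTP-date for the start of the next full hour after `now`.

    `now` is expected to be timezone-aware; it is converted to UTC first.
    """
    now_utc = now.astimezone(timezone.utc)
    # Drop minutes/seconds, then step forward one hour.
    next_hour = now_utc.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
    # usegmt=True emits the "GMT" suffix that HTTP dates require.
    return format_datetime(next_hour, usegmt=True)


if __name__ == "__main__":
    print("Expires:", top_of_hour_expires(datetime.now(timezone.utc)))
```

Because the expiry is aligned to the clock rather than to the time of the request, caches that filled at different moments all drop the entry at the same time.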
An idea in addition to the ones proposed: A public resolution endpoint for gateways could be useful as an inexpensive, cacheable call. There is no public endpoint (AFAIK) to verify resolutions for IPNS public keys or IPFS subpaths without fetching the whole file from a gateway, so an endpoint or query param exposing resolution could be helpful.
@mathew-cf that's useful, but you'd have to rewrite to fetch /ipfs/<cid>/<path> or you couldn't be certain that the content you fetched matched. Like if I checked what the dnslink was for blog.ipfs.io then fetched /ipns/blog.ipfs.io, it might have changed in the meantime.
By cacheable, I meant for a short TTL (~1 min). This resolution response can still be supplemented by DNS resolution for faster cache verification/invalidation. If you have the x-ipfs-roots header, you can check that the resolution and response match and choose whether to cache the response.
I'm thinking in the context of using Cloudflare workers here if that helps clarify the context.
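A rough sketch of that check, in Python for illustration (a Worker would do the same in JavaScript). It assumes the X-Ipfs-Roots value is a comma-separated list of CIDs whose first entry is the root the content path resolved to; the function name `should_cache` and the CID strings in the usage lines are placeholders, not real values:

```python
def should_cache(resolved_root: str, x_ipfs_roots: str) -> bool:
    """Return True when the first CID in X-Ipfs-Roots equals the CID we resolved.

    Assumption: the header is a comma-separated list of CIDs, one per path
    segment, with the root of the content path first.
    """
    roots = [r.strip() for r in x_ipfs_roots.split(",") if r.strip()]
    return bool(roots) and roots[0] == resolved_root


if __name__ == "__main__":
    # Placeholder CIDs, for illustration only.
    print(should_cache("bafyroot", "bafyroot, bafychild"))   # matches: safe to cache
    print(should_cache("bafyroot", "bafyother, bafychild"))  # pointer moved: skip caching
```

If the two disagree, the pointer changed between the resolution and the fetch, and the response should not be cached under the old resolution.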
@mathew-cf Thoughts on leveraging existing HTTP HEAD responses for these quick checks?
They are inexpensive, return only headers with no payload, and do not require any new endpoint.
A HEAD request resolves the content path and, for unixfs DAGs, fetches the bare minimum (the root block of a file/directory) to get the size for the Content-Length header.
HEAD response could include additional metadata headers, for example:
@lidel That works for me!
I tested something like your dir-index-html strategy with lua in nginx, but I used X-Accel-Expires to keep it within our gateway.
Nginx will honor it; I think Varnish might too, but I can't find a clear ref (though I am near-certain there is an equivalent).
So the concrete ask here is that the name of the header be configurable, but someone who uses another cache might be able to chime in.
If we fix https://github.com/ipfs/go-ipfs/issues/1818#issuecomment-1015849462 (set a proper cache-control header on /ipns/ and dir-index-html responses), that would be a universal hint for all HTTP caching tools/solutions (in cases where a gateway operator wants to cache things for longer, the minimal max-age could be raised via config, decreasing the need for custom headers like X-Accel-Expires).
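To make the "minimal max-age raised via config" idea concrete, here is a small sketch in Python. It is not how go-ipfs implements it; `ipns_cache_control` and its `min_max_age` parameter are hypothetical names standing in for a record TTL and an operator-configured floor:

```python
def ipns_cache_control(ttl_seconds: int, min_max_age: int = 0) -> str:
    """Build a Cache-Control value from an IPNS/DNSLink TTL.

    `min_max_age` stands in for a hypothetical operator setting that raises
    the floor for operators who want to cache for longer than the TTL.
    """
    max_age = max(ttl_seconds, min_max_age, 0)
    return f"public, max-age={max_age}"


if __name__ == "__main__":
    print(ipns_cache_control(60))        # TTL used as-is
    print(ipns_cache_control(60, 300))   # operator floor wins
```

Since the result is a plain Cache-Control value, nginx, Varnish, and CDNs can all act on it without a cache-specific header.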
@lidel did fix/dir-index-html-max-age land somewhere?
@thattommyhall kinda: https://github.com/ipfs/go-ipfs/pull/8758 fixed a bug and now adds cache-control when a directory has index.html and returns it instead of a dir listing response.
Generated dir listings don't have a cache-control header, but they have a deterministic Etag and will return HTTP 304 Not Modified if the client sends a matching Etag in the If-None-Match header.
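For anyone following along, the revalidation flow described above can be sketched like this in Python. It is a simplified illustration rather than the gateway's actual code: `conditional_status` is a hypothetical helper, the Etag values in the usage lines are made up, and the naive comma split does not handle Etags that themselves contain commas:

```python
from typing import Optional


def _strip_weak(tag: str) -> str:
    """If-None-Match uses weak comparison, so ignore a leading W/ prefix."""
    return tag[2:] if tag.startswith("W/") else tag


def conditional_status(current_etag: str, if_none_match: Optional[str]) -> int:
    """Return 304 when the client's If-None-Match matches the current Etag, else 200."""
    if not if_none_match:
        return 200
    candidates = [c.strip() for c in if_none_match.split(",") if c.strip()]
    if "*" in candidates:
        return 304
    current = _strip_weak(current_etag)
    return 304 if any(_strip_weak(c) == current for c in candidates) else 200


if __name__ == "__main__":
    # Made-up Etag values, for illustration only.
    print(conditional_status('"dir-abc"', '"dir-abc"'))
    print(conditional_status('"dir-abc"', '"dir-old"'))
```

A 304 still costs a round trip to the backend, which is why a short freshness lifetime on top of the Etag is attractive.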
I'd like to advocate for something like top of the hour Expires or something in the DirIndex case too. It's nice not to have to re-ask the backend at least for a short while
@thattommyhall I am leaning towards setting a Cache-Control that asks for caching forever (immutable), because a dir listing is costly to generate, and we don't change them that often.
Given how people deploy gateway infra, we would still want to revalidate on CDN/caching proxies, so how about:
Cache-Control: public, max-age=31536000, s-maxage=604800, stale-while-revalidate=86400, stale-if-error=86400, immutable
(https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control)
My understanding is that the caching proxy will always return a cached version of the dir listing, but will try to revalidate if the cached copy is older than a week.
Thoughts?
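To help reason about that header, here is a small Python sketch of how a shared cache might interpret it, based on my reading of the s-maxage and stale-while-revalidate semantics. It is a simplification (stale-if-error and immutable are parsed but not acted on), and the function names are hypothetical:

```python
def parse_cache_control(value: str) -> dict:
    """Parse a Cache-Control value into {directive: int seconds or True}."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.lower()] = int(arg) if arg.isdigit() else True
    return directives


def shared_cache_action(age_seconds: int, directives: dict) -> str:
    """Decide what a shared cache (CDN/proxy) does with an entry of the given age."""
    # Shared caches prefer s-maxage over max-age when both are present.
    fresh_for = directives.get("s-maxage", directives.get("max-age", 0))
    if age_seconds < fresh_for:
        return "serve"
    if age_seconds < fresh_for + directives.get("stale-while-revalidate", 0):
        return "serve-stale-and-revalidate"
    return "revalidate"


if __name__ == "__main__":
    header = ("public, max-age=31536000, s-maxage=604800, "
              "stale-while-revalidate=86400, stale-if-error=86400, immutable")
    d = parse_cache_control(header)
    for days in (3, 7.5, 9):
        print(days, "days:", shared_cache_action(int(days * 86400), d))
```

Under these assumptions the proxy serves from cache for a week, serves stale while revalidating in the background for one more day, and after that has to revalidate before answering.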
This is a meta issue for HTTP cache improvements that we should prioritize in go-ipfs:
- max-age on /ipns/ based on TTL – https://github.com/ipfs/go-ipfs/issues/1818
- /ipfs/ should have some reasonable max-age (it won't change within releases)
- /ipns/ responses which contains resolved IPNS (root CID) for faster cache invalidation of entire website when IPNS pointer changes
- X-Ipfs-Roots in https://github.com/ipfs/go-ipfs/pull/8720 a way to indicate all CIDs required for resolving path segments from x-ipfs-path

(cc @thattommyhall & @mathew-cf) if there are more asks/ideas here