Open guseggert opened 3 years ago
This would require a serious refactor of how https://github.com/ipfs/dir-index-html works (which we already want to do, but is a bigger adventure).
Given that we want to improve IPLD support on gateways (https://github.com/ipfs/in-web-browsers/issues/182), we should do this type of thing in a generic way that works for all DAG types, and lazy-load additional Size and Type information when the DAG is unixfs.
I envision replacing unixfs-specific dir-index-html with "IPLD Explorer v2" that shows generic DAG view by default, but has specially-crafted variants for most popular codecs like dag-pb (unixfs) and leverages something like ?format=unixfs-info (https://github.com/ipfs/go-ipfs/issues/8234) for lazy-loading additional metadata about Size and Type only for items visible on the page.
I agree with the bigger picture, but this also seems relatively low effort and addresses an availability risk. Assuming the "quick fix" is straightforward, I think it makes sense to do both (quick fix now, generic fix later).
We should also add some metrics around this, because from what I can tell, we don't have good visibility into how much this contributes to aggregate metrics like latency, TTFB, etc.
If we're just trying to make non-sharded directories faster, https://github.com/ipfs/go-ipfs/issues/8178 is probably a simpler short-term solution.
Eventually, we'll likely need pagination for sharded directories. But we'll need to add the ability to "seek" which will require some design work.
Some stuff discussed with Lidel that might be useful for consideration:
?page=0&limit=100
302 redirect pointing to the first "page"
(<form>?)
The pagination should try to use regular traversal and account for whatever ADLs exist at that point, including HAMTs.
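A minimal sketch of the redirect idea from the notes above, assuming the parameter names page and limit from the example query string. The function name firstPageTarget and the path /ipfs/QmExample/dir/ are hypothetical and not part of go-ipfs. It only computes where a request without a page parameter would be sent; a real handler could pass the result to http.Redirect with http.StatusFound.

```go
package main

import (
	"fmt"
	"net/url"
)

// firstPageTarget returns the URL a directory request should be redirected
// to when it carries no "page" parameter, and false when no redirect is
// needed. The defaults (page=0, limit=100) mirror the example query string
// in the discussion; they are assumptions, not a settled design.
func firstPageTarget(rawURL string) (string, bool) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", false
	}
	q := u.Query()
	if q.Has("page") {
		// The request already names a page: serve it as-is.
		return "", false
	}
	q.Set("page", "0")
	if !q.Has("limit") {
		q.Set("limit", "100")
	}
	// Encode sorts parameters by key, so "limit" precedes "page".
	u.RawQuery = q.Encode()
	return u.String(), true
}

func main() {
	// Hypothetical gateway path, used only for illustration.
	if target, ok := firstPageTarget("/ipfs/QmExample/dir/"); ok {
		fmt.Println("302 ->", target)
	}
}
```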
I also wrote some notes about an alternative approach, which essentially removes the need for pagination: https://github.com/ipfs/go-ipfs/issues/9058 and included both in HTML Gateway specs under best practices section (https://github.com/ipfs/specs/commit/9fc9a9c72fe538ab90b039da5c4025c368e300ba)
Description
To render the directory listing page, go-ipfs sequentially fetches blocks for every directory entry. For large directories, this takes a very long time (see e.g. https://github.com/ipfs/go-ipfs/issues/7588). The gateway should paginate this listing so that there's a reasonable upper bound on the time it takes to return a response, and to allow more even load distribution across gateway fleets. I'd suggest some query args to control page size and offset, with an upper bound of 20 on the page size (this upper bound could be configurable).
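The suggestion above could be sketched roughly as follows. This is an illustration, not the gateway's code: the query arg names offset and limit, the helpers clampPage and pageOf, and the in-memory entry list are all assumptions. The point is only that clamping the page size bounds how many entries (and therefore how many block fetches) a single response can require.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// maxPageSize is the upper bound proposed in the issue; it could be
// made configurable.
const maxPageSize = 20

// clampPage converts raw query values into a safe offset and limit.
// Missing, invalid, or negative offsets become 0; missing, invalid,
// non-positive, or oversized limits become maxPageSize.
func clampPage(rawOffset, rawLimit string) (int, int) {
	offset, err := strconv.Atoi(rawOffset)
	if err != nil || offset < 0 {
		offset = 0
	}
	limit, err := strconv.Atoi(rawLimit)
	if err != nil || limit <= 0 || limit > maxPageSize {
		limit = maxPageSize
	}
	return offset, limit
}

// pageOf returns the entries belonging to the requested page. Only these
// entries would need their blocks fetched to render the listing.
func pageOf(entries []string, offset, limit int) []string {
	if offset >= len(entries) {
		return nil
	}
	end := offset + limit
	if end > len(entries) {
		end = len(entries)
	}
	return entries[offset:end]
}

func main() {
	// Hypothetical request query string asking for an oversized page.
	q, _ := url.ParseQuery("offset=2&limit=500")
	offset, limit := clampPage(q.Get("offset"), q.Get("limit"))

	entries := []string{"a", "b", "c", "d", "e"}
	fmt.Println(offset, limit, pageOf(entries, offset, limit))
}
```

Slicing a flat list only illustrates the bound. For sharded (HAMT) directories, reaching an arbitrary offset would still need the "seek" ability mentioned earlier in the thread.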