storacha / freeway

🛣 Experimental IPFS HTTP gateway providing access to UnixFS data via CAR CIDs.

Pathing beyond valid file doesn't return the file #88

Open rvagg opened 1 year ago

rvagg commented 1 year ago

Related to https://github.com/web3-storage/freeway/issues/84. This is a case where the path can't be fully resolved, but we should still be able to fetch all the blocks up to the point where the path fails. The path resolves to a file, but the request asks for further segments (probably a badly encoded page or a bad client). Freeway returns the root block but not the file block, which is the only link down from the root.

 bafybeiaylvitie6r7n6nbmcvvihkn33ylk5yqcfk7vgjawptn7hlpmwkqa | /bigfish.html/bootstrap.min.css?dag-scope=entity&car-scope=file | 6926
 bafybeiaylvitie6r7n6nbmcvvihkn33ylk5yqcfk7vgjawptn7hlpmwkqa | /bigfish.html/ipfs-404.html?dag-scope=entity&car-scope=file | 4114
 bafybeiaylvitie6r7n6nbmcvvihkn33ylk5yqcfk7vgjawptn7hlpmwkqa | /bigfish.html/jquery-1.js?dag-scope=entity&car-scope=file | 3277
 bafybeiaylvitie6r7n6nbmcvvihkn33ylk5yqcfk7vgjawptn7hlpmwkqa | /bigfish.html/jquery-1.js/ipfs-404.html?dag-scope=entity&car-scope=file
...
curl -v -H 'Accept: application/vnd.ipld.car;version=1;order=dfs;dups=y' 'https://dag.w3s.link:443/ipfs/bafybeiaylvitie6r7n6nbmcvvihkn33ylk5yqcfk7vgjawptn7hlpmwkqa/bigfish.html/bootstrap.min.css?dag-scope=block' -o w3s.car

Gives us the single root bafybeiaylvitie6r7n6nbmcvvihkn33ylk5yqcfk7vgjawptn7hlpmwkqa

curl -v -H 'Accept: application/vnd.ipld.car;version=1;order=dfs;dups=y' 'https://dag.w3s.link:443/ipfs/bafybeiaylvitie6r7n6nbmcvvihkn33ylk5yqcfk7vgjawptn7hlpmwkqa/bigfish.html?dag-scope=block' -o w3s.car

Gives us the root plus the file, which is what it should return.

As I mentioned in https://github.com/web3-storage/freeway/issues/83, we're discussing what to do with unfulfillable paths like this over here, but currently we think that the server should just return what it can, and it's up to the client to verify that the path can't be fully fulfilled. So for Lassie in daemon mode serving these via HTTP like Freeway (and probably Frisbii), we'll just return the blocks that we get and ignore the unfulfilled path segments. But when you run lassie fetch on the command line, we'll give you the CAR along with an error and a non-zero exit code indicating that your path is unfulfilled. Not sure what the gateway wants to do with these, but I assume it should error with a 404.

Regardless, we should get the maximum blocks we can for these paths.
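
To make "the maximum blocks we can" concrete, here is a minimal sketch of a resolver that walks the path one segment at a time, keeps every block it manages to visit, and reports the leftover segments instead of discarding the work done so far. The `DagNode` type, the in-memory `store`, and `resolveAsFarAsPossible` are all hypothetical names for illustration; this is not Freeway's or Lassie's actual code.

```typescript
// Hypothetical in-memory DAG model; real blocks would be decoded UnixFS nodes.
type DagNode = { cid: string; links: Record<string, string> }; // link name -> child CID

interface Resolution {
  blocks: string[];      // CIDs of every block visited, in path order
  unfulfilled: string[]; // trailing path segments that could not be resolved
}

function resolveAsFarAsPossible(
  store: Map<string, DagNode>,
  rootCid: string,
  path: string
): Resolution {
  const segments = path.split("/").filter((s) => s.length > 0);
  const blocks: string[] = [];

  let current: DagNode | undefined = store.get(rootCid);
  if (current === undefined) {
    return { blocks, unfulfilled: segments };
  }
  blocks.push(current.cid);

  const remaining = [...segments];
  while (remaining.length > 0) {
    const segment = remaining[0] as string;
    const childCid: string | undefined = current.links[segment];
    const child = childCid === undefined ? undefined : store.get(childCid);
    if (child === undefined) {
      // Stop here, but keep everything collected up to this point.
      return { blocks, unfulfilled: remaining };
    }
    blocks.push(child.cid);
    current = child;
    remaining.shift();
  }
  return { blocks, unfulfilled: [] };
}

// A root directory with one file, mirroring the shape described in this issue.
const exampleStore = new Map<string, DagNode>([
  ["root", { cid: "root", links: { "bigfish.html": "file" } }],
  ["file", { cid: "file", links: {} }],
]);

const example = resolveAsFarAsPossible(
  exampleStore,
  "root",
  "/bigfish.html/bootstrap.min.css"
);
console.log(example.blocks);      // [ 'root', 'file' ]
console.log(example.unfulfilled); // [ 'bootstrap.min.css' ]
```

With a result shaped like this, the HTTP server can stream `blocks` as the CAR body, while a CLI can additionally turn a non-empty `unfulfilled` list into an error and a non-zero exit code.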

alanshaw commented 1 year ago

In the case where you're pathing past a raw block, ipfs-unixfs-exporter throws before fetching and yielding the raw block:

https://github.com/ipfs/js-ipfs-unixfs/blob/47218799b4c28b0a4b4dc5a71b9d981e5d803424/packages/ipfs-unixfs-exporter/src/resolvers/raw.ts#L29-L46

...and then the stream ends.
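
The ordering matters here: if the resolver throws before yielding, the consumer never sees the raw block. Below is a simplified sketch of the alternative ordering, yielding the block first and only then raising the error for the leftover path. It is not the ipfs-unixfs-exporter code; `RawBlock`, `resolveRawThenFail` and `collectBlocks` are hypothetical names used only to illustrate the idea.

```typescript
// Hypothetical stand-in for a raw leaf block fetched from a blockstore.
type RawBlock = { cid: string; bytes: Uint8Array };

// Yield the raw block first, then fail if path segments are left over.
function* resolveRawThenFail(
  block: RawBlock,
  remaining: string[]
): Generator<RawBlock, void, undefined> {
  yield block;
  if (remaining.length > 0) {
    throw new Error(
      `cannot path into raw block ${block.cid}: /${remaining.join("/")}`
    );
  }
}

// Consumer that keeps whatever was yielded before the error surfaced.
function collectBlocks(
  block: RawBlock,
  remaining: string[]
): { cids: string[]; error?: Error } {
  const cids: string[] = [];
  try {
    for (const b of resolveRawThenFail(block, remaining)) {
      cids.push(b.cid);
    }
  } catch (err) {
    return { cids, error: err instanceof Error ? err : new Error(String(err)) };
  }
  return { cids };
}

const outcome = collectBlocks(
  { cid: "raw", bytes: new Uint8Array() },
  ["ipfs-404.html"]
);
console.log(outcome.cids);           // [ 'raw' ]
console.log(outcome.error?.message); // cannot path into raw block raw: /ipfs-404.html
```

Because the generator is lazy, the throw only happens when the consumer asks for the next item, so the block has already been handed over by the time the error is raised.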