storacha / freeway

🛣 Experimental IPFS HTTP gateway providing access to UnixFS data via CAR CIDs.

Lassie compatible http api for fetching CARs #34

Open olizilla opened 1 year ago

olizilla commented 1 year ago

Offer an HTTP API that Lassie / Saturn could use to fetch CARs from us. Tweak our existing CAR responses to match the Lassie spec.

We already support CAR responses; we just need to tweak the existing code to write the traversed blocks to the CAR so that they are verifiable, and to send subsets of the full DAG when directed to by the ?car-scope param.
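As a rough illustration of the request shape described above, here is a minimal Python sketch that builds a CAR request URL carrying the `car-scope` query parameter. The `/ipfs/{cid}/{path}` layout, the gateway base URL, the `Accept` header value, and the `all` and `block` scope values are assumptions taken from general gateway conventions and the Lassie spec of the time; only `file` is confirmed in this thread. `car_request_url` is a hypothetical helper, not part of freeway or Lassie.

```python
from urllib.parse import quote, urlencode

# Assumed scope values; only "file" appears in this thread.
CAR_SCOPES = ("all", "file", "block")

# Assumed media type for requesting a CAR response.
CAR_ACCEPT_HEADER = {"Accept": "application/vnd.ipld.car"}


def car_request_url(gateway: str, cid: str, path: str = "", car_scope: str = "all") -> str:
    """Build a gateway URL asking for a CAR of `cid` (plus optional path)."""
    if car_scope not in CAR_SCOPES:
        raise ValueError(f"unsupported car-scope: {car_scope!r}")
    # Percent-encode each path segment but keep the slashes between them.
    segments = [quote(segment, safe="") for segment in path.split("/") if segment]
    url_path = "/".join(["ipfs", cid, *segments])
    query = urlencode({"car-scope": car_scope})
    return f"{gateway.rstrip('/')}/{url_path}?{query}"


if __name__ == "__main__":
    print(car_request_url(
        "https://freeway.dag.haus",  # assumed base URL for illustration
        "bafybeihetumsy24mjzbts7t4vpft2rwput44joxxnzxhfh5woq6z46fe2y",
        "johnny-5-is-cowboy.jpg",
        "file",
    ))
```

A client would send the resulting URL together with `CAR_ACCEPT_HEADER`; the server side then decides which blocks to write based on the scope.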

### Tasks
- [ ] #32
- [x] Ensure expected block ordering (see: https://github.com/filecoin-project/lassie/blob/main/docs/CAR.md#block-deduplication)
- [ ] #33
- [ ] https://github.com/web3-storage/w3up/issues/786
- [x] Test against lassie
- [ ] Dedupe repeated blocks in CAR response (see: https://github.com/filecoin-project/lassie/blob/main/docs/CAR.md#block-deduplication)
- [ ] Ensure Identity CIDs are not included as blocks (see: https://github.com/filecoin-project/lassie/blob/main/docs/CAR.md#dag-depth)
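The dedupe and identity-CID tasks above can be sketched as a filter over the blocks in traversal order. This is a minimal illustration in Python, not freeway's implementation: `is_identity_cid` and `car_blocks` are hypothetical names, and the CID parsing only handles base32 CIDv1 strings (prefix `b`) and CIDv0 strings (prefix `Qm`), which is an assumption made to keep the sketch dependency-free.

```python
import base64
from typing import Iterable, Iterator, Tuple

IDENTITY_MULTIHASH_CODE = 0x00  # multihash code of the "identity" hash


def _read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Read one unsigned varint from buf at pos; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte < 0x80:
            return value, pos
        shift += 7


def is_identity_cid(cid: str) -> bool:
    """True if the CID's multihash uses the identity hash function."""
    if cid.startswith("Qm") and len(cid) == 46:
        return False  # CIDv0 always uses sha2-256
    if not cid.startswith("b"):
        raise ValueError("this sketch only handles base32 CIDv1 and CIDv0 strings")
    body = cid[1:].upper()
    raw = base64.b32decode(body + "=" * (-len(body) % 8))
    _version, pos = _read_varint(raw, 0)
    _codec, pos = _read_varint(raw, pos)
    hash_code, _ = _read_varint(raw, pos)
    return hash_code == IDENTITY_MULTIHASH_CODE


def car_blocks(traversed: Iterable[Tuple[str, bytes]]) -> Iterator[Tuple[str, bytes]]:
    """Yield blocks in traversal order, skipping repeats and identity CIDs."""
    seen = set()
    for cid, data in traversed:
        if cid in seen or is_identity_cid(cid):
            continue
        seen.add(cid)
        yield cid, data
```

Keeping the filter as a generator preserves the traversal order of the input, so the ordering task is unaffected by the other two.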

willscott commented 1 year ago

olizilla commented 1 year ago

@willscott is "Announce HTTP endpoint as an extended family member to the indexer" required before we can test against lassie?

willscott commented 1 year ago

I don't think so!

Lassie as a CLI App allows retrieval against a manually specified provider endpoint, skipping the indexer lookup.

olizilla commented 1 year ago

ah, i see, something something lassie fetch --providers...

willscott commented 1 year ago

you'll want to be on the http retrieval branch https://github.com/filecoin-project/lassie/pull/204 in order to test

cc @rvagg

olizilla commented 1 year ago

Success!? ✨ 🎷 🐩

~/Code/filecoin-project/lassie on rvagg/http  
❯ ./lassie fetch --providers /dns4/freeway.dag.haus/tcp/443/https/p2p/bafzbeibhqavlasjc7dvbiopygwncnrtvjd2xmryk5laib7zyjor6kf3avm bafybeihetumsy24mjzbts7t4vpft2rwput44joxxnzxhfh5woq6z46fe2y
Fetching bafybeihetumsy24mjzbts7t4vpft2rwput44joxxnzxhfh5woq6z46fe2y from [{QmQzqxhK82kAmKvARFZSkUVS6fo9sySaiogAnx5EnZ6ZmC: [/dns4/freeway.dag.haus/tcp/443/https]}]...........
Fetched [bafybeihetumsy24mjzbts7t4vpft2rwput44joxxnzxhfh5woq6z46fe2y] from [QmQzqxhK82kAmKvARFZSkUVS6fo9sySaiogAnx5EnZ6ZmC]:
    Duration: 809.277584ms
      Blocks: 11
       Bytes: 3.2 MiB

❯ ipfs-car --list-full bafybeihetumsy24mjzbts7t4vpft2rwput44joxxnzxhfh5woq6z46fe2y.car
bafybeihetumsy24mjzbts7t4vpft2rwput44joxxnzxhfh5woq6z46fe2y bafybeihetumsy24mjzbts7t4vpft2rwput44joxxnzxhfh5woq6z46fe2y
bafkreidpr7zz5yflyocmglpref5vvx4yglo3zmihh3mpiaezecgrggqwiq bafybeihetumsy24mjzbts7t4vpft2rwput44joxxnzxhfh5woq6z46fe2y/johnny-5-goes-camping.jpg
...etc

Minor: you get an error if you try to provide a more succinct HTTP-flavoured multiaddr without the p2p part, like /dns4/freeway.dag.haus/tcp/443/https

~/Code/filecoin-project/lassie on rvagg/http  
❯ ./lassie fetch --providers /dns4/freeway.dag.haus/tcp/443/https bafybeihetumsy24mjzbts7t4vpft2rwput44joxxnzxhfh5woq6z46fe2y
2023-05-03T12:24:48.376+0100    FATAL   lassie  lassie/main.go:57   invalid p2p multiaddr
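For reference, the HTTP endpoint can be derived from the multiaddr without the `/p2p/...` part. The following Python sketch splits a provider multiaddr into a base URL and an optional peer ID. It is only an illustration of the address format used in the commands above, not Lassie's parser: `parse_http_provider` is a hypothetical helper and the set of supported protocols is an assumption.

```python
from typing import Optional, Tuple

HOST_PROTOCOLS = {"dns", "dns4", "dns6", "ip4"}
SCHEME_PROTOCOLS = {"http": 80, "https": 443}  # protocols that carry no value


def parse_http_provider(multiaddr: str) -> Tuple[str, Optional[str]]:
    """Return (base_url, peer_id) for an HTTP-flavoured multiaddr string."""
    tokens = [token for token in multiaddr.split("/") if token]
    host = port = scheme = peer_id = None
    i = 0
    while i < len(tokens):
        proto = tokens[i]
        if proto in SCHEME_PROTOCOLS:
            scheme = proto
            i += 1
            continue
        if i + 1 >= len(tokens):
            raise ValueError(f"missing value for /{proto}")
        value = tokens[i + 1]
        if proto in HOST_PROTOCOLS:
            host = value
        elif proto == "tcp":
            port = int(value)
        elif proto == "p2p":
            peer_id = value
        else:
            raise ValueError(f"unsupported protocol: /{proto}")
        i += 2
    if host is None or port is None or scheme is None:
        raise ValueError("expected /<host proto>/<host>/tcp/<port>/<http|https>")
    # Omit the port from the URL when it is the default for the scheme.
    netloc = host if port == SCHEME_PROTOCOLS[scheme] else f"{host}:{port}"
    return f"{scheme}://{netloc}", peer_id
```

With this sketch, the short form `/dns4/freeway.dag.haus/tcp/443/https` parses to the same base URL as the long form, with the peer ID left as `None`.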

olizilla commented 1 year ago

cid+path works

./lassie fetch --providers /dns4/freeway.dag.haus/tcp/443/https/p2p/bafzbeibhqavlasjc7dvbiopygwncnrtvjd2xmryk5laib7zyjor6kf3avm bafybeihetumsy24mjzbts7t4vpft2rwput44joxxnzxhfh5woq6z46fe2y/johnny-5-is-cowboy.jpg
Fetching bafybeihetumsy24mjzbts7t4vpft2rwput44joxxnzxhfh5woq6z46fe2y/johnny-5-is-cowboy.jpg from [{QmQzqxhK82kAmKvARFZSkUVS6fo9sySaiogAnx5EnZ6ZmC: [/dns4/freeway.dag.haus/tcp/443/https]}]..
Fetched [bafybeihetumsy24mjzbts7t4vpft2rwput44joxxnzxhfh5woq6z46fe2y] from [QmQzqxhK82kAmKvARFZSkUVS6fo9sySaiogAnx5EnZ6ZmC]:
    Duration: 403.931916ms
      Blocks: 2
       Bytes: 443 KiB

olizilla commented 1 year ago

--car-scope file is working: only the directory block is returned for a UnixFS dir.

❯ ./lassie fetch --providers /dns4/freeway.dag.haus/tcp/443/https/p2p/bafzbeibhqavlasjc7dvbiopygwncnrtvjd2xmryk5laib7zyjor6kf3avm --car-scope file bafybeihetumsy24mjzbts7t4vpft2rwput44joxxnzxhfh5woq6z46fe2y
Fetching bafybeihetumsy24mjzbts7t4vpft2rwput44joxxnzxhfh5woq6z46fe2y from [{QmQzqxhK82kAmKvARFZSkUVS6fo9sySaiogAnx5EnZ6ZmC: [/dns4/freeway.dag.haus/tcp/443/https]}].
Fetched [bafybeihetumsy24mjzbts7t4vpft2rwput44joxxnzxhfh5woq6z46fe2y] from [QmQzqxhK82kAmKvARFZSkUVS6fo9sySaiogAnx5EnZ6ZmC]:
    Duration: 390.517625ms
      Blocks: 1
       Bytes: 681 B

❯ ipfs-car --list-cids bafybeihetumsy24mjzbts7t4vpft2rwput44joxxnzxhfh5woq6z46fe2y.car 
bafybeihetumsy24mjzbts7t4vpft2rwput44joxxnzxhfh5woq6z46fe2y    
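The scope behaviour seen above (one block for a directory under `file` scope) can be modelled on a toy DAG. This Python sketch is an illustration under assumptions: `Node` and `select_blocks` are hypothetical, the `all` and `block` semantics follow my reading of the Lassie spec rather than anything shown in this thread, and sharded (HAMT) directories are not modelled.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    kind: str  # "dir", "file" or "raw"
    links: List[str] = field(default_factory=list)


def select_blocks(dag: Dict[str, Node], root: str, car_scope: str = "all") -> List[str]:
    """Return the CIDs to write to the CAR for the given scope, in order."""
    if car_scope not in ("all", "file", "block"):
        raise ValueError(f"unsupported car-scope: {car_scope!r}")
    if car_scope == "block":
        return [root]  # only the block the request resolves to
    if car_scope == "file" and dag[root].kind == "dir":
        return [root]  # a directory under "file" scope: just its own block
    # "all", or "file" on a file: depth-first pre-order, each block once.
    order: List[str] = []
    seen = set()
    stack = [root]
    while stack:
        cid = stack.pop()
        if cid in seen:
            continue
        seen.add(cid)
        order.append(cid)
        stack.extend(reversed(dag[cid].links))
    return order
```

For a directory holding one multi-block file and one raw leaf, `file` scope on the directory returns a single CID, while `file` scope on the file returns the file root followed by its chunks.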

olizilla commented 1 year ago

How to announce what we have over HTTP to the indexers needs some discussion, as we only support DAG roots via HTTP today: https://github.com/web3-storage/w3up/issues/786