storacha / w3up

⁂ w3up protocol implementation
https://github.com/storacha-network/specs

Wire index/add handler to write derived DUDEWHERE index #1402

Closed Gozala closed 5 months ago

Gozala commented 6 months ago

Context

Freeway uses a set of R2 buckets to provide a read interface.

How it works:

  1. Extract DATA_CID from URL.
  2. Lookup CAR_CID(s) in DUDEWHERE.
  3. Read indexes from SATNAV.
  4. UnixFS export directly from CARPARK using index data to locate block positions.
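
The lookup chain above can be sketched roughly as follows. This is a minimal illustration, not Freeway's actual code: the three buckets are modelled as in-memory maps, the `/ipfs/<cid>` URL shape is an assumption, and `extractDataCid`, `readBlock` and `BlockPosition` are hypothetical names. The UnixFS export itself is left out; only the block location step is shown.

```typescript
// Hypothetical in-memory stand-ins for the three R2 buckets.
type BlockPosition = { offset: number; length: number };

const DUDEWHERE = new Map<string, string[]>(); // data CID -> CAR CIDs
const SATNAV = new Map<string, Map<string, BlockPosition>>(); // CAR CID -> (block CID -> position)
const CARPARK = new Map<string, Uint8Array>(); // CAR CID -> CAR bytes

// Step 1: extract DATA_CID from a URL of the assumed form /ipfs/<cid>/...
function extractDataCid(url: string): string {
  const match = new URL(url).pathname.match(/^\/ipfs\/([^/]+)/);
  if (!match) throw new Error(`no data CID in ${url}`);
  return match[1];
}

// Steps 2-4: find the CAR(s) for the DAG, look up the block position in the
// CAR's index, then slice the block bytes straight out of the CAR.
function readBlock(dataCid: string, blockCid: string): Uint8Array {
  const carCids = DUDEWHERE.get(dataCid) ?? [];
  for (const carCid of carCids) {
    const position = SATNAV.get(carCid)?.get(blockCid);
    const car = CARPARK.get(carCid);
    if (position && car) {
      return car.subarray(position.offset, position.offset + position.length);
    }
  }
  throw new Error(`block ${blockCid} not found for ${dataCid}`);
}
```

The point of the sketch is that a missing DUDEWHERE record makes step 2 return nothing, so the content cannot be located even if the CAR and its index exist.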

What

Derive and write DUDEWHERE index records from the DAG index passed into #1401.
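
As a rough sketch of what deriving these records could look like: the `DagIndex` shape below is a simplified, hypothetical view of the index (a content root plus shard identifiers), and the `<data CID>/<shard>` key layout is an assumption based on the discussion in this thread, not a confirmed format.

```typescript
// Hypothetical, simplified view of a DAG index: the content root plus
// identifiers of the shards that hold its blocks.
type DagIndex = { content: string; shards: string[] };

// Assumed key layout: one DUDEWHERE record per shard, "<data CID>/<shard>".
function deriveDudewhereKeys(index: DagIndex): string[] {
  return index.shards.map((shard) => `${index.content}/${shard}`);
}
```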

Why

Otherwise content uploaded through blob/add will not be readable via Freeway.

vasco-santos commented 6 months ago

Please note: today we write these indexes on upload/add https://github.com/w3s-project/w3up/blob/main/packages/upload-api/src/upload/add.js#L40

My first inclination was that we should remove that from upload/add and add it here. But then we break the old store/add flow. We will probably need to consider having upload/add receive both CARLinks and multihashes as shards, and distinguish what to do in each case: one behaves as it does today, while the new one writes the b58btc-encoded multihash after the dataCID.

What do you think @Gozala ?

Gozala commented 6 months ago

I have not considered upload/add, and now I wonder if index/add subsumes that functionality or if we do need both 🤔 In terms of what to do, I see the following options to choose from:

  1. We issue upload/add without any shards from new clients as they will be using blob/add & index/add anyway.
  2. We issue upload/add but with RAW CID shards, and then in the handler we can omit non-CAR links.
  3. We don't do any upload/add, but surface things added via index/add from upload/list.
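
For the second option, the handler-side filtering could look something like this. It is only a sketch: `ShardLink` is a hypothetical minimal shape rather than the real link type used by the upload API, and the codec codes are the multicodec values for `car` (0x0202) and `raw` (0x55).

```typescript
// Multicodec codes for the two kinds of shard link discussed here.
const CAR_CODE = 0x0202;
const RAW_CODE = 0x55;

// Hypothetical minimal link shape: just the codec code and a string CID.
type ShardLink = { code: number; cid: string };

// Keep only CAR links, so the legacy index writes ignore RAW shards.
function carShards(shards: ShardLink[]): ShardLink[] {
  return shards.filter((shard) => shard.code === CAR_CODE);
}
```

With this shape, old clients sending CAR shards behave as before, while shards from blob/add clients pass through without triggering the legacy writes.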

From where I stand the first option seems the most rational, but given a good argument I can see the second as a good candidate too. The third option seems too drastic, and I would prefer to do the 1st or 2nd now and consider doing the 3rd in a future follow-up.

Gozala commented 6 months ago

Thinking a bit more about it, I think in the future the upload list should simply be a list of CBOR objects like { root: Link, parts: Link<BlobAddReceipt>[] }, where root is a DAG root and parts are receipts for its parts. Perhaps parts should be links to invocations instead of receipts, so it could also represent in-progress uploads.
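
A minimal sketch of the invocation-linked variant, to show how in-progress uploads would fall out of it. Everything here is hypothetical: links are modelled as plain strings, and `UploadEntry` and `isComplete` are illustrative names, not an agreed schema.

```typescript
// Stand-in for a CID link; a real implementation would use a proper link type.
type Link = string;

// One upload list entry: a DAG root plus links to the blob/add invocations
// for its parts (the variant suggested above, instead of linking receipts).
interface UploadEntry {
  root: Link;
  parts: Link[];
}

// An upload is complete once every part invocation has a receipt;
// otherwise it is still in progress.
function isComplete(entry: UploadEntry, receipts: Map<Link, unknown>): boolean {
  return entry.parts.every((part) => receipts.has(part));
}
```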

alanshaw commented 5 months ago

Note: we decided not to do this; Freeway uses materialized location claims instead.