storacha-network / w3infra

🏗️ Infra for the w3up UCAN protocol implementation

feat: roundabout gets raw cids as blobs #359

Closed vasco-santos closed 3 months ago

vasco-santos commented 3 months ago

Part of https://github.com/w3s-project/project-tracking/issues/49

Note that Roundabout currently serves production traffic for SPs downloading Piece bytes, and the w3filecoin storefront is planned to use it to validate a Piece CID.

SP reads

  1. An SP's request comes with a PieceCID, for which we look up the equivalency claim that maps this Piece to some content.
  2. In the current world (store/* protocol), the content will in most cases be a CAR CID that we can get from the R2 bucket carpark-prod-0 as carCid/carCid.car. However, store/add does not actually require the content to be a CAR, so it could be another kind of CID that is still stored with the same key format in the R2 bucket.
  3. In the new world (blob/* protocol), it will be a RAW CID that we can get from R2 carpark-prod-0 as b58btc(multihash)/b58btc(multihash).blob.
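The two key formats above can be sketched as a pair of small helpers. This is only an illustration: the function names `storeKey` and `blobKey` are hypothetical and not taken from w3infra, and the multihash is assumed to arrive already encoded as a base58btc string.

```typescript
// Hypothetical helpers illustrating the two R2 key formats described above.
// The names are illustrative only and are not part of the w3infra codebase.

/** Key format used by the store/* protocol: `<cid>/<cid>.car`. */
function storeKey(cid: string): string {
  return `${cid}/${cid}.car`
}

/**
 * Key format used by the blob/* protocol:
 * `<b58btc(multihash)>/<b58btc(multihash)>.blob`.
 * Assumes the caller has already base58btc-encoded the multihash.
 */
function blobKey(b58Multihash: string): string {
  return `${b58Multihash}/${b58Multihash}.blob`
}
```

Both helpers repeat the identifier as the "directory" and as the object name, so the only differences between the two worlds are which identifier is used and the file extension.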

w3filecoin reads

  1. filecoin/offer is performed with a given content CID.
  2. In the current client world, a CarCID is provided on filecoin/offer. This CID is used to get the bytes for the content, in order to derive the Piece for validation. In addition, the equivalency claim is issued with the CarCID.
  3. In the new world, we aim to have filecoin/offer rely on RAW CIDs, which will be used both for reading content and for issuing equivalency claims.

This PR

We need a transition period where we support both worlds.

This PR enables Roundabout to attempt to distinguish between a Blob and a CAR when it gets a retrieval request. If the requested CID is a CAR (or a Piece that equals a CAR), we can assume the old path and key format immediately. On the other hand, if the requested CID is RAW, we may need to give back either a Blob object or a "CAR"-like stored object.

For the transition period, this PR proposes that if we have RAW content to locate, we MUST do a HEAD request to see whether a Blob exists and, if so, redirect to a presigned URL for it. Otherwise, we need to fall back to the old key format. As an alternative, we could decide to make the store/add handler no longer accept non-CAR CIDs, even though we would lose the ability to retrieve old things from Roundabout (which may be fine as well 🤔).
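A minimal sketch of that decision, under stated assumptions: the `Lookup` shape and the function names are hypothetical, the CID kind is assumed to be already known, and the outcome of the HEAD request is passed in as a boolean so the logic stays free of any particular R2 or S3 client.

```typescript
// Hypothetical sketch of the transition-period lookup; not the actual Roundabout code.
type CidKind = "car" | "raw"

interface Lookup {
  kind: CidKind
  cid: string          // string form of the requested content CID
  b58Multihash: string // base58btc-encoded multihash of that CID (assumed precomputed)
}

/** Keys to consider, in priority order. */
function candidateKeys(l: Lookup): string[] {
  const legacyKey = `${l.cid}/${l.cid}.car`
  // A CAR CID can only live under the old store/* key format.
  if (l.kind === "car") return [legacyKey]
  // A RAW CID may be a Blob, or older content stored under the old key format.
  return [`${l.b58Multihash}/${l.b58Multihash}.blob`, legacyKey]
}

/**
 * Picks the key to presign. `blobHeadOk` stands in for the result of the
 * HEAD request against the Blob key; it is ignored for CAR CIDs.
 */
function pickKey(l: Lookup, blobHeadOk: boolean): string {
  const keys = candidateKeys(l)
  if (l.kind === "raw" && blobHeadOk) return keys[0]
  // Fall back to the old key format (the last candidate).
  return keys[keys.length - 1]
}
```

With this shape, a CAR request never costs a HEAD round trip, while a RAW request costs exactly one before the redirect is issued.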

Please note that this is still not hooked up to content claims to figure out which bucket to use, and still relies on the assumption of CF R2 carpark-prod-0. It only uses equivalency claims to map a PieceCID to a ContentCID.

seed-deploy[bot] commented 3 months ago

View stack outputs

- **pr359-w3infra-BillingDbStack**

  Name | Value
  -- | --
  customerTableName | pr359-w3infra-customer
  spaceDiffTableName | pr359-w3infra-space-diff
  spaceSnapshotTableName | pr359-w3infra-space-snapshot
  usageTable | pr359-w3infra-usage

- **pr359-w3infra-BillingStack**

  Name | Value
  -- | --
  ApiEndpoint | https://awazcqnrb9.execute-api.us-east-2.amazonaws.com
  billingCronHandlerURL | https://qaatxsq2wj7753vtppxvhxel2a0isjht.lambda-url.us-east-2.on.aws/
  CustomDomain | https://pr359.billing.web3.storage

- **pr359-w3infra-CarparkStack**

  Name | Value
  -- | --
  BucketName | carpark-pr359-0
  Region | us-east-2

- **pr359-w3infra-RoundaboutStack**

  Name | Value
  -- | --
  ApiEndpoint | https://kjzhvk5h2j.execute-api.us-east-2.amazonaws.com
  CustomDomain | https://pr359.roundabout.web3.storage

- **pr359-w3infra-SatnavStack**

  Name | Value
  -- | --
  BucketName | satnav-pr359-0
  Region | us-east-2

- **pr359-w3infra-UploadApiStack**

  Name | Value
  -- | --
  ApiEndpoint | https://q4ieky41li.execute-api.us-east-2.amazonaws.com
  CustomDomain | https://pr359.up.web3.storage

- **pr359-w3infra-BusStack**
- **pr359-w3infra-FilecoinStack**
- **pr359-w3infra-ReplicatorStack**
- **pr359-w3infra-UcanFirehoseStack**
- **pr359-w3infra-UcanInvocationStack**
- **pr359-w3infra-UploadDbStack**

Gozala commented 3 months ago

Created a PR to restrict the store API to CARs, so that no new non-CAR content can be stored via the store API: https://github.com/w3s-project/w3up/pull/1415