storacha-network / w3infra

🏗️ Infra for the w3up UCAN protocol implementation

Expire carpark objects after replicator runs #330

Closed vasco-santos closed 2 months ago

vasco-santos commented 6 months ago

After we successfully replicate a CAR from S3 to R2, we should be able to safely GC the S3 object once https://github.com/web3-storage/w3infra/pull/323 is merged.

All our read interfaces use R2 directly today, even Hoverboard (which sees block-level indexes and attempts the R2 bucket before the actual index present in S3). The only risk is that the indexer lambda fails over and over again until we GC the file. That is a problem we would have anyway, since we would likely not know that it failed and it would end up getting expired out of the queue.

We can set a lifecycle using the JS API, such as:

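// objectKey is the key of the CAR that was just replicated to R2.
const lifecycleConfiguration = {
  Rules: [
    {
      ID: 'ExpireRuleReplicated',
      // Filter.Prefix replaces the deprecated rule-level Prefix field.
      Filter: { Prefix: objectKey },
      Status: 'Enabled',
      Expiration: {
        // Days are counted from the object's creation time.
        Days: 5,
      },
    },
  ],
};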

to expire the object after replication. Sadly, this would need a prefix / lifecycle rule per new file... Alternatively, we could create a single rule that covers every file that gets into carpark-prod-0, although if the Replicator fails for some reason we would end up in a critical state.
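As a rough sketch of the per-file approach, the helpers below build one expiration rule per replicated key and merge it into a bucket's existing rules. `buildExpireRule` and `withRule` are hypothetical names for illustration, not part of w3infra. S3 replaces the whole lifecycle configuration on every put and allows at most 1,000 rules per bucket, so the merge step refuses to grow past that cap, which shows why a rule per file would not scale.

```javascript
// Hypothetical helpers for illustration only; not part of w3infra.
const MAX_RULES_PER_BUCKET = 1000 // S3 limit on lifecycle rules per bucket

/** Build a rule that expires a single object `days` after its creation. */
function buildExpireRule (objectKey, days = 5) {
  return {
    // Rule IDs are limited to 255 characters; truncation could collide
    // for very long keys, which this sketch does not handle.
    ID: `ExpireReplicated-${objectKey}`.slice(0, 255),
    Filter: { Prefix: objectKey },
    Status: 'Enabled',
    Expiration: { Days: days }
  }
}

/** Return a new rule list with `rule` added, replacing any rule with the same ID. */
function withRule (existingRules, rule) {
  const others = existingRules.filter(r => r.ID !== rule.ID)
  if (others.length + 1 > MAX_RULES_PER_BUCKET) {
    throw new Error(`bucket would exceed ${MAX_RULES_PER_BUCKET} lifecycle rules`)
  }
  return [...others, rule]
}
```

The resulting list could be sent as `LifecycleConfiguration: { Rules }` with `PutBucketLifecycleConfigurationCommand` from `@aws-sdk/client-s3`, after reading the current rules with `GetBucketLifecycleConfigurationCommand`.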

alanshaw commented 2 months ago

I created a lifecycle rule that expires objects 30 days after creation.

We have no read path that uses the data in carpark, and with the introduction of the blob protocol we have deprecated writing to AWS carpark and will eventually drop it altogether.
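For reference, a bucket-wide rule like the one described above might look like the sketch below. The rule ID is an assumption, since the actual rule name is not given here. An empty prefix filter applies the rule to every object in the bucket. `estimateExpiry` is a hypothetical helper that approximates S3's documented behaviour of adding the configured days to the creation time and rounding up to the next midnight UTC.

```javascript
// Assumed shape of a single bucket-wide rule; the ID is hypothetical.
const bucketWideExpiry = {
  Rules: [
    {
      ID: 'ExpireAfter30Days',
      Filter: { Prefix: '' }, // empty prefix matches all objects
      Status: 'Enabled',
      Expiration: { Days: 30 }
    }
  ]
}

const DAY_MS = 24 * 60 * 60 * 1000

/**
 * Approximate when S3 treats an object as expired: creation time plus
 * `days`, rounded up to the next midnight UTC. Actual deletion happens
 * asynchronously and may lag behind this time.
 */
function estimateExpiry (createdAt, days) {
  const eligible = createdAt.getTime() + days * DAY_MS
  const midnightBefore = Math.floor(eligible / DAY_MS) * DAY_MS
  return new Date(midnightBefore + DAY_MS)
}
```

With this rule, no per-file bookkeeping is needed, at the cost of keeping every object in S3 for the full 30 days regardless of when replication finished.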