filecoin-project / lotus

Reference implementation of the Filecoin protocol, written in Go
https://lotus.filecoin.io/

Implement a lotus-shed migration command to migrate existing indexes to chain indexer #12408

Open akaladarshi opened 2 months ago

akaladarshi commented 2 months ago


What is the motivation behind this feature request? Is your feature request related to a problem? Please describe.

Following the discussion in #12293 and the changes in the PR, the decision was made to replace the fragmented indexes (msg, txhash, events) with a single index (ChainIndexer). The existing indexes therefore need to be migrated to ChainIndexer.

Describe the solution you'd like

Create a lotus-shed migration command that migrates all existing indexes to ChainIndexer.

Describe alternatives you've considered

No response

Additional context

No response

rvagg commented 2 months ago

Right now @aarshkshah1992 and I are thinking that lotus-shed is the best place for this to live, for a few reasons:

All that being said, we may end up deciding this isn't a great idea and that the migration should happen inline during the upgrade. Perhaps if we couple it with new GC settings, the migration doesn't have to be expensive at all, because we would only migrate data newer than the point where a GC would delete your data anyway.

So for now, lotus-shed is a good place to work on this; we may end up moving it into the main daemon process later if we decide that's a better strategy.
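To make the GC-coupled idea concrete, here is a minimal sketch in Go of how a migration could pick its starting point from a retention window. Everything in it is an assumption for illustration: the `Epoch` type, the `migrationStart` function, and the 30-day retention value are hypothetical and do not come from Lotus or from any agreed GC setting. The only outside fact used is that Filecoin epochs are 30 seconds long, giving 2880 epochs per day.

```go
package main

import "fmt"

// Epoch is a hypothetical stand-in for a chain epoch (block height).
type Epoch int64

// migrationStart returns the oldest epoch worth migrating, given the current
// head and a GC retention window measured in epochs. A non-positive retention
// means "no GC, keep everything", so the migration starts at genesis (0).
func migrationStart(head, retention Epoch) Epoch {
	if retention <= 0 || retention > head {
		return 0
	}
	return head - retention
}

func main() {
	head := Epoch(4_000_000)
	retention := Epoch(2880 * 30) // assumed: roughly 30 days at 2880 epochs per day
	fmt.Println("migrate from epoch", migrationStart(head, retention))
}
```

With a cutoff like this, anything older than the retention window is skipped, which is what would keep the migration cheap.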

akaladarshi commented 2 months ago

@rvagg

In a call with @aarshkshah1992, we decided to go with a combined migration-and-backfill command.

It will start from the chain head and backfill the new ChainIndexer database. The command will backfill from the main chain store, not from the existing index DBs, which ensures the new indexes contain correct data (nothing that has already been GC'ed or pruned).

aarshkshah1992 commented 2 months ago

@rvagg Basically, we want to use the chain store/chain state to "migrate" rather than using the existing Indices to migrate.

This is because the existing Indices might have a lot of entries for which the corresponding state has already been GC'd, and also because the old DDLs don't map 1:1 onto the new DDL.

Using the chainstore/chainstate as the source of truth for migrating/backfilling the Indices makes more sense and will get us a consistent Index.
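As a rough illustration of the backfill approach discussed in this thread, the sketch below walks from the chain head towards genesis and indexes only what the chain store can still serve. It is not the Lotus implementation: `TipSet`, `ChainStore`, `Index`, and `Backfill` are hypothetical simplified types (tipsets keyed by strings, messages as plain strings), and stopping at the first missing tipset is an assumed policy for handling GC'd history.

```go
package main

import (
	"errors"
	"fmt"
)

// All types here are hypothetical stand-ins for illustration, not Lotus APIs.

// ErrNotFound signals that the chain store no longer holds the requested data.
var ErrNotFound = errors.New("not found in chain store")

// TipSet is a simplified tipset: a height, a parent key, and message IDs.
type TipSet struct {
	Height   int64
	Parent   string   // key of the parent tipset; "" at genesis
	Messages []string // message CIDs, simplified to strings
}

// ChainStore is the source of truth; lookups fail for GC'd or pruned data.
type ChainStore map[string]TipSet

// Load returns the tipset stored under key, or ErrNotFound.
func (cs ChainStore) Load(key string) (TipSet, error) {
	ts, ok := cs[key]
	if !ok {
		return TipSet{}, ErrNotFound
	}
	return ts, nil
}

// Index maps a message ID to the height of the tipset that contains it.
type Index map[string]int64

// Backfill walks from head towards genesis and indexes every tipset it can
// load. It stops cleanly at the first missing tipset, so the index never
// contains entries whose backing data is gone. It returns the number of
// tipsets indexed.
func Backfill(cs ChainStore, head string, idx Index) (int, error) {
	n := 0
	for key := head; key != ""; {
		ts, err := cs.Load(key)
		if errors.Is(err, ErrNotFound) {
			return n, nil // reached history that is no longer available
		}
		if err != nil {
			return n, err
		}
		for _, m := range ts.Messages {
			idx[m] = ts.Height
		}
		n++
		key = ts.Parent
	}
	return n, nil
}

func main() {
	cs := ChainStore{
		"t3": {Height: 3, Parent: "t2", Messages: []string{"m5", "m6"}},
		"t2": {Height: 2, Parent: "t1", Messages: []string{"m4"}},
		// "t1" is intentionally absent to simulate GC'd history.
	}
	idx := Index{}
	n, err := Backfill(cs, "t3", idx)
	fmt.Println(n, len(idx), err)
}
```

Because every entry is derived from data the chain store actually returned, the resulting index cannot reference state that was already GC'd, which is the consistency property argued for above.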