ChainSafe / forest

🌲 Rust Filecoin Node Implementation
https://forest.chainsafe.io
Apache License 2.0
638 stars 160 forks source link

Garbage collector should not collect manifests #4926

Open lemmih opened 1 month ago

lemmih commented 1 month ago

Describe the bug

Each network upgrade has a manifest. This manifest is stored in the blockstore on start-up. The manifests are unreachable from the blockstore and will, therefore, get garbage collected. The missing manifests will cause the next network upgrade to fail.

To reproduce

Run forest continuously for a week before a network upgrade.

Log output

Log Output ``` 2024-10-23T14:26:48.764982Z ERROR forest_filecoin::chain_sync::tipset_syncer: Syncing tipset range [2078795, 2078905] failed: Validation error: Processing error: Could not update state: manifest for network version NV24 not found in blockstore: bafy2bzaceax5zkysst7vtyup4whdxwzlpnaya3qp34rnoi6gyt4pongps7obw, Processing error: Failed to calculate state: manifest for network version NV24 not found in blockstore: bafy2bzaceax5zkysst7vtyup4whdxwzlpnaya3qp34rnoi6gyt4pongps7obw ```

Expected behaviour

Manifests should not be garbage collected.

Screenshots

Environment (please complete the following information):

Other information and links

Option 1 - Blessed column

Have a separate column for Manifest storage. Plug it into the get that does DbColumn::GraphDagCborBlake2b256 | DbColumn::GraphFull already. It's an extra read, but only when there is a miss on the above columns.

Pros

  1. Easy to implement
  2. Clear separation of concerns in terms of storage.

    Cons - negligible

  3. A little overhead to store manifests in the db.
  4. Extra read on storage miss when querying the data store, because we need to make sure that the API does not change.

Option 2 - Whitelist

Whitelist certain CIDs to prevent them from being GCed.

Pros

  1. No need to add an extra column.

Cons

  1. No separation of concerns.
  2. The code gets more convoluted.
  3. Extra checks for each marked record - performance degradation, though not that significant as it's just a simple "if" condition.

Option 3 - On-demand download of manifests before network upgrades

Download bundles as needed X epochs before the upgrade.

Pros

  1. No permanent extra storage costs.

Cons

  1. Lots of moving parts, more convoluted codebase.
  2. Potential problems with downloads/retries.
  3. Hard to test.
ruseinov commented 1 month ago

Is there an option to store manifests in a separate namespace so they don't get garbage collected ever?

lemmih commented 1 month ago

Yes, like our settings, the manifests should be stored in a separate column.

LesnyRumcajs commented 1 month ago

Decide on the option by implementor:

  1. whitelist items to not get GCd (from the manifest file)
  2. Additional blessed column (watch out for reads)
  3. download the manifest just-in-time
ruseinov commented 3 weeks ago

Going with blessed column as per team decision.