Open aarshkshah1992 opened 2 months ago
Some notes from 2024-10-09 Lotus standup focused on the "~9 days to backfill a FEVM-archival node" topic:
@BigLep
🧵 From Slack conversations: "It will take 9-10 days to backfill the ChainIndexer all the way back to FEVM, but it is a one-time cost, and you can copy the index over to other nodes, so you only need to run the backfill operation on one node."
A few questions/thoughts on this:
1) What users do we need to proactively talk about this with? I know Glif is aware. Who else should we bring into this conversation?
2) Do these users have multiple nodes so they can do a rolling upgrade?
3) Related to number 2, is this a showstopper for any of these archival users?
4) If this is a showstopper, what are our options?
   - Bootstrap chainindex.db from the 3 existing sqlite dbs (and then quickly identify areas that are missing data and backfill them)?
   - Have someone in the community generate chainindex.db and share with others (and include the accompanying verify commands)?
   - ???
1) The long backfilling time is primarily a concern for archival nodes, not snapshot-synced nodes. Protofire (Glif), Vulcanise and Blockscout are the three archival node operators I am aware of. Would love @eshon and @jennijuju to chime in if there are more. We've already proactively initiated conversations with Protofire/Glif about what's coming up. Ideally, we would deploy the ChainIndexer on their archival node first to serve a portion of their traffic and, once we get a green light from them, onboard other RPC providers.
2) Do these users have multiple nodes so they can do a rolling upgrade? I know Protofire and Vulcanise do. I am unsure about the others.
3) If a user is only running one archival node, here are the options:
Get a copy of the new Index from other RPC providers/community where there is a trusted relationship.
I would like to strongly push back against the idea of using the old Index (which suffers from multiple problems which prompted this workstream in the first place) to build the new one.
There is a non-trivial amount of engineering effort involved in doing this right, and I am fairly confident that if we go down this path we will end up spending engineering cycles down the road debugging correctness problems with the ChainIndexer that are really caused by the missing/inconsistent data in the old Index.
Also replied at: https://github.com/filecoin-project/lotus/pull/12450#discussion_r1794946530.
Another archival node provider is Zondax, let me share details later today with Jenni.
When you say "backfilling", do you specifically mean that backfilling only the FEVM indexes would take 9 days?
Does this assume the node has already loaded all FEVM archival data since FEVM launch and is fully synced?
@eshon Yes, this assumes that the node has already loaded all FEVM archival data since FEVM launch and is fully synced. "Backfilling" here refers to reading the chain state and indexing the data that we need for faster RPC responses in the Index Database.
Results from testing on a dedicated Protofire FEVM Archival node. This node is doing nothing other than syncing the chain.
1) Backfilling 1 month of epochs backwards from the current chain head. Takes ~12 hours.
2024-10-08 18:06:43.525 starting chainindex validation; from epoch: 4336809; to epoch: 4250409; backfill: true; log-good: false
2024-10-08 18:15:49.508 -------- Chain index validation progress: 3.33%; Time elapsed: 9m5.98274048s
2024-10-08 18:27:27.114 -------- Chain index validation progress: 6.67%; Time elapsed: 20m43.58922645s
2024-10-08 18:42:42.489 -------- Chain index validation progress: 10.00%; Time elapsed: 35m58.963728548s
2024-10-08 19:01:34.272 -------- Chain index validation progress: 13.33%; Time elapsed: 54m50.747261985s
2024-10-08 19:27:53.144 -------- Chain index validation progress: 16.67%; Time elapsed: 1h21m9.618754411s
2024-10-08 20:06:49.629 -------- Chain index validation progress: 20.00%; Time elapsed: 2h0m6.103717312s
2024-10-08 21:10:58.370 -------- Chain index validation progress: 23.33%; Time elapsed: 3h4m14.844417783s
2024-10-08 22:17:20.862 -------- Chain index validation progress: 26.67%; Time elapsed: 4h10m37.337324591s
2024-10-08 23:26:31.600 -------- Chain index validation progress: 30.00%; Time elapsed: 5h19m48.07516203s
2024-10-09 00:31:51.979 -------- Chain index validation progress: 33.33%; Time elapsed: 6h25m8.453541436s
2024-10-09 01:58:04.654 -------- Chain index validation progress: 36.67%; Time elapsed: 7h51m21.128442883s
2024-10-09 03:06:59.404 -------- Chain index validation progress: 40.00%; Time elapsed: 9h0m15.878883989s
2024-10-09 03:19:06.227 -------- Chain index validation progress: 43.33%; Time elapsed: 9h12m22.702241843s
2024-10-09 03:29:00.946 -------- Chain index validation progress: 46.67%; Time elapsed: 9h22m17.420597166s
2024-10-09 03:38:47.714 -------- Chain index validation progress: 50.00%; Time elapsed: 9h32m4.189265746s
2024-10-09 03:48:33.692 -------- Chain index validation progress: 53.33%; Time elapsed: 9h41m50.167261601s
2024-10-09 03:58:44.708 -------- Chain index validation progress: 56.67%; Time elapsed: 9h52m1.183098448s
2024-10-09 04:09:45.871 -------- Chain index validation progress: 60.00%; Time elapsed: 10h3m2.346345951s
2024-10-09 04:21:08.180 -------- Chain index validation progress: 63.33%; Time elapsed: 10h14m24.654708182s
2024-10-09 04:32:44.268 -------- Chain index validation progress: 66.67%; Time elapsed: 10h26m0.742834532s
2024-10-09 04:43:09.888 -------- Chain index validation progress: 70.00%; Time elapsed: 10h36m26.36274386s
2024-10-09 04:51:30.369 -------- Chain index validation progress: 73.33%; Time elapsed: 10h44m46.843732873s
2024-10-09 05:02:44.664 -------- Chain index validation progress: 76.67%; Time elapsed: 10h56m1.138670683s
2024-10-09 05:14:33.169 -------- Chain index validation progress: 80.00%; Time elapsed: 11h7m49.644118179s
2024-10-09 05:26:52.491 -------- Chain index validation progress: 83.33%; Time elapsed: 11h20m8.965545335s
2024-10-09 05:39:28.663 -------- Chain index validation progress: 86.67%; Time elapsed: 11h32m45.138303295s
2024-10-09 05:51:50.451 -------- Chain index validation progress: 90.00%; Time elapsed: 11h45m6.925924816s
2024-10-09 06:03:02.344 -------- Chain index validation progress: 93.33%; Time elapsed: 11h56m18.819100394s
2024-10-09 06:15:01.300 -------- Chain index validation progress: 96.67%; Time elapsed: 12h8m17.774766296s
2024-10-09 06:26:34.288 -------- Chain index validation progress: 100.00%; Time elapsed: 12h19m50.762641635s
2024-10-09 06:26:34.305 -------- Chain index validation progress: 100.00%; Time elapsed: 12h19m50.779804039s
2) Backfilling 1 month of epochs post FEVM launch. Takes ~10 hours.
2024-10-09 06:34:36.198 starting chainindex validation; from epoch: 2769848; to epoch: 2683448; backfill: true; log-good: false
2024-10-09 06:54:32.847 -------- Chain index validation progress: 3.33%; Time elapsed: 19m56.648777171s
2024-10-09 07:13:29.590 -------- Chain index validation progress: 6.67%; Time elapsed: 38m53.391578991s
2024-10-09 07:31:37.937 -------- Chain index validation progress: 10.00%; Time elapsed: 57m1.738433863s
2024-10-09 07:53:53.763 -------- Chain index validation progress: 13.33%; Time elapsed: 1h19m17.564622641s
2024-10-09 08:17:20.598 -------- Chain index validation progress: 16.67%; Time elapsed: 1h42m44.400170981s
2024-10-09 08:38:23.602 -------- Chain index validation progress: 20.00%; Time elapsed: 2h3m47.403992297s
2024-10-09 08:59:40.515 -------- Chain index validation progress: 23.33%; Time elapsed: 2h25m4.31638391s
2024-10-09 09:22:41.837 -------- Chain index validation progress: 26.67%; Time elapsed: 2h48m5.638957169s
2024-10-09 09:46:41.586 -------- Chain index validation progress: 30.00%; Time elapsed: 3h12m5.387221278s
2024-10-09 10:09:15.496 -------- Chain index validation progress: 33.33%; Time elapsed: 3h34m39.29731905s
2024-10-09 10:30:27.827 -------- Chain index validation progress: 36.67%; Time elapsed: 3h55m51.628606445s
2024-10-09 10:51:02.016 -------- Chain index validation progress: 40.00%; Time elapsed: 4h16m25.817962431s
2024-10-09 11:13:19.400 -------- Chain index validation progress: 43.33%; Time elapsed: 4h38m43.201847276s
2024-10-09 11:35:17.255 -------- Chain index validation progress: 46.67%; Time elapsed: 5h0m41.0564808s
2024-10-09 11:58:17.438 -------- Chain index validation progress: 50.00%; Time elapsed: 5h23m41.240064043s
2024-10-09 12:19:09.401 -------- Chain index validation progress: 53.33%; Time elapsed: 5h44m33.202230962s
2024-10-09 12:39:43.318 -------- Chain index validation progress: 56.67%; Time elapsed: 6h5m7.120162996s
2024-10-09 13:00:36.205 -------- Chain index validation progress: 60.00%; Time elapsed: 6h26m0.007156519s
2024-10-09 13:22:07.533 -------- Chain index validation progress: 63.33%; Time elapsed: 6h47m31.334230385s
2024-10-09 13:42:22.805 -------- Chain index validation progress: 66.67%; Time elapsed: 7h7m46.606813157s
2024-10-09 14:02:50.702 -------- Chain index validation progress: 70.00%; Time elapsed: 7h28m14.503955704s
2024-10-09 14:23:17.452 -------- Chain index validation progress: 73.33%; Time elapsed: 7h48m41.253678763s
2024-10-09 14:42:55.491 -------- Chain index validation progress: 76.67%; Time elapsed: 8h8m19.292820409s
2024-10-09 15:05:11.490 -------- Chain index validation progress: 80.00%; Time elapsed: 8h30m35.292191527s
2024-10-09 15:27:14.396 -------- Chain index validation progress: 83.33%; Time elapsed: 8h52m38.197724796s
2024-10-09 15:49:58.772 -------- Chain index validation progress: 86.67%; Time elapsed: 9h15m22.573845885s
2024-10-09 16:12:19.897 -------- Chain index validation progress: 90.00%; Time elapsed: 9h37m43.698457415s
2024-10-09 16:33:45.127 -------- Chain index validation progress: 93.33%; Time elapsed: 9h59m8.929105029s
2024-10-09 16:56:38.008 -------- Chain index validation progress: 96.67%; Time elapsed: 10h22m1.809325232s
2024-10-09 17:19:30.228 -------- Chain index validation progress: 100.00%; Time elapsed: 10h44m54.030102146s
2024-10-09 17:19:30.354 -------- Chain index validation progress: 100.00%; Time elapsed: 10h44m54.155308084s
3) Backfilling 1 month of epochs mid-way between FEVM launch and the current chain head. Takes ~13 hours.
2024-10-09 18:06:50.812 starting chainindex validation; from epoch: 3511567; to epoch: 3425167; backfill: true; log-good: false
2024-10-09 18:22:00.482 -------- Chain index validation progress: 3.33%; Time elapsed: 15m9.670482824s
2024-10-09 18:35:25.365 -------- Chain index validation progress: 6.67%; Time elapsed: 28m34.553606048s
2024-10-09 18:48:19.165 -------- Chain index validation progress: 10.00%; Time elapsed: 41m28.353796507s
2024-10-09 19:01:29.618 -------- Chain index validation progress: 13.33%; Time elapsed: 54m38.806024773s
2024-10-09 19:15:12.071 -------- Chain index validation progress: 16.67%; Time elapsed: 1h8m21.259877238s
2024-10-09 19:30:44.968 -------- Chain index validation progress: 20.00%; Time elapsed: 1h23m54.15652168s
2024-10-09 19:50:59.944 -------- Chain index validation progress: 23.33%; Time elapsed: 1h44m9.132300745s
2024-10-09 20:19:22.942 -------- Chain index validation progress: 26.67%; Time elapsed: 2h12m32.130043369s
2024-10-09 20:52:27.399 -------- Chain index validation progress: 30.00%; Time elapsed: 2h45m36.587897912s
2024-10-09 21:20:40.064 -------- Chain index validation progress: 33.33%; Time elapsed: 3h13m49.25204028s
2024-10-09 21:50:49.975 -------- Chain index validation progress: 36.67%; Time elapsed: 3h43m59.162984189s
2024-10-09 22:18:22.220 -------- Chain index validation progress: 40.00%; Time elapsed: 4h11m31.408482377s
2024-10-09 22:45:28.032 -------- Chain index validation progress: 43.33%; Time elapsed: 4h38m37.22010544s
2024-10-09 23:12:16.162 -------- Chain index validation progress: 46.67%; Time elapsed: 5h5m25.350077042s
2024-10-09 23:39:37.234 -------- Chain index validation progress: 50.00%; Time elapsed: 5h32m46.422173688s
2024-10-10 00:10:51.416 -------- Chain index validation progress: 53.33%; Time elapsed: 6h4m0.604601922s
2024-10-10 00:46:44.348 -------- Chain index validation progress: 56.67%; Time elapsed: 6h39m53.536003528s
2024-10-10 01:31:14.595 -------- Chain index validation progress: 60.00%; Time elapsed: 7h24m23.783330796s
2024-10-10 04:05:18.792 -------- Chain index validation progress: 63.33%; Time elapsed: 9h58m27.980538058s
2024-10-10 04:25:13.568 -------- Chain index validation progress: 66.67%; Time elapsed: 10h18m22.756382023s
2024-10-10 04:45:33.326 -------- Chain index validation progress: 70.00%; Time elapsed: 10h38m42.514054977s
2024-10-10 05:05:31.425 -------- Chain index validation progress: 73.33%; Time elapsed: 10h58m40.613381271s
2024-10-10 05:25:52.663 -------- Chain index validation progress: 76.67%; Time elapsed: 11h19m1.850925885s
2024-10-10 05:45:20.150 -------- Chain index validation progress: 80.00%; Time elapsed: 11h38m29.338635039s
2024-10-10 06:05:00.198 -------- Chain index validation progress: 83.33%; Time elapsed: 11h58m9.386637783s
2024-10-10 06:24:40.726 -------- Chain index validation progress: 86.67%; Time elapsed: 12h17m49.91455223s
2024-10-10 06:43:07.535 -------- Chain index validation progress: 90.00%; Time elapsed: 12h36m16.723633668s
2024-10-10 07:00:46.843 -------- Chain index validation progress: 93.33%; Time elapsed: 12h53m56.031377256s
2024-10-10 07:20:33.779 -------- Chain index validation progress: 96.67%; Time elapsed: 13h13m42.967073078s
2024-10-10 07:38:29.397 -------- Chain index validation progress: 100.00%; Time elapsed: 13h31m38.585839486s
2024-10-10 07:38:29.963 -------- Chain index validation progress: 100.00%; Time elapsed: 13h31m39.151568536s
I am now running the index "doctor"/validation on these to sanity check that the backfilled data is in line with the chain state.
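As a rough cross-check of the "9-10 days" figure, the three sample runs above can be extrapolated linearly over the whole range. This is only a sketch: it assumes the per-epoch cost averaged over the three samples holds for the full history, and it takes the lowest and highest epochs that appear in the logs as the bounds of the range to backfill. The function and constant names are illustrative, not part of Lotus.

```python
from datetime import timedelta

# Each test run above spans 86,400 epochs (30 days at 30 s per epoch).
EPOCHS_PER_SAMPLE = 86_400

# Wall-clock duration of the three runs, copied from the final log lines above.
SAMPLES = {
    "near chain head": timedelta(hours=12, minutes=19, seconds=50),
    "just after FEVM launch": timedelta(hours=10, minutes=44, seconds=54),
    "mid-way": timedelta(hours=13, minutes=31, seconds=38),
}

# Assumption: the lowest and highest epochs seen in the logs bound the backfill range.
FEVM_START_EPOCH = 2_683_448
HEAD_EPOCH = 4_336_809


def estimate_full_backfill(samples, start_epoch, head_epoch,
                           epochs_per_sample=EPOCHS_PER_SAMPLE):
    """Extrapolate total backfill time from the mean per-epoch cost of the samples."""
    total_sample_seconds = sum(d.total_seconds() for d in samples.values())
    seconds_per_epoch = total_sample_seconds / (len(samples) * epochs_per_sample)
    return timedelta(seconds=seconds_per_epoch * (head_epoch - start_epoch))


if __name__ == "__main__":
    estimate = estimate_full_backfill(SAMPLES, FEVM_START_EPOCH, HEAD_EPOCH)
    print(f"estimated full backfill: {estimate.total_seconds() / 86_400:.1f} days")
```

With these inputs the estimate lands between 9.5 and 10 days, which is in line with the figure quoted from Slack. The stall visible in the third run (60% to 63.33%) is included in the average, so the real number could be somewhat lower on a quiet node.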
@aarshkshah1992 : can we get final numbers on chainindex.db size for the full archival node? I know there were some numbers here, but I'm not sure how many tipsets that is and I'd also like to get a larger time range. I want to be able to make a statement like "As of 202410, ChainIndexer will accumulate approximately XMiB per day of data, or XGiB per month" in https://github.com/filecoin-project/lotus/pull/12600
I'm seeing our docs already had a statement that "The ChainIndex will consume ~10GB of storage per month of tipsets (e.g., ~86400 epochs)". I guess that's all I need but it would be good to have an official record of it in here like you have with backfill times in https://github.com/filecoin-project/lotus/issues/12453#issuecomment-2405306468
Would love @eshon and @jennijuju to chime in if there are more.
Talked with Eva; the summary (in Notion) has been shared with the team.
@BigLep We have yet to index the entire history all the way up to FEVM launch. We were waiting on the reviews to land/get addressed so we can be sure that we're using the same indexing code as users.
Looks like the PR will be ready tomorrow (all reviews will have been addressed) -> will then kick off an indexing of the entire state and also get all the numbers you need here.
@BigLep
The ChainIndex will consume ~10GB of storage per month of tipsets (e.g., ~86400 epochs)
That does not sound correct. Where did you get it from? Can we please wait on the next round of archival node testing to get the final numbers? I'll make sure to document them here once we have them.
@aarshkshah1992
The ChainIndex will consume ~10GB of storage per month of tipsets (e.g., ~86400 epochs)
That does not sound correct. Where did you get it from? Can we please wait on the next round of archival node testing to get the final numbers? I'll make sure to document them here once we have them.
Ack, good to know. I can't recall / find where I got these numbers from. I was surprised to see them, so maybe I put them in as fillers. I don't remember. Anyways, I will put X placeholders for now and we'll update once official results have been published here.
@BigLep
Please see https://filecoinproject.slack.com/archives/CP50PPW2X/p1729413621133599.
~10G growth in the Index DB size per month is actually correct.
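For the "X MiB per day, X GiB per month" statement requested earlier in the thread, the monthly figure can be converted with simple arithmetic. This is a sketch under two assumptions: growth is linear in the number of epochs, and "~10G" means roughly 10 GiB (the thread does not say whether GB or GiB is meant). The helper name is illustrative.

```python
EPOCH_SECONDS = 30                                  # Filecoin block time
EPOCHS_PER_DAY = 24 * 60 * 60 // EPOCH_SECONDS      # 2,880 epochs
EPOCHS_PER_MONTH = 86_400                           # the 30-day "month" used in this thread
GIB_PER_MONTH = 10.0                                # "~10G per month"; GiB is an assumption


def index_growth_gib(epochs: int, gib_per_month: float = GIB_PER_MONTH) -> float:
    """Linear estimate of chainindex.db growth over the given number of epochs."""
    return gib_per_month * epochs / EPOCHS_PER_MONTH


if __name__ == "__main__":
    per_day_mib = index_growth_gib(EPOCHS_PER_DAY) * 1024
    print(f"~{per_day_mib:.0f} MiB per day")
    # Epoch range taken from the backfill logs earlier in this thread.
    since_fevm_gib = index_growth_gib(4_336_809 - 2_683_448)
    print(f"~{since_fevm_gib:.0f} GiB for the range covered by the backfill tests")
```

Under these assumptions the daily figure is roughly 341 MiB. The second number is only an extrapolation and should be replaced by a measured size once the full archival backfill has finished.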
The ChainIndexer PR is now merged. Keeping this issue open till RPC providers upgrade and finish backfilling the Index.
Summary
This issue is for the implementation of a new ChainIndexer in Lotus that will replace and subsume the existing MsgIndex, EventsIndex, and EthTxHashIndex, which are currently fragmented across multiple databases and have several known issues documented in filecoin-project/lotus#12293.
Key Features
The ChainIndexer offers the following key features:
Note: while the ChainIndexer is primarily focused on events and ETH RPC use cases, it also benefits pre-FEVM use cases. For example, StateSearchMsg and its various dependents will now have a shortcut to find the message.
Implementation Items
Switch RPC APIs to use the Chain Index
- [...] ChainIndexer instead of the MsgIndex, EthTxHashIndex and EventsIndex.
- EventFilterManager will read events from the ChainIndexer and prefill all registered filters rather than depending on the Indexer to do the pre-filling of filters.
- ChainIndexer will listen to Mpool message addition updates to index the corresponding ETH Tx Hash. The EthTxHashManager will no longer be used for this.
Read APIs Should Account for the Async Nature of Indexing
- [...] T only indexes events in T-1 because of deferred execution.
ETH RPC APIs Should Only Expose Executed Tipsets and Messages
- [...] T are executed in tipset T + 1.
- [...] T are also executed in tipset T.
Removing Re-orged Tipsets That Are No Longer Part of the Canonical Chain
ChainIndexer will periodically prune all permanently re-orged/reverted tipsets from the index. It can do this by simply pruning all tipsets at a height less than (current head - finality policy - some buffer).
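The pruning rule above can be sketched against a toy SQLite table. Everything here is an assumption for illustration: the `tipset` table, its columns, and the buffer of 100 epochs are hypothetical and are not the actual ChainIndexer schema or configuration. The finality value of 900 epochs is Filecoin's chain finality.

```python
import sqlite3

CHAIN_FINALITY = 900   # Filecoin chain finality, in epochs
BUFFER = 100           # hypothetical safety margin; the issue only says "some buffer"


def prune_reverted(conn: sqlite3.Connection, head_height: int) -> int:
    """Delete reverted tipsets below (head - finality - buffer); return rows removed."""
    cutoff = head_height - CHAIN_FINALITY - BUFFER
    cur = conn.execute(
        "DELETE FROM tipset WHERE reverted = 1 AND height < ?", (cutoff,)
    )
    conn.commit()
    return cur.rowcount


def demo() -> list:
    """Build a tiny in-memory index, prune it, and return the surviving tipset ids."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one row per indexed tipset.
    conn.execute("CREATE TABLE tipset (height INTEGER, cid TEXT, reverted INTEGER)")
    conn.executemany(
        "INSERT INTO tipset VALUES (?, ?, ?)",
        [
            (1000, "a", 0),  # old and canonical: kept
            (1000, "b", 1),  # old and reverted: pruned
            (2950, "c", 1),  # reverted, but still within finality + buffer: kept for now
            (3000, "d", 0),  # current head
        ],
    )
    prune_reverted(conn, head_height=3000)
    return conn.execute("SELECT cid FROM tipset ORDER BY height, cid").fetchall()


if __name__ == "__main__":
    print(demo())
```

Only the reverted row below the cutoff is removed. A reverted tipset close to the head is left alone because a later re-org could still make it canonical again.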
Garbage Collection
- ChainIndexer can perform periodic GC based on this configuration.
- [...] ChainIndexer because of the use of FOREIGN KEY ... ON DELETE CASCADE, as described in SQLite Foreign Keys.
Snapshot Hydration
Automated Backfilling
- [...] ChainIndex for which the corresponding state exists in the statestore.
- [...] Observer with that tipset as the current head.
- [...] Observer.
- ChainIndexer will observe the (Apply, Revert) path between its last non-reverted indexed tipset and the current heaviest tipset in the chainstore before processing real-time updates, effectively performing automated backfilling.
Simplify Indexing Config
Migration from Old Indices to the New ChainIndex
- [...] lotus-shed utility that allows users to migrate existing indices to the new ChainIndexer database. This command should only be executed when the Lotus node is offline to ensure data consistency and avoid potential conflicts.
- [...] ChainIndexer. This approach offers several benefits: