bnb-chain / reth

BSC and opBNB client based on the Reth fork
Apache License 2.0

`MerkleExecute` is way too slow compared to erigon #174

Closed wtdcode closed 1 month ago

wtdcode commented 1 month ago

Describe the bug

I believe the implementation of MerkleExecute has some issues. I started both bsc-erigon (e2) and bsc-reth, with bsc-erigon lagging more than 40k blocks and bsc-reth lagging 20k blocks. After 2-3 days, bsc-erigon had always synced to the latest block, while bsc-reth was still stuck at MerkleExecute.

Inspecting the logs, bsc-reth gets through roughly 20k blocks of MerkleExecute per 12 hours on my machine (roughly 1 block / 2 seconds), which is not much faster than the current BSC mainnet block rate (roughly 1 block / 3 seconds). This is also far slower than the Execution stage of bsc-reth, and it leaves the bsc-reth node almost always lagging. Meanwhile, CPU (less than 10%), memory (less than 16G) and disk utilization (less than 3000 IOPS) are all rather low. Note the disks are exactly the same (but on different arrays to avoid competition), while I allocated 8x more CPU/memory resources (32C/256G) to bsc-reth compared to bsc-erigon (4C32G).
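
As a back-of-the-envelope check using only the figures reported above (my own arithmetic, not a measurement of reth internals):

    fn main() {
        // Reported MerkleExecute throughput: ~20k blocks per 12 hours.
        let merkle_rate = 20_000.0 / (12.0 * 3600.0); // ~0.46 blocks/s
        // BSC mainnet produces roughly one block every 3 seconds.
        let chain_rate = 1.0 / 3.0; // ~0.33 blocks/s

        // Net catch-up rate while MerkleExecute is the bottleneck.
        let net = merkle_rate - chain_rate; // ~0.13 blocks/s
        let days_to_close_20k_gap = 20_000.0 / net / 86_400.0; // ~1.8 days
        // Even ignoring every other stage, closing a 20k-block gap takes
        // almost two days, which matches the "almost always lagging" behaviour.
        println!("net: {net:.3} blocks/s, days to close 20k gap: {days_to_close_20k_gap:.1}");
    }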

Enabling debug logs suggests that during MerkleExecute, bsc-reth tries to fetch headers from the consensus engine instead of the DB. I guess this is the root cause of the slowdown, but I need confirmation from the developers. I have no idea why calculating the merkle tree needs information from parlia. I can also help draft a PR if I can understand the reason here.

Steps to reproduce

Download the snapshots and spin up a mainnet bsc-reth node.

Node logs

reth-1  | 2024-10-28T07:58:01.618747Z DEBUG consensus::engine: Pipeline sync progress is inconsistent first_stage_checkpoint=43502607 inconsistent_stage_id=PruneSenderRecovery inconsistent_stage_checkpoint=42993696
reth-1  | 2024-10-28T07:58:01.618844Z  INFO reth::cli: Starting consensus engine
reth-1  | 2024-10-28T07:58:01.618905Z DEBUG reth::cli: start backfill sync initial_target=0xbab7842730a97b20f9f2a28b17cae5b6bcc799e66dd50e40a677e54564326a57
reth-1  | 2024-10-28T07:58:01.619085Z DEBUG reth::cli: Event: BackfillSyncStarted
reth-1  | 2024-10-28T07:58:01.619125Z DEBUG engine::tree: received new engine message msg=Event(BackfillSyncStarted)
reth-1  | 2024-10-28T07:58:01.619170Z DEBUG engine::tree: received backfill sync started event
reth-1  | 2024-10-28T07:58:01.619870Z DEBUG ChainOrchestrator::poll: pruner: Pruner started tip_block_number=43502607
reth-1  | 2024-10-28T07:58:01.619931Z DEBUG ChainOrchestrator::poll: pruner: Segment pruning started segment=Headers purpose=StaticFile to_block=43502607 prune_mode=Before(43502608)
reth-1  | 2024-10-28T07:58:01.619967Z DEBUG ChainOrchestrator::poll: pruner: Segment pruning finished segment=Headers purpose=StaticFile to_block=43502607 prune_mode=Before(43502608) segment_output.pruned=0
reth-1  | 2024-10-28T07:58:01.619988Z DEBUG ChainOrchestrator::poll: pruner: Segment pruning started segment=Transactions purpose=StaticFile to_block=43502607 prune_mode=Before(43502608)
reth-1  | 2024-10-28T07:58:01.620001Z DEBUG ChainOrchestrator::poll: pruner: Segment pruning finished segment=Transactions purpose=StaticFile to_block=43502607 prune_mode=Before(43502608) segment_output.pruned=0
reth-1  | 2024-10-28T07:58:01.620014Z DEBUG ChainOrchestrator::poll: pruner: Segment pruning started segment=Receipts purpose=StaticFile to_block=43502607 prune_mode=Before(43502608)
reth-1  | 2024-10-28T07:58:01.620024Z DEBUG ChainOrchestrator::poll: pruner: Segment pruning finished segment=Receipts purpose=StaticFile to_block=43502607 prune_mode=Before(43502608) segment_output.pruned=0
reth-1  | 2024-10-28T07:58:01.620033Z DEBUG ChainOrchestrator::poll: pruner: Segment pruning started segment=Sidecars purpose=StaticFile to_block=43502607 prune_mode=Before(43502608)
reth-1  | 2024-10-28T07:58:01.620049Z DEBUG ChainOrchestrator::poll: pruner: Segment pruning finished segment=Sidecars purpose=StaticFile to_block=43502607 prune_mode=Before(43502608) segment_output.pruned=0
reth-1  | 2024-10-28T07:58:01.620069Z DEBUG ChainOrchestrator::poll: pruner: Pruner interrupted and has more data to prune tip_block_number=43502607 elapsed=137.922µs deleted_entries=0 limiter=PruneLimiter { deleted_entries_limit: Some(PruneDeletedEntriesLimit { limit: 18446744073709551615, deleted: 0 }), time_limit: None } output=PrunerOutput { progress: HasMoreData(Unknown), segments: [(Headers, SegmentOutput { progress: HasMoreData(Unknown), pruned: 0, checkpoint: Some(SegmentOutputCheckpoint { block_number: None, tx_number: None }) }), (Transactions, SegmentOutput { progress: Finished, pruned: 0, checkpoint: None }), (Receipts, SegmentOutput { progress: Finished, pruned: 0, checkpoint: None }), (Sidecars, SegmentOutput { progress: HasMoreData(Unknown), pruned: 0, checkpoint: Some(SegmentOutputCheckpoint { block_number: None, tx_number: None }) })] } stats=[]
reth-1  | 2024-10-28T07:58:01.620145Z DEBUG ChainOrchestrator::poll: storage::db::mdbx: Commit total_duration=39.81µs commit_latency=Some(CommitLatency(MDBX_commit_latency { preparation: 0, gc_wallclock: 0, audit: 0, write: 0, sync: 0, ending: 0, whole: 1, gc_cputime: 0, gc_prof: MDBX_commit_latency__bindgen_ty_1 { wloops: 0, coalescences: 0, wipes: 0, flushes: 0, kicks: 0, work_counter: 0, work_rtime_monotonic: 0, work_xtime_cpu: 0, work_rsteps: 0, work_xpages: 0, work_majflt: 0, self_counter: 0, self_rtime_monotonic: 0, self_xtime_cpu: 0, self_rsteps: 0, self_xpages: 0, self_majflt: 0 } })) is_read_only=false
reth-1  | 2024-10-28T07:58:01.620232Z  INFO ChainOrchestrator::poll: sync::stages::headers: Target block already reached checkpoint=43502607 target=Hash(0xbab7842730a97b20f9f2a28b17cae5b6bcc799e66dd50e40a677e54564326a57)
reth-1  | 2024-10-28T07:58:01.620244Z  INFO reth_node_events::node: Preparing stage pipeline_stages=1/12 stage=Headers checkpoint=43502607 target=None
reth-1  | 2024-10-28T07:58:01.620312Z  INFO reth_node_events::node: Executing stage pipeline_stages=1/12 stage=Headers checkpoint=43502607 target=None
reth-1  | 2024-10-28T07:58:01.620331Z  INFO reth_node_events::node: Finished stage pipeline_stages=1/12 stage=Headers checkpoint=43502607 target=None stage_progress=100.00%
reth-1  | 2024-10-28T07:58:01.620337Z DEBUG ChainOrchestrator::poll: storage::db::mdbx: Commit total_duration=33.192µs commit_latency=Some(CommitLatency(MDBX_commit_latency { preparation: 0, gc_wallclock: 0, audit: 0, write: 0, sync: 0, ending: 0, whole: 1, gc_cputime: 0, gc_prof: MDBX_commit_latency__bindgen_ty_1 { wloops: 0, coalescences: 0, wipes: 0, flushes: 0, kicks: 0, work_counter: 0, work_rtime_monotonic: 0, work_xtime_cpu: 0, work_rsteps: 0, work_xpages: 0, work_majflt: 0, self_counter: 0, self_rtime_monotonic: 0, self_xtime_cpu: 0, self_rsteps: 0, self_xpages: 0, self_majflt: 0 } })) is_read_only=false
reth-1  | 2024-10-28T07:58:01.620399Z  INFO reth_node_events::node: Preparing stage pipeline_stages=2/12 stage=Bodies checkpoint=43502607 target=43502607
reth-1  | 2024-10-28T07:58:01.620424Z  INFO reth_node_events::node: Executing stage pipeline_stages=2/12 stage=Bodies checkpoint=43502607 target=43502607
reth-1  | 2024-10-28T07:58:01.620434Z  INFO reth_node_events::node: Finished stage pipeline_stages=2/12 stage=Bodies checkpoint=43502607 target=43502607 stage_progress=100.00%
reth-1  | 2024-10-28T07:58:01.620444Z DEBUG ChainOrchestrator::poll: storage::db::mdbx: Commit total_duration=36.029µs commit_latency=Some(CommitLatency(MDBX_commit_latency { preparation: 0, gc_wallclock: 0, audit: 0, write: 0, sync: 0, ending: 0, whole: 1, gc_cputime: 0, gc_prof: MDBX_commit_latency__bindgen_ty_1 { wloops: 0, coalescences: 0, wipes: 0, flushes: 0, kicks: 0, work_counter: 0, work_rtime_monotonic: 0, work_xtime_cpu: 0, work_rsteps: 0, work_xpages: 0, work_majflt: 0, self_counter: 0, self_rtime_monotonic: 0, self_xtime_cpu: 0, self_rsteps: 0, self_xpages: 0, self_majflt: 0 } })) is_read_only=false
reth-1  | 2024-10-28T07:58:01.620491Z  INFO reth_node_events::node: Preparing stage pipeline_stages=3/12 stage=SenderRecovery checkpoint=43502607 target=43502607
reth-1  | 2024-10-28T07:58:01.620526Z  INFO reth_node_events::node: Executing stage pipeline_stages=3/12 stage=SenderRecovery checkpoint=43502607 target=43502607
reth-1  | 2024-10-28T07:58:01.620541Z  INFO reth_node_events::node: Finished stage pipeline_stages=3/12 stage=SenderRecovery checkpoint=43502607 target=43502607 stage_progress=100.00%
reth-1  | 2024-10-28T07:58:01.620552Z DEBUG ChainOrchestrator::poll: storage::db::mdbx: Commit total_duration=32.26µs commit_latency=Some(CommitLatency(MDBX_commit_latency { preparation: 0, gc_wallclock: 0, audit: 0, write: 0, sync: 0, ending: 0, whole: 1, gc_cputime: 0, gc_prof: MDBX_commit_latency__bindgen_ty_1 { wloops: 0, coalescences: 0, wipes: 0, flushes: 0, kicks: 0, work_counter: 0, work_rtime_monotonic: 0, work_xtime_cpu: 0, work_rsteps: 0, work_xpages: 0, work_majflt: 0, self_counter: 0, self_rtime_monotonic: 0, self_xtime_cpu: 0, self_rsteps: 0, self_xpages: 0, self_majflt: 0 } })) is_read_only=false
reth-1  | 2024-10-28T07:58:01.620615Z  INFO reth_node_events::node: Preparing stage pipeline_stages=4/12 stage=Execution checkpoint=43502607 target=43502607
reth-1  | 2024-10-28T07:58:01.620632Z  INFO reth_node_events::node: Executing stage pipeline_stages=4/12 stage=Execution checkpoint=43502607 target=43502607
reth-1  | 2024-10-28T07:58:01.620641Z  INFO reth_node_events::node: Finished stage pipeline_stages=4/12 stage=Execution checkpoint=43502607 target=43502607 stage_progress=99.99%
reth-1  | 2024-10-28T07:58:01.620675Z DEBUG ChainOrchestrator::poll: storage::db::mdbx: Commit total_duration=35.411µs commit_latency=Some(CommitLatency(MDBX_commit_latency { preparation: 0, gc_wallclock: 0, audit: 0, write: 0, sync: 0, ending: 0, whole: 1, gc_cputime: 0, gc_prof: MDBX_commit_latency__bindgen_ty_1 { wloops: 0, coalescences: 0, wipes: 0, flushes: 0, kicks: 0, work_counter: 0, work_rtime_monotonic: 0, work_xtime_cpu: 0, work_rsteps: 0, work_xpages: 0, work_majflt: 0, self_counter: 0, self_rtime_monotonic: 0, self_xtime_cpu: 0, self_rsteps: 0, self_xpages: 0, self_majflt: 0 } })) is_read_only=false
reth-1  | 2024-10-28T07:58:01.620740Z  INFO reth_node_events::node: Preparing stage pipeline_stages=5/12 stage=MerkleUnwind checkpoint=43502607 target=43502607
reth-1  | 2024-10-28T07:58:01.620766Z  INFO ChainOrchestrator::poll: sync::stages::merkle::unwind: Stage is always skipped
reth-1  | 2024-10-28T07:58:01.620774Z  INFO reth_node_events::node: Executing stage pipeline_stages=5/12 stage=MerkleUnwind checkpoint=43502607 target=43502607
reth-1  | 2024-10-28T07:58:01.620782Z  INFO reth_node_events::node: Finished stage pipeline_stages=5/12 stage=MerkleUnwind checkpoint=43502607 target=43502607
reth-1  | 2024-10-28T07:58:01.620812Z DEBUG ChainOrchestrator::poll: storage::db::mdbx: Commit total_duration=34.721µs commit_latency=Some(CommitLatency(MDBX_commit_latency { preparation: 0, gc_wallclock: 0, audit: 0, write: 0, sync: 0, ending: 0, whole: 1, gc_cputime: 0, gc_prof: MDBX_commit_latency__bindgen_ty_1 { wloops: 0, coalescences: 0, wipes: 0, flushes: 0, kicks: 0, work_counter: 0, work_rtime_monotonic: 0, work_xtime_cpu: 0, work_rsteps: 0, work_xpages: 0, work_majflt: 0, self_counter: 0, self_rtime_monotonic: 0, self_xtime_cpu: 0, self_rsteps: 0, self_xpages: 0, self_majflt: 0 } })) is_read_only=false
reth-1  | 2024-10-28T07:58:01.620871Z  INFO reth_node_events::node: Preparing stage pipeline_stages=6/12 stage=AccountHashing checkpoint=43502607 target=43502607
reth-1  | 2024-10-28T07:58:01.620902Z  INFO reth_node_events::node: Executing stage pipeline_stages=6/12 stage=AccountHashing checkpoint=43502607 target=43502607
reth-1  | 2024-10-28T07:58:01.620913Z  INFO reth_node_events::node: Finished stage pipeline_stages=6/12 stage=AccountHashing checkpoint=43502607 target=43502607 stage_progress=100.00%
reth-1  | 2024-10-28T07:58:01.620935Z DEBUG ChainOrchestrator::poll: storage::db::mdbx: Commit total_duration=29.259µs commit_latency=Some(CommitLatency(MDBX_commit_latency { preparation: 0, gc_wallclock: 0, audit: 0, write: 0, sync: 0, ending: 0, whole: 1, gc_cputime: 0, gc_prof: MDBX_commit_latency__bindgen_ty_1 { wloops: 0, coalescences: 0, wipes: 0, flushes: 0, kicks: 0, work_counter: 0, work_rtime_monotonic: 0, work_xtime_cpu: 0, work_rsteps: 0, work_xpages: 0, work_majflt: 0, self_counter: 0, self_rtime_monotonic: 0, self_xtime_cpu: 0, self_rsteps: 0, self_xpages: 0, self_majflt: 0 } })) is_read_only=false
reth-1  | 2024-10-28T07:58:01.620997Z  INFO reth_node_events::node: Preparing stage pipeline_stages=7/12 stage=StorageHashing checkpoint=43502607 target=43502607
reth-1  | 2024-10-28T07:58:01.621030Z  INFO reth_node_events::node: Executing stage pipeline_stages=7/12 stage=StorageHashing checkpoint=43502607 target=43502607
reth-1  | 2024-10-28T07:58:01.621045Z  INFO reth_node_events::node: Finished stage pipeline_stages=7/12 stage=StorageHashing checkpoint=43502607 target=43502607 stage_progress=100.00%
reth-1  | 2024-10-28T07:58:01.621061Z DEBUG ChainOrchestrator::poll: storage::db::mdbx: Commit total_duration=34.851µs commit_latency=Some(CommitLatency(MDBX_commit_latency { preparation: 0, gc_wallclock: 0, audit: 0, write: 0, sync: 0, ending: 0, whole: 1, gc_cputime: 0, gc_prof: MDBX_commit_latency__bindgen_ty_1 { wloops: 0, coalescences: 0, wipes: 0, flushes: 0, kicks: 0, work_counter: 0, work_rtime_monotonic: 0, work_xtime_cpu: 0, work_rsteps: 0, work_xpages: 0, work_majflt: 0, self_counter: 0, self_rtime_monotonic: 0, self_xtime_cpu: 0, self_rsteps: 0, self_xpages: 0, self_majflt: 0 } })) is_read_only=false
reth-1  | 2024-10-28T07:58:01.621124Z  INFO reth_node_events::node: Preparing stage pipeline_stages=8/12 stage=MerkleExecute checkpoint=43483190 target=43502607
reth-1  | 2024-10-28T07:58:01.621167Z  INFO reth_node_events::node: Executing stage pipeline_stages=8/12 stage=MerkleExecute checkpoint=43483190 target=43502607
reth-1  | 2024-10-28T07:58:01.621177Z DEBUG ChainOrchestrator::poll: sync::stages::merkle::exec: Continuing inner merkle checkpoint current=43483190 target=43502607 last_account_key=0x3620df6fc92405b5d17224648c3505bed9adf8fa828f4c6fc9595c52ba411da6
reth-1  | 2024-10-28T07:58:03.406974Z DEBUG ChainOrchestrator::poll: sync::stages::merkle::exec: Saving inner merkle checkpoint last_account_key=0x362ba6fa81d545f2993ec68ef9bc5d65db32c27ed011d7b8aa4ccc35eeb1084b
reth-1  | 2024-10-28T07:58:03.409984Z  INFO reth_node_events::node: Committed stage progress pipeline_stages=8/12 stage=MerkleExecute checkpoint=43483190 target=43502607 stage_progress=17.18%
reth-1  | 2024-10-28T07:58:03.467765Z DEBUG ChainOrchestrator::poll: storage::db::mdbx: Commit total_duration=57.905366ms commit_latency=Some(CommitLatency(MDBX_commit_latency { preparation: 0, gc_wallclock: 11, audit: 0, write: 0, sync: 3773, ending: 0, whole: 3785, gc_cputime: 0, gc_prof: MDBX_commit_latency__bindgen_ty_1 { wloops: 0, coalescences: 0, wipes: 0, flushes: 0, kicks: 0, work_counter: 0, work_rtime_monotonic: 0, work_xtime_cpu: 0, work_rsteps: 0, work_xpages: 0, work_majflt: 0, self_counter: 0, self_rtime_monotonic: 0, self_xtime_cpu: 0, self_rsteps: 0, self_xpages: 0, self_majflt: 0 } })) is_read_only=false
reth-1  | 2024-10-28T07:58:03.467944Z DEBUG ChainOrchestrator::poll: sync::stages::merkle::exec: Continuing inner merkle checkpoint current=43483190 target=43502607 last_account_key=0x362ba6fa81d545f2993ec68ef9bc5d65db32c27ed011d7b8aa4ccc35eeb1084b
reth-1  | 2024-10-28T07:58:03.467944Z  INFO reth_node_events::node: Preparing stage pipeline_stages=8/12 stage=MerkleExecute checkpoint=43483190 target=43502607
reth-1  | 2024-10-28T07:58:03.467982Z  INFO reth_node_events::node: Executing stage pipeline_stages=8/12 stage=MerkleExecute checkpoint=43483190 target=43502607
reth-1  | 2024-10-28T07:58:04.602239Z  INFO reth::cli: Status connected_peers=0 stage=MerkleExecute checkpoint=43483190 target=43502607 stage_progress=17.18%
reth-1  | 2024-10-28T07:58:04.602329Z DEBUG consensus::parlia: Fetching new header block_hash=Number(43483191)
reth-1  | 2024-10-28T07:58:07.604141Z DEBUG consensus::parlia: Fetching new header block_hash=Number(43483191)
reth-1  | 2024-10-28T07:58:10.605473Z DEBUG consensus::parlia: Fetching new header block_hash=Number(43483191)
reth-1  | 2024-10-28T07:58:13.163662Z DEBUG ChainOrchestrator::poll: sync::stages::merkle::exec: Saving inner merkle checkpoint last_account_key=0x362f37ec7a3c2fc0b9bacc06a02524a5f768ec8faee05abe585798660ed89343
reth-1  | 2024-10-28T07:58:13.168547Z  INFO reth_node_events::node: Committed stage progress pipeline_stages=8/12 stage=MerkleExecute checkpoint=43483190 target=43502607 stage_progress=17.20% stage_eta=11h 40m 9s
reth-1  | 2024-10-28T07:58:13.261961Z DEBUG ChainOrchestrator::poll: storage::db::mdbx: Commit total_duration=93.471609ms commit_latency=Some(CommitLatency(MDBX_commit_latency { preparation: 0, gc_wallclock: 10, audit: 0, write: 0, sync: 6108, ending: 0, whole: 6119, gc_cputime: 0, gc_prof: MDBX_commit_latency__bindgen_ty_1 { wloops: 0, coalescences: 0, wipes: 0, flushes: 0, kicks: 0, work_counter: 0, work_rtime_monotonic: 0, work_xtime_cpu: 0, work_rsteps: 0, work_xpages: 0, work_majflt: 0, self_counter: 0, self_rtime_monotonic: 0, self_xtime_cpu: 0, self_rsteps: 0, self_xpages: 0, self_majflt: 0 } })) is_read_only=false
reth-1  | 2024-10-28T07:58:13.262102Z  INFO reth_node_events::node: Preparing stage pipeline_stages=8/12 stage=MerkleExecute checkpoint=43483190 target=43502607
reth-1  | 2024-10-28T07:58:13.262106Z DEBUG ChainOrchestrator::poll: sync::stages::merkle::exec: Continuing inner merkle checkpoint current=43483190 target=43502607 last_account_key=0x362f37ec7a3c2fc0b9bacc06a02524a5f768ec8faee05abe585798660ed89343
reth-1  | 2024-10-28T07:58:13.262130Z  INFO reth_node_events::node: Executing stage pipeline_stages=8/12 stage=MerkleExecute checkpoint=43483190 target=43502607 stage_eta=11h 40m 9s
reth-1  | 2024-10-28T07:58:13.606733Z DEBUG consensus::parlia: Fetching new header block_hash=Number(43483191)
reth-1  | 2024-10-28T07:58:16.608128Z DEBUG consensus::parlia: Fetching new header block_hash=Number(43483191)
reth-1  | 2024-10-28T07:58:19.610110Z DEBUG consensus::parlia: Fetching new header block_hash=Number(43483191)
reth-1  | 2024-10-28T07:58:21.474396Z DEBUG ChainOrchestrator::poll: sync::stages::merkle::exec: Saving inner merkle checkpoint last_account_key=0x363637fbaabf5ec7895714a132e4bb634afcdb2aa83d342216995488bebf03db
reth-1  | 2024-10-28T07:58:21.481503Z  INFO reth_node_events::node: Committed stage progress pipeline_stages=8/12 stage=MerkleExecute checkpoint=43483190 target=43502607 stage_progress=17.22% stage_eta=9h 3m 32s
reth-1  | 2024-10-28T07:58:21.561822Z DEBUG ChainOrchestrator::poll: storage::db::mdbx: Commit total_duration=80.433583ms commit_latency=Some(CommitLatency(MDBX_commit_latency { preparation: 0, gc_wallclock: 7, audit: 0, write: 0, sync: 5256, ending: 0, whole: 5264, gc_cputime: 0, gc_prof: MDBX_commit_latency__bindgen_ty_1 { wloops: 0, coalescences: 0, wipes: 0, flushes: 0, kicks: 0, work_counter: 0, work_rtime_monotonic: 0, work_xtime_cpu: 0, work_rsteps: 0, work_xpages: 0, work_majflt: 0, self_counter: 0, self_rtime_monotonic: 0, self_xtime_cpu: 0, self_rsteps: 0, self_xpages: 0, self_majflt: 0 } })) is_read_only=false
reth-1  | 2024-10-28T07:58:21.561986Z DEBUG ChainOrchestrator::poll: sync::stages::merkle::exec: Continuing inner merkle checkpoint current=43483190 target=43502607 last_account_key=0x363637fbaabf5ec7895714a132e4bb634afcdb2aa83d342216995488bebf03db
reth-1  | 2024-10-28T07:58:21.562012Z  INFO reth_node_events::node: Preparing stage pipeline_stages=8/12 stage=MerkleExecute checkpoint=43483190 target=43502607
reth-1  | 2024-10-28T07:58:21.562042Z  INFO reth_node_events::node: Executing stage pipeline_stages=8/12 stage=MerkleExecute checkpoint=43483190 target=43502607 stage_eta=9h 3m 32s
reth-1  | 2024-10-28T07:58:22.612054Z DEBUG consensus::parlia: Fetching new header block_hash=Number(43483191)

Platform(s)

Linux (x86)

What version/commit are you on?

1.0.6-dev (6eddfde1)

What database version are you on?

-

Which chain / network are you on?

bsc

What type of node are you running?

Archive (default)

What prune config do you use, if any?

No response

If you've built Reth from source, provide the full command you used

No response

Code of Conduct

unclezoro commented 1 month ago

Thanks for reporting this. The data doesn't seem to match our testing; anyway, we will try to reproduce this.

wtdcode commented 1 month ago

> Thanks for reporting this. The data doesn't seem to match our testing; anyway, we will try to reproduce this.

An update: adding --optimize.skip-state-root-validation made bsc-reth catch up to the head in 2 hours and 30 minutes by executing 20k blocks.

pythonberg1997 commented 1 month ago
  1. consensus::parlia: Fetching new header block_hash=Number(43483191) is a spurious log message; the parlia engine is on standby while the node is doing stage sync.
  2. --optimize.skip-state-root-validation is for fast-node. It skips state root validation, as the name says.
  3. You can try modifying the reth.toml file in the datadir:

     [stages.merkle]
     clean_threshold = 20000

     Change it from the default of 5000 to 20000 or more.

pythonberg1997 commented 1 month ago

Reth will try to rebuild the merkle tree from scratch if the step size of the stage sync is bigger than clean_threshold. In that case, it will take ~10-20 hours to rebuild depending on machine spec; this processing time is not related to the synchronization height (20k in your case) but rather to the total chain height. If the step size is smaller than clean_threshold, reth will do the state root calculation incrementally, which is far faster and scales with the synchronization height.
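
To make the two paths concrete, here is a minimal sketch of that decision in Rust (illustrative only; the function and constants are not the actual reth code, just the behaviour described above):

    /// Illustrative sketch: how a clean_threshold-style check separates a full
    /// trie rebuild from an incremental state-root update.
    fn merkle_strategy(checkpoint: u64, target: u64, clean_threshold: u64) -> &'static str {
        let step = target.saturating_sub(checkpoint);
        if step > clean_threshold {
            // Gap too large: recompute the whole trie from the hashed tables.
            // Cost depends on the full state of the chain, not on `step`.
            "full rebuild"
        } else {
            // Gap small enough: walk only the changed accounts/storage slots.
            // Cost scales with `step`.
            "incremental"
        }
    }

    fn main() {
        // The gap in the logs above is 43_502_607 - 43_483_190 = 19_417 blocks.
        // With the default clean_threshold of 5000 this triggers a full rebuild;
        // raising the threshold above the gap keeps the calculation incremental.
        assert_eq!(merkle_strategy(43_483_190, 43_502_607, 5_000), "full rebuild");
        assert_eq!(merkle_strategy(43_483_190, 43_502_607, 250_000), "incremental");
    }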

wtdcode commented 1 month ago

> Reth will try to rebuild the merkle tree from scratch if the step size of the stage sync is bigger than clean_threshold. In that case, it will take ~10-20 hours to rebuild depending on machine spec; this processing time is not related to the synchronization height (20k in your case) but rather to the total chain height. If the step size is smaller than clean_threshold, reth will do the state root calculation incrementally, which is far faster and scales with the synchronization height.

Nice explanation. So I should set a higher clean_threshold for the account_hashing, storage_hashing and merkle stages to get incremental building?

wtdcode commented 1 month ago

This seems related to https://github.com/paradigmxyz/reth/pull/7364

Reth reduced the threshold to 5000 to avoid OOM. I believe I can set this much higher.

Maybe we should adjust this for bsc-reth?

pythonberg1997 commented 1 month ago

> > Reth will try to rebuild the merkle tree from scratch if the step size of the stage sync is bigger than clean_threshold. In that case, it will take ~10-20 hours to rebuild depending on machine spec; this processing time is not related to the synchronization height (20k in your case) but rather to the total chain height. If the step size is smaller than clean_threshold, reth will do the state root calculation incrementally, which is far faster and scales with the synchronization height.

> Nice explanation. So I should set a higher clean_threshold for the account_hashing, storage_hashing and merkle stages to get incremental building?

Just the merkle stage is enough in my experience. And be careful of OOM.

wtdcode commented 1 month ago

> > > Reth will try to rebuild the merkle tree from scratch if the step size of the stage sync is bigger than clean_threshold. In that case, it will take ~10-20 hours to rebuild depending on machine spec; this processing time is not related to the synchronization height (20k in your case) but rather to the total chain height. If the step size is smaller than clean_threshold, reth will do the state root calculation incrementally, which is far faster and scales with the synchronization height.

> > Nice explanation. So I should set a higher clean_threshold for the account_hashing, storage_hashing and merkle stages to get incremental building?

> Just the merkle stage is enough in my experience. And be careful of OOM.

Thanks for the hints!

I have adjusted clean_threshold to 250k in my case and allocated 512GB of memory for bsc-reth. Unfortunately, I truncated the tables when switching to fast-node, so I think I need to re-download the snapshot (if that is faster). I will report whether this works on my side in the coming weeks.