lambdaclass / lambda_ethereum_consensus

Elixir implementation of an Ethereum consensus client, which offers high reliance & fault tolerance
Apache License 2.0
101 stars 33 forks source link

Root not found in tree error hangs the node (Sepolia long run) #1309

Closed rodrigo-o closed 1 month ago

rodrigo-o commented 1 month ago

This is something that could happen along the node executions but is easily reproducible on the initial setup. After the initial sync and some time running (a couple of minutes) an initial pruning kicks in and for some reason it generates a slot in the future error, here is an example log:

INFO 17:43:12.001 [StateDb] Pruning started. slot=5963552
INFO 17:43:12.001 [BlobDb] Pruning started. slot=5832480
INFO 17:43:12.001 [BlockDb] Pruning started. slot=5963552
INFO 17:43:12.001 [BlobDb] Pruning finished. 0 blobs removed. 
INFO 17:43:12.003 [Libp2p] Slot transition slot=5963616
INFO 17:43:12.009 [BlockDb] Pruning finished. 52 blocks removed. 
INFO 17:43:12.019 [StateDb] Pruning finished. 52 states removed. 
INFO 17:43:12.684 [Gossip] Block received, block.slot: 5963616. 
INFO 17:43:24.002 [Libp2p] Slot transition slot=5963617
INFO 17:43:24.216 [Gossip] Block received, block.slot: 5963617. 
INFO 17:43:29.977 [Fork choice] Adding new block slot=5963613 root=0x3de..fc05
INFO 17:43:30.204 [Fork choice] Block processed. Recomputing head. 
ERROR 17:43:30.204 GenServer LambdaEthereumConsensus.Libp2pPort terminating
** (RuntimeError) Parent 14C35B27D611342466C990ABE85147AFCDA13DBE2D5C05C4B3155309CF94FFFD not found in tree
    (lambda_ethereum_consensus 0.1.0) lib/lambda_ethereum_consensus/fork_choice/simple_tree.ex:84: LambdaEthereumConsensus.ForkChoice.Simple.Tree.get_children!/2
    (lambda_ethereum_consensus 0.1.0) lib/types/store.ex:147: Types.Store.get_children/2
    (lambda_ethereum_consensus 0.1.0) lib/lambda_ethereum_consensus/fork_choice/head.ex:90: LambdaEthereumConsensus.ForkChoice.Head.filter_block_tree/4
    (lambda_ethereum_consensus 0.1.0) lib/lambda_ethereum_consensus/fork_choice/head.ex:85: LambdaEthereumConsensus.ForkChoice.Head.get_filtered_block_tree/1
    (lambda_ethereum_consensus 0.1.0) lib/lambda_ethereum_consensus/fork_choice/head.ex:14: LambdaEthereumConsensus.ForkChoice.Head.get_head/1
    (lambda_ethereum_consensus 0.1.0) lib/lambda_ethereum_consensus/fork_choice/fork_choice.ex:269: LambdaEthereumConsensus.ForkChoice.recompute_head/1
    (lambda_ethereum_consensus 0.1.0) lib/lambda_ethereum_consensus/fork_choice/fork_choice.ex:62: anonymous fn/1 in LambdaEthereumConsensus.ForkChoice.on_block/2
    (telemetry 1.3.0) /home/admin/lambda_ethereum_consensus/deps/telemetry/src/telemetry.erl:324: :telemetry.span/3
Last message: {#Port<0.16>, {:data, <<50, 145, 166, 1, 10, 36, 55, 100, 49, 52, 49, 57, 53, 49, 45, 97, 100, 56, 50, 45, 52, 50, 101, 102, 45, 97, 102, 102, 101, 45, 48, 57, 102, 54, 50, 55, 101, 97, 57, 56, 102, 101, 16, 1, 26, 229, ...>>}} 
INFO 17:43:30.227 [Optimistic Sync] Waiting 10.0 seconds to discover some peers before requesting blocks. 
INFO 17:43:30.227 [Fork choice] Adding new block slot=5963613 root=0x3de..fc05
ERROR 17:43:30.280 [Fork choice] Failed to add block: block is from the future slot=5963613 root=0x3de..fc05
ERROR 17:43:30.280 [PendingBlocks] Saving block as invalid block is from the future slot=5963613 root=0x3de..fc05
INFO 17:43:36.001 [Libp2p] Slot transition slot=5963618
INFO 17:43:40.228 [Optimistic sync] Performing optimistic sync between slots 5963489 and 5963618, for a total of 130 slots. 

This happens to kick in sync again and and may trigger #1308 issue.

rodrigo-o commented 1 month ago

This was not related to pruning as I originally thought. Here is a particular example where pruning wasn't involved. Something to be mindful about is that the error was pointing to a block that we added at the start of the following log, and that was correctly processed and not reorged in Sepolia (https://sepolia.etherscan.io/block/6790864#consensusinfo)

INFO 21:08:40.621 [Fork choice] Adding new block slot=6000639 root=0xb3a..dbe9
INFO 21:08:40.775 [Fork choice] Block processed. Recomputing head. 
INFO 21:08:40.785 [Fork choice] Added new block slot=6000639 root=0xb3a..dbe9
INFO 21:08:40.786 [Fork choice] Adding new block slot=6000640 root=0xcae..a8f3
INFO 21:08:41.061 [Fork choice] Block processed. Recomputing head. 
INFO 21:08:41.067 [Fork choice] Added new block slot=6000640 root=0xcae..a8f3
INFO 21:08:41.069 [Fork choice] Adding new block slot=6000641 root=0xe4e..0178
INFO 21:08:41.070 [Block processing] Computing committees for epoch 187520 
INFO 21:08:41.264 [Fork choice] Block processed. Recomputing head. 
INFO 21:08:41.270 [Fork choice] Added new block slot=6000641 root=0xe4e..0178
INFO 21:08:41.270 [Fork choice] Adding new block slot=6000642 root=0x179..fdb5
INFO 21:08:41.430 [Fork choice] Block processed. Recomputing head. 
INFO 21:08:41.434 [Fork choice] Added new block slot=6000642 root=0x179..fdb5

....

INFO 21:14:12.008 [Libp2p] Slot transition slot=6000671
INFO 21:14:12.585 [Gossip] Block received, block.slot: 6000671. 
INFO 21:14:12.596 [Fork choice] Adding new block slot=6000671 root=0x8c6..d57e
INFO 21:14:12.791 [Fork choice] Block processed. Recomputing head. 
INFO 21:14:12.797 [Fork choice] Added new block slot=6000671 root=0x8c6..d57e
INFO 21:14:24.004 [Libp2p] Slot transition slot=6000672
INFO 21:14:27.545 [Gossip] Block received, block.slot: 6000672. 
INFO 21:14:27.718 [Fork choice] Adding new block slot=6000672 root=0x611..070f
INFO 21:14:28.073 [Fork choice] Block processed. Recomputing head. 
ERROR 21:14:28.076 GenServer LambdaEthereumConsensus.Libp2pPort terminating
** (RuntimeError) Parent CAE1F13DBD7FAF2177FDEE036E00B1D59319D380A9DC859478940CB1B0FCA8F3 not found in tree
    (lambda_ethereum_consensus 0.1.0) lib/lambda_ethereum_consensus/fork_choice/simple_tree.ex:84: LambdaEthereumConsensus.ForkChoice.Simple.Tree.get_children!/2
    (lambda_ethereum_consensus 0.1.0) lib/types/store.ex:148: Types.Store.get_children/2
    (lambda_ethereum_consensus 0.1.0) lib/lambda_ethereum_consensus/fork_choice/head.ex:90: LambdaEthereumConsensus.ForkChoice.Head.filter_block_tree/4
    (lambda_ethereum_consensus 0.1.0) lib/lambda_ethereum_consensus/fork_choice/head.ex:85: LambdaEthereumConsensus.ForkChoice.Head.get_filtered_block_tree/1
    (lambda_ethereum_consensus 0.1.0) lib/lambda_ethereum_consensus/fork_choice/head.ex:14: LambdaEthereumConsensus.ForkChoice.Head.get_head/1
    (lambda_ethereum_consensus 0.1.0) lib/lambda_ethereum_consensus/fork_choice/fork_choice.ex:272: LambdaEthereumConsensus.ForkChoice.recompute_head/1
    (lambda_ethereum_consensus 0.1.0) lib/lambda_ethereum_consensus/fork_choice/fork_choice.ex:62: anonymous fn/1 in LambdaEthereumConsensus.ForkChoice.on_block/2
    (telemetry 1.3.0) /Users/roliveri/prog/elixir/consensus/lambda_ethereum_consensus/deps/telemetry/src/telemetry.erl:324: :telemetry.span/3
Last message: {#Port<0.19>, {:data, <<10, 152, 76, 10, 40, 47, 101, 116, 104, 50, 47, 100, 51, 49, 102, 54, 49, 57, 49, 47, 98, 108, 111, 98, 95, 115, 105, 100, 101, 99, 97, 114, 95, 49, 47, 115, 115, 122, 95, 115, 110, 97, 112, 112, 121, 26, ...>>}} 
INFO 21:14:28.106 [Optimistic Sync] Waiting 10.0 seconds to discover some peers before requesting blocks. 
INFO 21:14:28.107 [Fork choice] Adding new block slot=6000672 root=0x611..070f
rodrigo-o commented 1 month ago

In #1316 I was able to check that the store is in al old state and that cause the setting of the block as invalid:

INFO 16:00:48.604 [Fork choice] Adding new block slot=6013504 root=0xaa1..c455
INFO 16:00:48.802 [Fork choice] Block processed. Recomputing head. 
ERROR 16:00:48.805 GenServer LambdaEthereumConsensus.Libp2pPort terminating
** (RuntimeError) Parent 8C7A76877C9CD8052DFF5995378515B9140DF11467672AEFFA7D0E3A1B9AE05B not found in tree
    (lambda_ethereum_consensus 0.1.0) lib/lambda_ethereum_consensus/fork_choice/simple_tree.ex:84: LambdaEthereumConsensus.ForkChoice.Simple.Tree.get_children!/2
    (lambda_ethereum_consensus 0.1.0) lib/types/store.ex:144: Types.Store.get_children/2
    (lambda_ethereum_consensus 0.1.0) lib/lambda_ethereum_consensus/fork_choice/head.ex:90: LambdaEthereumConsensus.ForkChoice.Head.filter_block_tree/4
    (lambda_ethereum_consensus 0.1.0) lib/lambda_ethereum_consensus/fork_choice/head.ex:85: LambdaEthereumConsensus.ForkChoice.Head.get_filtered_block_tree/1
    (lambda_ethereum_consensus 0.1.0) lib/lambda_ethereum_consensus/fork_choice/head.ex:14: LambdaEthereumConsensus.ForkChoice.Head.get_head/1
    (lambda_ethereum_consensus 0.1.0) iex:296: LambdaEthereumConsensus.ForkChoice.recompute_head/1
    (lambda_ethereum_consensus 0.1.0) iex:62: anonymous fn/1 in LambdaEthereumConsensus.ForkChoice.on_block/2
    (telemetry 1.3.0) /Users/roliveri/prog/elixir/consensus/lambda_ethereum_consensus/deps/telemetry/src/telemetry.erl:324: :telemetry.span/3
Last message: {#Port<0.19>, {:data, <<10, 174, 186, 1, 10, 38, 47, 101, 116, 104, 50, 47, 100, 51, 49, 102, 54, 49, 57, 49, 47, 98, 101, 97, 99, 111, 110, 95, 98, 108, 111, 99, 107, 47, 115, 115, 122, 95, 115, 110, 97, 112, 112, 121, 26, 20, ...>>}} 
INFO 16:00:48.828 [Optimistic Sync] Waiting 10.0 seconds to discover some peers before requesting blocks. 
INFO 16:00:48.828 [Fork choice] Adding new block slot=6013504 root=0xaa1..c455
INFO 16:00:48.881 Store slot: 6013457 is in the past, slot: 6013504 is in the future. Current chain slot: 6013504 
ERROR 16:00:48.881 [Fork choice] Failed to add block: block is from the future slot=6013504 root=0xaa1..c455
ERROR 16:00:48.882 [PendingBlocks] Saving block as invalid block is from the future slot=6013504 root=0xaa1..c455

Either way the root cause is the same, the tree manage in the Recomputing head

rodrigo-o commented 1 month ago

Resolved by #1318