sigp / lighthouse

Ethereum consensus client in Rust
https://lighthouse.sigmaprime.io/
Apache License 2.0
2.9k stars 737 forks source link

Sync Stall On Lighthouse/v5.1.3-d7191c5 #5732

Open AgeManning opened 5 months ago

AgeManning commented 5 months ago

Description

Witnessed a sync stall.

It stall on batch 280494.

The logs show we have downloaded it it's awaiting, but after 280493 we don't seem to start processing it.

Relevant Logs:

May 06 23:41:49.781 DEBG Adding peer to known chain              id: 7155462887427261574, from: 280487, to: 281809, end_root: 0x8d05…335e, current_target: 280488, batches: 8, peers: 49, validated_batches: 33, state: Syncing, sync_type: Finalized, peer_id: 16Uiu2HAmN3bFuQJWaX2tDH5djyhDKGZnqLykaYLSSswNueVwQucM, service: sync
May 06 23:41:50.476 DEBG Batch processed                         service: sync, processed_blocks: 32, last_block_slot: 8975647, chain: 7155462887427261574, first_block_slot: 8975616, batch_epoch: 280488
May 06 23:41:50.476 DEBG Batch processing result                 client: Nimbus: version: unknown, os_version: unknown, batch_epoch: 280488, start_slot: 8975616, end_slot: 8975647, downloaded: 0, processed: 0, processed_no_penalty: 0, state: Processing(16Uiu2HAmFw6GhGLj8YqZTkPtAmHaKidfNCG8iCNYPYh99Zi55Nsk), batch_ty: blocks_and_blobs, result: Success { was_non_empty: true }, chain: 7155462887427261574, service: sync
May 06 23:41:50.476 DEBG Chain advanced                          processing_target: 280488, new_start: 280488, previous_start: 280487, chain: 7155462887427261574, service: sync
May 06 23:41:51.791 DEBG Completed batch received                awaiting_batches: 5, blocks: 32, epoch: 280494, chain: 7155462887427261574, service: sync
May 06 23:41:53.001 DEBG Slot timer                              sync_state: Syncing Finalized Chain, current_slot: 9017907, head_slot: 8975647, head_block: 0x3f44…1726, finalized_epoch: 280487, finalized_root: 0xa263…7640, peers: 49, service: slot_notifier
May 06 23:41:53.001 INFO Syncing                                 est_time: 1 hr 57 mins, speed: 6.00 slots/sec, distance: 42260 slots (5 days 20 hrs), peers: 49, service: slot_notifier
May 06 23:41:53.001 DEBG Eth1 cache sync status                  ready: true, voting_target_timestamp: 1715003479, latest_cached_timestamp: 1715019815, latest_cached_block_number: 19812804, eth1_head_block: 19814340, service: slot_notifier
May 06 23:41:57.000 DEBG Processed HTTP API request              method: GET, path: /eth/v1/node/syncing, status: 200 OK, elapsed: 98.504µs
May 06 23:41:58.004 DEBG Processed HTTP API request              method: GET, path: /eth/v1/node/syncing, status: 200 OK, elapsed: 407.37µs
May 06 23:41:58.965 DEBG Completed batch received                awaiting_batches: 0, blocks: 30, epoch: 280489, chain: 7155462887427261574, service: sync
May 06 23:41:59.006 DEBG Processed HTTP API request              method: GET, path: /eth/v1/node/syncing, status: 200 OK, elapsed: 77.538µs
May 06 23:42:02.914 DEBG Batch processed                         service: sync, processed_blocks: 30, last_block_slot: 8975679, chain: 7155462887427261574, first_block_slot: 8975648, batch_epoch: 280489
May 06 23:42:02.914 DEBG Batch processing result                 client: Prysm: version: Local build, os_version: unknown, batch_epoch: 280489, start_slot: 8975648, end_slot: 8975679, downloaded: 1, processed: 0, processed_no_penalty: 0, state: Processing(16Uiu2HAkv1Ey958GUXLKVDuwiqgxqVSkUHP9mFVva18TQ3K5b9bZ), batch_ty: blocks_and_blobs, result: Success { was_non_empty: true }, chain: 7155462887427261574, service: sync
May 06 23:42:02.914 DEBG Chain advanced                          processing_target: 280489, new_start: 280489, previous_start: 280488, chain: 7155462887427261574, service: sync
May 06 23:42:02.914 DEBG Sending BlocksByRange and BlobsByRange requests, peer: 16Uiu2HAmS4QNZDLgpTqY6B9EPAExKbdHCTuFcgYgQXmkANP2WDs2, count: 32, method: Mixed by range request, service: sync
May 06 23:42:02.914 DEBG Requesting batch                        start_slot: 8975840, end_slot: 8975871, downloaded: 0, processed: 0, processed_no_penalty: 0, state: Downloading(16Uiu2HAmS4QNZDLgpTqY6B9EPAExKbdHCTuFcgYgQXmkANP2WDs2, 0 blocks, 48), batch_ty: blocks_and_blobs, epoch: 280495, chain: 7155462887427261574, service: sync
May 06 23:42:05.004 DEBG Slot timer                              sync_state: Syncing Finalized Chain, current_slot: 9017908, head_slot: 8975679, head_block: 0x5aab…1b55, finalized_epoch: 280488, finalized_root: 0x5873…eba4, peers: 45, service: slot_notifier
May 06 23:42:05.004 INFO Syncing                                 est_time: 2 hrs 30 mins, speed: 4.67 slots/sec, distance: 42229 slots (5 days 20 hrs), peers: 45, service: slot_notifier
May 06 23:42:05.004 DEBG Eth1 cache sync status                  ready: true, voting_target_timestamp: 1715003479, latest_cached_timestamp: 1715019815, latest_cached_block_number: 19812804, eth1_head_block: 19814340, service: slot_notifier
May 06 23:42:06.037 DEBG Completed batch received                awaiting_batches: 5, blocks: 31, epoch: 280495, chain: 7155462887427261574, service: sync
May 06 23:42:06.037 DEBG Sending BlocksByRange and BlobsByRange requests, peer: 16Uiu2HAmFw6GhGLj8YqZTkPtAmHaKidfNCG8iCNYPYh99Zi55Nsk, count: 32, method: Mixed by range request, service: sync
May 06 23:42:06.037 DEBG Requesting batch                        start_slot: 8975872, end_slot: 8975903, downloaded: 0, processed: 0, processed_no_penalty: 0, state: Downloading(16Uiu2HAmFw6GhGLj8YqZTkPtAmHaKidfNCG8iCNYPYh99Zi55Nsk, 0 blocks, 49), batch_ty: blocks_and_blobs, epoch: 280496, chain: 7155462887427261574, service: sync
May 06 23:42:06.855 DEBG Batch processed                         service: sync, processed_blocks: 32, last_block_slot: 8975711, chain: 7155462887427261574, first_block_slot: 8975680, batch_epoch: 280490
May 06 23:42:06.855 DEBG Batch processing result                 client: Nimbus: version: unknown, os_version: unknown, batch_epoch: 280490, start_slot: 8975680, end_slot: 8975711, downloaded: 0, processed: 0, processed_no_penalty: 0, state: Processing(16Uiu2HAmLPj6x8vbPP8udDMHZYCoQk3M3XWECFog8iE9AE1kQoDh), batch_ty: blocks_and_blobs, result: Success { was_non_empty: true }, chain: 7155462887427261574, service: sync
May 06 23:42:06.855 DEBG Chain advanced                          processing_target: 280490, new_start: 280490, previous_start: 280489, chain: 7155462887427261574, service: sync
May 06 23:42:09.001 DEBG Processed HTTP API request              method: GET, path: /eth/v1/node/syncing, status: 200 OK, elapsed: 83.215µs
May 06 23:42:10.003 DEBG Processed HTTP API request              method: GET, path: /eth/v1/node/syncing, status: 200 OK, elapsed: 123.609µs
May 06 23:42:10.052 DEBG Finalization sync peer joined           peer_id: 16Uiu2HAkzXhqhZZmDE8VkqE2AV9361ozPh3Kag3KQsFWmP5NLXVM, service: sync
May 06 23:42:10.052 DEBG Adding peer to known chain              id: 7155462887427261574, from: 280490, to: 281809, end_root: 0x8d05…335e, current_target: 280491, batches: 7, peers: 44, validated_batches: 36, state: Syncing, sync_type: Finalized, peer_id: 16Uiu2HAkzXhqhZZmDE8VkqE2AV9361ozPh3Kag3KQsFWmP5NLXVM, service: sync
May 06 23:42:10.052 DEBG Sending BlocksByRange and BlobsByRange requests, peer: 16Uiu2HAkzXhqhZZmDE8VkqE2AV9361ozPh3Kag3KQsFWmP5NLXVM, count: 32, method: Mixed by range request, service: sync
May 06 23:42:10.052 DEBG Requesting batch                        start_slot: 8975904, end_slot: 8975935, downloaded: 0, processed: 0, processed_no_penalty: 0, state: Downloading(16Uiu2HAkzXhqhZZmDE8VkqE2AV9361ozPh3Kag3KQsFWmP5NLXVM, 0 blocks, 50), batch_ty: blocks_and_blobs, epoch: 280497, chain: 7155462887427261574, service: sync
May 06 23:42:10.686 DEBG Batch processed                         service: sync, processed_blocks: 32, last_block_slot: 8975743, chain: 7155462887427261574, first_block_slot: 8975712, batch_epoch: 280491
May 06 23:42:10.686 DEBG Batch processing result                 client: Prysm: version: 38f208d70dc95b12c08403f5c72009aaa10dfe2f, os_version: unknown, batch_epoch: 280491, start_slot: 8975712, end_slot: 8975743, downloaded: 0, processed: 0, processed_no_penalty: 0, state: Processing(16Uiu2HAmDvc7kJfccDycyoNgNuV6ekcQir9j7iJu79AyHXgbtztF), batch_ty: blocks_and_blobs, result: Success { was_non_empty: true }, chain: 7155462887427261574, service: sync
May 06 23:42:10.686 DEBG Chain advanced                          processing_target: 280491, new_start: 280491, previous_start: 280490, chain: 7155462887427261574, service: sync
May 06 23:42:11.003 DEBG Processed HTTP API request              method: GET, path: /eth/v1/node/syncing, status: 200 OK, elapsed: 31.589µs
May 06 23:42:12.711 DEBG Completed batch received                awaiting_batches: 4, blocks: 32, epoch: 280496, chain: 7155462887427261574, service: sync
May 06 23:42:12.711 DEBG Sending BlocksByRange and BlobsByRange requests, peer: 16Uiu2HAmNzj3AcpbLP2QUZUj8wREa1U1ecME857xgC84dYSzDUdQ, count: 32, method: Mixed by range request, service: sync
May 06 23:42:12.711 DEBG Requesting batch                        start_slot: 8975936, end_slot: 8975967, downloaded: 0, processed: 0, processed_no_penalty: 0, state: Downloading(16Uiu2HAmNzj3AcpbLP2QUZUj8wREa1U1ecME857xgC84dYSzDUdQ, 0 blocks, 51), batch_ty: blocks_and_blobs, epoch: 280498, chain: 7155462887427261574, service: sync
May 06 23:42:13.771 DEBG Range sync request failed               batch_id: Epoch(280497), chain_id: 7155462887427261574, batch_type: BlocksAndBlobs, request_id: 50, service: sync
May 06 23:42:13.771 DEBG Batch failed. RPC Error                 request_id: 50, peer_id: 16Uiu2HAkzXhqhZZmDE8VkqE2AV9361ozPh3Kag3KQsFWmP5NLXVM, batch_state: Downloading(16Uiu2HAkzXhqhZZmDE8VkqE2AV9361ozPh3Kag3KQsFWmP5NLXVM, 0 blocks, 50), batch_epoch: 280497, chain: 7155462887427261574, service: sync
May 06 23:42:13.771 DEBG Sending BlocksByRange and BlobsByRange requests, peer: 16Uiu2HAkv1Ey958GUXLKVDuwiqgxqVSkUHP9mFVva18TQ3K5b9bZ, count: 32, method: Mixed by range request, service: sync
May 06 23:42:13.771 DEBG Requesting batch                        start_slot: 8975904, end_slot: 8975935, downloaded: 1, processed: 0, processed_no_penalty: 0, state: Downloading(16Uiu2HAkv1Ey958GUXLKVDuwiqgxqVSkUHP9mFVva18TQ3K5b9bZ, 0 blocks, 52), batch_ty: blocks_and_blobs, epoch: 280497, chain: 7155462887427261574, service: sync
May 06 23:42:14.203 DEBG Range sync request failed               batch_type: BlocksAndBlobs, request_id: 50, service: sync
May 06 23:42:14.673 DEBG Batch processed                         service: sync, processed_blocks: 32, last_block_slot: 8975775, chain: 7155462887427261574, first_block_slot: 8975744, batch_epoch: 280492
May 06 23:42:14.673 DEBG Batch processing result                 client: Nimbus: version: unknown, os_version: unknown, batch_epoch: 280492, start_slot: 8975744, end_slot: 8975775, downloaded: 0, processed: 0, processed_no_penalty: 0, state: Processing(16Uiu2HAmFw6GhGLj8YqZTkPtAmHaKidfNCG8iCNYPYh99Zi55Nsk), batch_ty: blocks_and_blobs, result: Success { was_non_empty: true }, chain: 7155462887427261574, service: sync
May 06 23:42:14.673 DEBG Chain advanced                          processing_target: 280492, new_start: 280492, previous_start: 280491, chain: 7155462887427261574, service: sync
May 06 23:42:16.977 DEBG Completed batch received                awaiting_batches: 5, blocks: 32, epoch: 280498, chain: 7155462887427261574, service: sync
May 06 23:42:16.977 DEBG Sending BlocksByRange and BlobsByRange requests, peer: 16Uiu2HAmQbCd135BMQJT7FJybVwJva8uerrmUKWdoGbMpErSxrRJ, count: 32, method: Mixed by range request, service: sync
May 06 23:42:16.977 DEBG Requesting batch                        start_slot: 8975968, end_slot: 8975999, downloaded: 0, processed: 0, processed_no_penalty: 0, state: Downloading(16Uiu2HAmQbCd135BMQJT7FJybVwJva8uerrmUKWdoGbMpErSxrRJ, 0 blocks, 53), batch_ty: blocks_and_blobs, epoch: 280499, chain: 7155462887427261574, service: sync
May 06 23:42:17.000 DEBG Slot timer                              sync_state: Syncing Finalized Chain, current_slot: 9017909, head_slot: 8975775, head_block: 0x5f1f…0b90, finalized_epoch: 280491, finalized_root: 0x5065…dcb8, peers: 44, service: slot_notifier
May 06 23:42:17.000 INFO Syncing                                 est_time: 2 hrs 11 mins, speed: 5.33 slots/sec, distance: 42134 slots (5 days 20 hrs), peers: 44, service: slot_notifier
May 06 23:42:17.000 DEBG Eth1 cache sync status                  ready: true, voting_target_timestamp: 1715003479, latest_cached_timestamp: 1715019815, latest_cached_block_number: 19812804, eth1_head_block: 19814340, service: slot_notifier
May 06 23:42:18.513 DEBG Batch processed                         service: sync, processed_blocks: 32, last_block_slot: 8975807, chain: 7155462887427261574, first_block_slot: 8975776, batch_epoch: 280493
May 06 23:42:18.513 DEBG Batch processing result                 client: Prysm: version: a1a81d1720a0a3b850992d4825d0a023baa8e65a, os_version: unknown, batch_epoch: 280493, start_slot: 8975776, end_slot: 8975807, downloaded: 0, processed: 0, processed_no_penalty: 0, state: Processing(16Uiu2HAm6eBhfxPfkNZdRR5CpokWnF78zCit4vXZDDKDQ2oL7cDZ), batch_ty: blocks_and_blobs, result: Success { was_non_empty: true }, chain: 7155462887427261574, service: sync
May 06 23:42:18.513 DEBG Chain advanced                          processing_target: 280493, new_start: 280493, previous_start: 280492, chain: 7155462887427261574, service: sync
May 06 23:42:18.880 DEBG Peer transitioned sync state            is_connected: true, their_finalized_epoch: 281807, their_head_slot: 9017909, our_finalized_epoch: 280492, our_head_slot: 8975807, new_state: Advanced, peer_id: 16Uiu2HAm2aLZn6y7mGZxbF1UntkvWnv1SK7cpEcyJ7HE2RYNZQ1k, service: sync
May 06 23:42:18.880 DEBG Finalization sync peer joined           peer_id: 16Uiu2HAm2aLZn6y7mGZxbF1UntkvWnv1SK7cpEcyJ7HE2RYNZQ1k, service: sync
May 06 23:42:18.880 DEBG Adding peer to known chain              id: 7155462887427261574, from: 280493, to: 281809, end_root: 0x8d05…335e, current_target: 280494, batches: 7, peers: 44, validated_batches: 39, state: Syncing, sync_type: Finalized, peer_id: 16Uiu2HAm2aLZn6y7mGZxbF1UntkvWnv1SK7cpEcyJ7HE2RYNZQ1k, service: sync
May 06 23:42:18.880 DEBG Sending BlocksByRange and BlobsByRange requests, peer: 16Uiu2HAmN3bFuQJWaX2tDH5djyhDKGZnqLykaYLSSswNueVwQucM, count: 32, method: Mixed by range request, service: sync
May 06 23:42:18.880 DEBG Requesting batch                        start_slot: 8976000, end_slot: 8976031, downloaded: 0, processed: 0, processed_no_penalty: 0, state: Downloading(16Uiu2HAmN3bFuQJWaX2tDH5djyhDKGZnqLykaYLSSswNueVwQucM, 0 blocks, 54), batch_ty: blocks_and_blobs, epoch: 280500, chain: 7155462887427261574, service: sync
May 06 23:42:19.322 DEBG Peer transitioned sync state            is_connected: true, their_finalized_epoch: 281807, their_head_slot: 9017909, our_finalized_epoch: 280492, our_head_slot: 8975807, new_state: Advanced, peer_id: 16Uiu2HAmSaxj6VPkdn272uRSA31bMi5jGcsCpFPzBxHnA5NMthRb, service: sync
May 06 23:42:19.322 DEBG Finalization sync peer joined           peer_id: 16Uiu2HAmSaxj6VPkdn272uRSA31bMi5jGcsCpFPzBxHnA5NMthRb, service: sync
May 06 23:42:19.322 DEBG Adding peer to known chain              id: 7155462887427261574, from: 280493, to: 281809, end_root: 0x8d05…335e, current_target: 280494, batches: 8, peers: 44, validated_batches: 39, state: Syncing, sync_type: Finalized, peer_id: 16Uiu2HAmSaxj6VPkdn272uRSA31bMi5jGcsCpFPzBxHnA5NMthRb, service: sync
May 06 23:42:20.303 DEBG Peer transitioned sync state            is_connected: false, their_finalized_epoch: 281807, their_head_slot: 9017909, our_finalized_epoch: 280492, our_head_slot: 8975807, new_state: Advanced, peer_id: 16Uiu2HAm8WQMCYnKurYAH72txMjrk9CH9NDNgpcv5BLsyXwFPKio, service: sync
May 06 23:42:20.303 DEBG Peer transitioned sync state            is_connected: true, their_finalized_epoch: 281807, their_head_slot: 9017909, our_finalized_epoch: 280492, our_head_slot: 8975807, new_state: Advanced, peer_id: 16Uiu2HAmSCpFEs7ny4dibBU1uXEwdAmDWgRmZsc9oK6g33YPg3oN, service: sync
May 06 23:42:20.303 DEBG Finalization sync peer joined           peer_id: 16Uiu2HAmSCpFEs7ny4dibBU1uXEwdAmDWgRmZsc9oK6g33YPg3oN, service: sync
May 06 23:42:20.303 DEBG Adding peer to known chain              id: 7155462887427261574, from: 280493, to: 281809, end_root: 0x8d05…335e, current_target: 280494, batches: 8, peers: 45, validated_batches: 39, state: Syncing, sync_type: Finalized, peer_id: 16Uiu2HAmSCpFEs7ny4dibBU1uXEwdAmDWgRmZsc9oK6g33YPg3oN, service: sync
May 06 23:42:20.800 DEBG Completed batch received                awaiting_batches: 3, blocks: 32, epoch: 280497, chain: 7155462887427261574, service: sync
May 06 23:42:21.017 DEBG Processed HTTP API request              method: GET, path: /eth/v1/node/syncing, status: 200 OK, elapsed: 16.846475ms
May 06 23:42:22.866 DEBG Completed batch received                awaiting_batches: 6, blocks: 32, epoch: 280500, chain: 7155462887427261574, service: sync
May 06 23:42:23.497 DEBG Completed batch received                awaiting_batches: 5, blocks: 32, epoch: 280499, chain: 7155462887427261574, service: sync
May 06 23:42:29.001 DEBG Slot timer                              sync_state: Syncing Finalized Chain, current_slot: 9017910, head_slot: 8975807, head_block: 0xfe44…1ccc, finalized_epoch: 280492, finalized_root: 0x16fe…f9db, peers: 45, service: slot_notifier
May 06 23:42:29.001 INFO Syncing                                 est_time: 2 hrs 11 mins, speed: 5.33 slots/sec, distance: 42103 slots (5 days 20 hrs), peers: 45, service: slot_notifier
May 06 23:42:29.001 DEBG Eth1 cache sync status                  ready: true, voting_target_timestamp: 1715003479, latest_cached_timestamp: 1715019815, latest_cached_block_number: 19812804, eth1_head_block: 19814340, service: slot_notifier
May 06 23:42:41.000 DEBG Slot timer                              sync_state: Syncing Finalized Chain, current_slot: 9017911, head_slot: 8975807, head_block: 0xfe44…1ccc, finalized_epoch: 280492, finalized_root: 0x16fe…f9db, peers: 42, service: slot_notifier
May 06 23:42:41.000 INFO Syncing                                 est_time: 3 hrs 30 mins, speed: 3.33 slots/sec, distance: 42104 slots (5 days 20 hrs), peers: 42, service: slot_notifier
May 06 23:42:41.000 DEBG Eth1 cache sync status                  ready: true, voting_target_timestamp: 1715003479, latest_cached_timestamp: 1715019815, latest_cached_block_number: 19812804, eth1_head_block: 19814340, service: slot_notifier
May 06 23:42:53.000 DEBG Slot timer                              sync_state: Syncing Finalized Chain, current_slot: 9017912, head_slot: 8975807, head_block: 0xfe44…1ccc, finalized_epoch: 280492, finalized_root: 0x16fe…f9db, peers: 46, service: slot_notifier
May 06 23:42:53.000 INFO Syncing                                 est_time: 4 hrs 23 mins, speed: 2.67 slots/sec, distance: 42105 slots (5 days 20 hrs), peers: 46, service: slot_notifier
AgeManning commented 5 months ago

Further debugging logs if needed: more.log

AgeManning commented 5 months ago

I looked into this. This actually seems to be block processing. It seems to take up to a few minutes in order to process 1 epoch worth of blocks.

Sync typically recovers. The processing delay is something worth looking into at some point.

cc @michaelsproul - Kind weird its taking > 2-3mins to process a batch of blocks (this is a very fast machine).

michaelsproul commented 5 months ago

@AgeManning can you point out in the logs above where it's taking 2-3 mins to import a batch of blocks? All I see is consistent syncing at 3-6 slots/second, and some blocks importing in around 40-50ms:

May 06 23:42:21.692 DEBG Beacon block imported block_slot: 8975836, block_root: 0x1193966416747942a4849b41c56f3f9935f7e5ce898110d4b938b34cabd84e39, service: beacon May 06 23:42:21.741 DEBG Beacon block imported block_slot: 8975837, block_root: 0x6c644c9e48ffc47b6ab7ce610d9b108cfd67a73658706e31a0f62451761bd35b, service: beacon May 06 23:42:21.784 DEBG Beacon block imported block_slot: 8975838, block_root: 0xc1b33bec342e7769f5f6081c7163023c1ac7dcfcecbc4491c41c76f77fbabb0d, service: beacon

Even at 3 slots/second, a batch of 32 blocks should take around 10s

AgeManning commented 5 months ago

Sent out of band