hyperledger / besu

An enterprise-grade Java-based, Apache 2.0 licensed Ethereum client https://wiki.hyperledger.org/display/besu
https://www.hyperledger.org/projects/besu
Apache License 2.0
1.5k stars 822 forks source link

Large blocks that take over 3-4s to import trigger a backward sync session causing a 6-8 block gaps each time #5316

Closed ibhagwan closed 1 year ago

ibhagwan commented 1 year ago

Description

When a larger than usual block is imported (with > 90% gas utilization), the block import time will be unusually long (unsure why, as this doesn't happen with all > 90% gas blocks), when this happens Besu will trigger a new backward sync session causing besu to be out of sync for a short period of time (6-8) blocks resulting in missed attestations.

Expected behavior: Since blocks are roughly every ~12s I'm not sure why Besu needs a new backward sync session after being delayed for what seems like 3-5s tops.

Few examples (logs are otherwise clean of warnings/errors):

2023-04-02 23:40:39.093-04:00 | vert.x-worker-thread-0 | INFO  | EngineNewPayload | Imported #16,965,905 / 149 tx / base fee 17.67 gwei / 29,985,480 (100.0%) gas / (0x7bdfa719422f03d9026ef30518394700f3585767d9e7c59b1ae0951c9f9d064f) in 1.575s. Peers: 82
2023-04-02 23:41:48.313-04:00 | vert.x-worker-thread-0 | INFO  | BackwardSyncContext | Starting a new backward sync session
2023-04-02 23:41:48.605-04:00 | nioEventLoopGroup-3-1 | INFO  | BackwardSyncStep | Backward sync phase 1 of 2 completed, downloaded a total of 200 headers. Peers: 83
2023-04-02 23:41:49.450-04:00 | nioEventLoopGroup-3-1 | INFO  | BackwardSyncContext | Backward sync phase 2 of 2, 16.67% completed, imported 1 blocks of at least 6 (current head 16965906, target head 16965911). Peers: 82
2023-04-02 23:41:52.252-04:00 | ForkJoinPool.commonPool-worker-1 | INFO  | BackwardSyncContext | Backward sync phase 2 of 2 completed, imported a total of 6 blocks. Peers: 82
2023-04-02 23:41:52.253-04:00 | ForkJoinPool.commonPool-worker-1 | INFO  | BackwardsSyncAlgorithm | Current backward sync session is done
2023-04-02 23:42:00.144-04:00 | vert.x-worker-thread-0 | INFO  | EngineNewPayload | Imported #16,965,912 / 48 tx / base fee 17.74 gwei / 5,404,592 (18.0%) gas / (0x720e1b57db756f25e63c495d91c7cdbc83f7d59b7d1fec0aa4e3608a6c333dde) in 0.281s. Peers: 82
2023-04-02 23:56:39.574-04:00 | vert.x-worker-thread-0 | INFO  | EngineNewPayload | Imported #16,965,984 / 135 tx / base fee 16.50 gwei / 26,929,761 (89.8%) gas / (0x66a194614e43d2c7a4850bfb98579685bf5cc8442a4088fea422e243222edeca) in 3.381s. Peers: 83
2023-04-02 23:57:47.802-04:00 | vert.x-worker-thread-0 | INFO  | BackwardSyncContext | Starting a new backward sync session
2023-04-02 23:57:48.115-04:00 | nioEventLoopGroup-3-1 | INFO  | BackwardSyncStep | Backward sync phase 1 of 2 completed, downloaded a total of 200 headers. Peers: 83
2023-04-02 23:57:48.979-04:00 | nioEventLoopGroup-3-1 | INFO  | BackwardSyncContext | Backward sync phase 2 of 2, 16.67% completed, imported 1 blocks of at least 6 (current head 16965985, target head 16965990). Peers: 83
2023-04-02 23:57:53.225-04:00 | ForkJoinPool.commonPool-worker-4 | INFO  | BackwardSyncContext | Backward sync phase 2 of 2 completed, imported a total of 6 blocks. Peers: 84
2023-04-02 23:57:53.225-04:00 | ForkJoinPool.commonPool-worker-4 | INFO  | BackwardsSyncAlgorithm | Current backward sync session is done
2023-04-02 23:58:01.087-04:00 | vert.x-worker-thread-0 | INFO  | EngineNewPayload | Imported #16,965,991 / 114 tx / base fee 18.48 gwei / 15,729,908 (52.4%) gas / (0x7cac53cf47feecb668df7915bb415f6d6c8a4df2514a3798ae982cd381c656f7) in 0.704s. Peers: 83
2023-04-03 02:51:41.278-04:00 | vert.x-worker-thread-0 | INFO  | EngineNewPayload | Imported #16,966,847 / 172 tx / base fee 16.06 gwei / 29,997,876 (100.0%) gas / (0x24d17b0e3e1c937247fef2bfd7bf553aacce477b3d36e82dcdcdc96888189263) in 4.017s. Peers: 86
2023-04-03 02:52:49.550-04:00 | vert.x-worker-thread-0 | INFO  | BackwardSyncContext | Starting a new backward sync session
2023-04-03 02:52:50.393-04:00 | nioEventLoopGroup-3-10 | INFO  | BackwardSyncStep | Backward sync phase 1 of 2 completed, downloaded a total of 200 headers. Peers: 86
2023-04-03 02:52:51.595-04:00 | nioEventLoopGroup-3-10 | INFO  | BackwardSyncContext | Backward sync phase 2 of 2, 16.67% completed, imported 1 blocks of at least 6 (current head 16966848, target head 16966853). Peers: 86
2023-04-03 02:52:56.672-04:00 | ForkJoinPool.commonPool-worker-6 | INFO  | BackwardSyncContext | Backward sync phase 2 of 2 completed, imported a total of 6 blocks. Peers: 86
2023-04-03 02:52:56.672-04:00 | ForkJoinPool.commonPool-worker-6 | INFO  | BackwardsSyncAlgorithm | Current backward sync session is done
2023-04-03 02:53:00.767-04:00 | vert.x-worker-thread-0 | INFO  | EngineNewPayload | Imported #16,966,854 / 153 tx / base fee 18.93 gwei / 10,487,277 (35.0%) gas / (0xf01053e3fbe8933990d5c51728b045108939130d443845082a85a11ce6a4e730) in 0.516s. Peers: 86
2023-04-05 02:39:53.515-04:00 | vert.x-worker-thread-0 | INFO  | EngineNewPayload | Imported #16,980,870 / 395 tx / base fee 27.00 gwei / 29,987,418 (100.0%) gas / (0x4d7652e0f19f7d554af7e3ae2083713b1706c0ee5fe0b40a384238e98eac3be8) in 4.095s. Peers: 72
2023-04-05 02:41:00.183-04:00 | vert.x-worker-thread-0 | INFO  | BackwardSyncContext | Starting a new backward sync session
2023-04-05 02:41:08.083-04:00 | nioEventLoopGroup-3-3 | INFO  | BackwardSyncStep | Backward sync phase 1 of 2 completed, downloaded a total of 200 headers. Peers: 74
2023-04-05 02:41:11.036-04:00 | nioEventLoopGroup-3-3 | INFO  | BackwardSyncContext | Backward sync phase 2 of 2, 16.67% completed, imported 1 blocks of at least 6 (current head 16980871, target head 16980876). Peers: 73
2023-04-05 02:41:15.637-04:00 | ForkJoinPool.commonPool-worker-8 | INFO  | BackwardSyncContext | Backward sync phase 2 of 2 completed, imported a total of 7 blocks. Peers: 73
2023-04-05 02:41:15.637-04:00 | ForkJoinPool.commonPool-worker-8 | INFO  | BackwardsSyncAlgorithm | Current backward sync session is done
2023-04-05 02:41:26.162-04:00 | vert.x-worker-thread-0 | INFO  | EngineNewPayload | Imported #16,980,878 / 123 tx / base fee 30.30 gwei / 14,577,888 (48.6%) gas / (0x3a2727c14294d95532168c27e88d2e7ee90a75e1330fbab6eae2ae344c8010ba) in 0.859s. Peers: 73

Important to note that block import times are usually < 1-2s even when blocks are >90% utlization, so I'm not sure why some blocks are so expensive, for example:

2023-04-05 11:55:48.778-04:00 | vert.x-worker-thread-0 | INFO  | EngineForkchoiceUpdated | VALID for fork-choice-update: head: 0x1b3817526a08c5575e7e3eaf2cacb2f916bfab7c543ac6f9bb080ef12fff8e64, finalized: 0xc1c9c3c3fe95c3ae6ffc0308b8d0b8670548b16e51c88c85fd07ef8eb186a8c9, safeBlockHash: 0x13500199ec22220a30a15f2ce98c42b0a4e1691f44d0982880d7cfac81d73046
2023-04-05 11:56:02.580-04:00 | vert.x-worker-thread-0 | INFO  | EngineNewPayload | Imported #16,983,601 / 190 tx / base fee 33.52 gwei / 16,656,164 (55.5%) gas / (0x72cfbf942728109f84fc0dbd04bfaa13563a2ef82b1f6538bfb79d677bb68d24) in 0.809s. Peers: 70
2023-04-05 11:56:10.764-04:00 | vert.x-worker-thread-0 | INFO  | EngineNewPayload | Imported #16,983,602 / 0 tx / base fee 33.98 gwei / 0 (0.0%) gas / (0xcd3031e0782b1f20084113691f3272d5875ee6d808dc0d59f8140ef2130387e5) in 0.045s. Peers: 71
2023-04-05 11:56:26.781-04:00 | vert.x-worker-thread-0 | INFO  | EngineNewPayload | Imported #16,983,603 / 299 tx / base fee 29.73 gwei / 29,980,191 (99.9%) gas / (0x69f8584830846cd16cef332b9cbdba361a3d1bf9b5f20af901362f469fabb99b) in 1.771s. Peers: 71
2023-04-05 11:56:37.428-04:00 | vert.x-worker-thread-0 | INFO  | EngineNewPayload | Imported #16,983,604 / 162 tx / base fee 33.44 gwei / 17,478,256 (58.3%) gas / (0x842119b984e113c2a17b9f605dc3011f5e666aa5ab916a65c01214c378742e07) in 1.023s. Peers: 71
2023-04-05 11:56:49.184-04:00 | vert.x-worker-thread-0 | INFO  | EngineNewPayload | Imported #16,983,605 / 140 tx / base fee 34.13 gwei / 11,426,213 (38.1%) gas / (0x01d567931b8638e987f40a0793fbf5becd322eb0145eea13cabdcff403ac1162) in 0.687s. Peers: 70
2023-04-05 11:56:49.450-04:00 | vert.x-worker-thread-0 | INFO  | EngineForkchoiceUpdated | VALID for fork-choice-update: head: 0x01d567931b8638e987f40a0793fbf5becd322eb0145eea13cabdcff403ac1162, finalized: 0xc1c9c3c3fe95c3ae6ffc0308b8d0b8670548b16e51c88c85fd07ef8eb186a8c9, safeBlockHash: 0x13500199ec22220a30a15f2ce98c42b0a4e1691f44d0982880d7cfac81d73046
2023-04-05 11:57:03.950-04:00 | vert.x-worker-thread-0 | INFO  | EngineNewPayload | Imported #16,983,606 / 222 tx / base fee 33.12 gwei / 25,296,048 (84.3%) gas / (0xa921596a8c96c1820413b04ba3b2e4bd38ff673e2d8184530106c7a20cc4561c) in 1.286s. Peers: 71
2023-04-05 11:57:26.114-04:00 | vert.x-worker-thread-0 | INFO  | EngineNewPayload | Imported #16,983,607 / 337 tx / base fee 35.96 gwei / 29,343,421 (97.8%) gas / (0x24f7db52bdea5c9f3e4ba8b94ad1d0d9bb12444775007705deb693c1b00a1a3e) in 1.288s. Peers: 71
2023-04-05 11:57:38.292-04:00 | vert.x-worker-thread-0 | INFO  | EngineNewPayload | Imported #16,983,608 / 162 tx / base fee 40.26 gwei / 20,595,909 (68.7%) gas / (0x84a12cfa1af56904aabd4c8f83092a677c473f95fbe170271c6f42d8097438c1) in 1.103s. Peers: 71
2023-04-05 11:57:48.772-04:00 | vert.x-worker-thread-0 | INFO  | EngineNewPayload | Imported #16,983,609 / 126 tx / base fee 42.13 gwei / 9,419,915 (31.4%) gas / (0x1afe84046124cb90d769cfca4eb544fd4c75ea7566f81390bc59dcb01250fe35) in 0.584s. Peers: 71
2023-04-05 11:58:01.753-04:00 | vert.x-worker-thread-0 | INFO  | EngineNewPayload | Imported #16,983,610 / 116 tx / base fee 40.17 gwei / 10,804,207 (36.0%) gas / (0x90605b19cb84a18e935480d18e61a040ab9481522e0262b19f74059ffa99e641) in 0.617s. Peers: 71
2023-04-05 11:58:02.044-04:00 | vert.x-worker-thread-0 | INFO  | EngineForkchoiceUpdated | VALID for fork-choice-update: head: 0x90605b19cb84a18e935480d18e61a040ab9481522e0262b19f74059ffa99e641, finalized: 0x13500199ec22220a30a15f2ce98c42b0a4e1691f44d0982880d7cfac81d73046, safeBlockHash: 0x389d95e7ca8b9002e97cf5ce62520ecc0d22aa3752c8f5867c3e27857976b85b
2023-04-05 11:58:12.822-04:00 | vert.x-worker-thread-0 | INFO  | EngineNewPayload | Imported #16,983,611 / 149 tx / base fee 38.77 gwei / 13,148,715 (43.8%) gas / (0x436c11609cbedf93d460235ea92eb1d05faa2e61aad3eee76b1d1a3dc62b6da8) in 0.759s. Peers: 71
2023-04-05 11:58:26.389-04:00 | vert.x-worker-thread-0 | INFO  | EngineNewPayload | Imported #16,983,612 / 159 tx / base fee 38.17 gwei / 19,432,648 (64.8%) gas / (0xd4dd9f9fa1de73cf167d19f1bd824ade5699c3ea414707e29f9ac29ff72459b0) in 0.876s. Peers: 71
2023-04-05 11:58:36.590-04:00 | vert.x-worker-thread-0 | INFO  | EngineNewPayload | Imported #16,983,613 / 148 tx / base fee 39.58 gwei / 12,038,555 (40.1%) gas / (0x0f7c4f741ca802210d720546618de3fb342f41988113db460f8cc94514f8c44e) in 0.583s. Peers: 71
2023-04-05 11:58:49.654-04:00 | vert.x-worker-thread-0 | INFO  | EngineNewPayload | Imported #16,983,614 / 147 tx / base fee 38.60 gwei / 10,960,702 (36.5%) gas / (0x8296a5afb864af22e8bcfecb7e227b9d7339492c38b6006c2ed1614fc367a55a) in 0.603s. Peers: 71
2023-04-05 11:59:01.759-04:00 | vert.x-worker-thread-0 | INFO  | EngineNewPayload | Imported #16,983,615 / 156 tx / base fee 37.30 gwei / 13,561,518 (45.2%) gas / (0x52a3e62dd169f175d5e173af60f07131862e8df8edc5b87eae82c01d10e2d710) in 0.806s. Peers: 71
2023-04-05 11:59:02.103-04:00 | vert.x-worker-thread-0 | INFO  | EngineForkchoiceUpdated | VALID for fork-choice-update: head: 0x52a3e62dd169f175d5e173af60f07131862e8df8edc5b87eae82c01d10e2d710, finalized: 0x13500199ec22220a30a15f2ce98c42b0a4e1691f44d0982880d7cfac81d73046, safeBlockHash: 0x389d95e7ca8b9002e97cf5ce62520ecc0d22aa3752c8f5867c3e27857976b85b

Actual behavior: Besu triggers a backward sync session resulting in being out-of-sync for 6-8 blocks.

Frequency: 3-4 times roughly every 24h.

Versions (Add all that apply)

eth2 specification v1.3.0-rc.5

Nim Compiler Version 1.6.12 [Linux: amd64]


### Additional Information (Add any of the following or anything else that may be relevant)
16GB RAM using `JAVA_OPTS=-Xmx5g` and `Xplugin-rocksdb-high-spec-enabled = true`

####################################################################################################

Besu 23.1.2

Configuration:

Network: Mainnet

Network Id: 1

Data storage: Bonsai

Sync mode: Checkpoint

RPC HTTP APIs: ADMIN,ETH,NET,DEBUG,TXPOOL,WEB3

RPC HTTP port: 8545

Engine APIs: ENGINE,ETH

Engine port: 8551

High spec configuration enabled

Host:

Java: openjdk-java-17

Maximum heap size: 5.00 GB

OS: linux-x86_64

glibc: 2.36

Total memory: 15.63 GB

CPU cores: 4

####################################################################################################

non-fungible-nelson commented 1 year ago

@fab-10 maybe we are too aggressive in triggering the new BWS format.

non-fungible-nelson commented 1 year ago

Thank you for the detailed report! Also drop that max peer count ;)

ibhagwan commented 1 year ago

Thank you for the detailed report! Also drop that max peer count ;)

Ty!

It's currently set to max-peers = 100, what's the recommended value?

non-fungible-nelson commented 1 year ago

On a post-Merge network, you really only need a handful of peers and having that many will eat up CPU to maintain the connections. I'd recommend somewhere between ~10-25, especially for an in-sync node.

ibhagwan commented 1 year ago

On a post-Merge network, you really only need a handful of peers and having that many will eat up CPU to maintain the connections. I'd recommend somewhere between ~10-25, especially for an in-sync node.

That could be very helpful for my node indeed, ty!

non-fungible-nelson commented 1 year ago

can you provide the nimbus logs as well? I am trying to determine if nimbus is directing besu to resync or if this is something with our configuration of backwards sync.

ibhagwan commented 1 year ago

can you provide the nimbus logs as well? I am trying to determine if nimbus is directing besu to resync or if this is something with our configuration of backwards sync.

Unfortuntely I don't have the mainnet logs anymore as I moved back to Nethermind due to this issue (I run my services inside a tmux session so the logs aren't written to disk).

However, this is easily replicable on my stronger machine (64GB RAM) running goleri validators, here even blocks over 1s seem to trigger the backward sync, below are a few examples of the Besu log with it's corresponding nimbus logs (nothing suspecious AFAIK), lmk if you need any other data and I'll be happy to provide it.

My goerli machine:

####################################################################################################
#                                                                                                  #
# Besu 23.1.2                                                                                      #
#                                                                                                  #
# Configuration:                                                                                   #
# Network: Goerli                                                                                  #
# Network Id: 5                                                                                    #
# Data storage: Bonsai                                                                             #
# Sync mode: Snap                                                                                  #
# RPC HTTP APIs: ADMIN,ETH,NET,DEBUG,TXPOOL,WEB3                                                   #
# RPC HTTP port: 8545                                                                              #
# Engine APIs: ENGINE,ETH                                                                          #
# Engine port: 8551                                                                                #
# High spec configuration enabled                                                                  #
#                                                                                                  #
# Host:                                                                                            #
# Java: openjdk-java-17                                                                            #
# Maximum heap size: 8.00 GB                                                                       #
# OS: linux-x86_64                                                                                 #
# glibc: 2.36                                                                                      #
# Total memory: 62.78 GB                                                                           #
# CPU cores: 16                                                                                    #
#                                                                                                  #
####################################################################################################

(1): Besu, block 8,806,471 (other logs around these timestamps look totally normal)

2023-04-10 06:02:37.935-07:00 | vert.x-worker-thread-0 | INFO  | AbstractEngineNewPayload | Imported #8,806,471 / 129 tx / 16 ws / base fee 181.73 gwei / 23,372,823 (77.9%) gas / (0x57323bb8ce92bdaf76a0e01ed67cd649d37a3df2513f0cc717a1aaec576e4277) in 1.072s. Peers: 25
2023-04-10 06:04:01.233-07:00 | vert.x-worker-thread-0 | INFO  | BackwardSyncContext | Starting a new backward sync session
2023-04-10 06:04:01.564-07:00 | nioEventLoopGroup-3-7 | INFO  | BackwardSyncStep | Backward sync phase 1 of 2 completed, downloaded a total of 200 headers. Peers: 25
2023-04-10 06:04:03.039-07:00 | nioEventLoopGroup-3-7 | INFO  | BackwardSyncContext | Backward sync phase 2 of 2, 20.00% completed, imported 1 blocks of at least 5 (current head 8806472, target head 8806476). Peers: 25
2023-04-10 06:04:06.643-07:00 | ForkJoinPool.commonPool-worker-20 | INFO  | BackwardSyncContext | Backward sync phase 2 of 2 completed, imported a total of 5 blocks. Peers: 25
2023-04-10 06:04:06.643-07:00 | ForkJoinPool.commonPool-worker-20 | INFO  | BackwardSyncAlgorithm | Current backward sync session is done
2023-04-10 06:04:13.187-07:00 | vert.x-worker-thread-0 | INFO  | AbstractEngineNewPayload | Imported #8,806,477 / 70 tx / 16 ws / base fee 247.67 gwei / 6,261,496 (20.9%) gas / (0x27591f26468e4a9c9998e9e04245de0fe896cf90fb43df0daf6882b3c86bd2b8) in 0.268s. Peers: 25

(1): Nimbus slot 5385313 (same block as Besu):

INF 2023-04-10 06:02:12.001-07:00 Slot start                                 topics="beacnde" slot=5385311 epoch=168290 sync=synced peers=160 head=f6caa4d3:5385310 finalized=168288:f050bc47 delay=781us908ns
INF 2023-04-10 06:02:14.194-07:00 State replayed                             topics="chaindag" blocks=0 slots=1 current=a25f47ab:5385280@5385312 ancestor=96a67889:5385311 target=96a67889:5385311@5385312 ancestorStateRoot=f85c844b targetStateRoot=136742bb found=true assignDur=33ms390us661ns replayDur=400ms355us121ns
INF 2023-04-10 06:02:20.097-07:00 Slot end                                   topics="beacnde" slot=5385311 nextActionWait=5m3s902ms144us508ns nextAttestationSlot=5385337 nextProposalSlot=-1 syncCommitteeDuties=none head=96a67889:5385311
NOT 2023-04-10 06:02:23.911-07:00 Attestation failed to match head           topics="chaindag" epoch=168289 validator=<redacted>
INF 2023-04-10 06:02:24.002-07:00 Slot start                                 topics="beacnde" slot=5385312 epoch=168291 sync=synced peers=160 head=96a67889:5385311 finalized=168288:f050bc47 delay=2ms155us789ns
INF 2023-04-10 06:02:33.871-07:00 State replayed                             topics="chaindag" blocks=0 slots=32 current=96a67889:5385311@5385312 ancestor=96a67889:5385311@5385312 target=96a67889:5385311@5385344 ancestorStateRoot=136742bb targetStateRoot=76229e0d found=true assignDur=2us863ns replayDur=1s596ms398us898ns
INF 2023-04-10 06:02:34.382-07:00 Slot end                                   topics="beacnde" slot=5385312 nextActionWait=4m49s617ms153us890ns nextAttestationSlot=5385337 nextProposalSlot=-1 syncCommitteeDuties=none head=96a67889:5385311
INF 2023-04-10 06:02:34.393-07:00 Missed multiple heartbeats                 topics="libp2p gossipsub" heartbeat=GossipSub delay=1s232ms439us436ns hinterval=700ms
INF 2023-04-10 06:02:36.000-07:00 Slot start                                 topics="beacnde" slot=5385313 epoch=168291 sync=synced peers=160 head=96a67889:5385311 finalized=168288:f050bc47 delay=128us10ns
WRN 2023-04-10 06:02:37.809-07:00 Failed to exchange transition configuration topics="elmon" url=http://127.0.0.1:8551 err=Timeout
INF 2023-04-10 06:02:44.114-07:00 Slot end                                   topics="beacnde" slot=5385313 nextActionWait=4m39s885ms465us357ns nextAttestationSlot=5385337 nextProposalSlot=-1 syncCommitteeDuties=none head=30266594:5385313
INF 2023-04-10 06:02:48.000-07:00 Slot start                                 topics="beacnde" slot=5385314 epoch=168291 sync=synced peers=160 head=30266594:5385313 finalized=168289:86cbefc2 delay=310us471ns
INF 2023-04-10 06:02:56.101-07:00 Slot end                                   topics="beacnde" slot=5385314 nextActionWait=4m27s898ms464us549ns nextAttestationSlot=5385337 nextProposalSlot=-1 syncCommitteeDuties=none head=4bd58568:5385314
INF 2023-04-10 06:03:00.001-07:00 Slot start                                 topics="beacnde" slot=5385315 epoch=168291 sync=synced/opt peers=160 head=4bd58568:5385314 finalized=168289:86cbefc2 delay=1ms12us760ns
INF 2023-04-10 06:03:00.028-07:00 Execution client not in sync; skipping validator duties for now topics="beacval" slot=5385315 headSlot=5385314
INF 2023-04-10 06:03:00.091-07:00 Slot end                                   topics="beacnde" slot=5385315 nextActionWait=4m23s908ms42us557ns nextAttestationSlot=5385337 nextProposalSlot=-1 syncCommitteeDuties=none head=4bd58568:5385314

(2): Besu block 8,802,737

2023-04-09 14:02:14.341-07:00 | vert.x-worker-thread-0 | INFO  | AbstractEngineForkchoiceUpdated | VALID for fork-choice-update: head: 0xa68c9fccae334588bce4db5229b8a04b5bc5d4c4ede1c52966e9bd83aa9e202b, finalized: 0xbf6fab88a694ec8ea73cb75cf8433c15605a913287b3bfd4a076cc8ffc1a0457, safeBlockHash: 0x626e9c11eafb9107dfe842af338263a3f21ba9982fb4f64c7b7824246f90f913
2023-04-09 14:02:27.672-07:00 | vert.x-worker-thread-0 | INFO  | AbstractEngineNewPayload | Imported #8,802,737 / 285 tx / 16 ws / base fee 5.04 gwei / 29,973,156 (99.9%) gas / (0x7a56aed200372621f63c9b38d29702a898bef1c5c3c5da21dbfb32b4e7418bb7) in 1.385s. Peers: 25
2023-04-09 14:03:36.688-07:00 | vert.x-worker-thread-0 | INFO  | BackwardSyncContext | Starting a new backward sync session
2023-04-09 14:03:36.988-07:00 | nioEventLoopGroup-3-1 | INFO  | BackwardSyncStep | Backward sync phase 1 of 2 completed, downloaded a total of 200 headers. Peers: 25
2023-04-09 14:03:40.369-07:00 | nioEventLoopGroup-3-2 | INFO  | BackwardSyncContext | Backward sync phase 2 of 2, 20.00% completed, imported 1 blocks of at least 5 (current head 8802738, target head 8802742). Peers: 25
2023-04-09 14:03:42.694-07:00 | nioEventLoopGroup-3-2 | INFO  | BackwardSyncContext | Backward sync phase 2 of 2 completed, imported a total of 5 blocks. Peers: 25
2023-04-09 14:03:42.696-07:00 | ForkJoinPool.commonPool-worker-24 | INFO  | BackwardSyncAlgorithm | Current backward sync session is done
2023-04-09 14:03:51.060-07:00 | vert.x-worker-thread-0 | INFO  | AbstractEngineNewPayload | Imported #8,802,743 / 211 tx / 16 ws / base fee 5.11 gwei / 29,858,880 (99.5%) gas / (0x6e2f01a824d7db8cd946c419e6618af438df6b5fcb39c4702f7d9bcb9e5875b0) in 1.249s. Peers: 25
2023-04-09 14:04:26.437-07:00 | vert.x-worker-thread-0 | INFO  | AbstractEngineNewPayload | Imported #8,802,744 / 188 tx / 16 ws / base fee 5.74 gwei / 27,804,062 (92.7%) gas / (0x4647c8b12117e9d92cc1fb87ccf5410450ca9387942c8e198ce7b50b77c739c9) in 1.204s. Peers: 25

(2): Nimbus slot 5380512

INF 2023-04-09 14:08:36.000-07:00 Slot start                                 topics="beacnde" slot=5380543 epoch=168141 sync=synced peers=160 head=640ba7a9:5380542 finalized=168139:8e390e2a delay=140us448ns
INF 2023-04-09 14:08:44.075-07:00 Slot end                                   topics="beacnde" slot=5380543 nextActionWait=39s924ms968us963ns nextAttestationSlot=5380547 nextProposalSlot=-1 syncCommitteeDuties=none head=640ba7a9:5380542
INF 2023-04-09 14:08:48.001-07:00 Slot start                                 topics="beacnde" slot=5380544 epoch=168142 sync=synced peers=160 head=640ba7a9:5380542 finalized=168139:8e390e2a delay=1ms26us161ns
INF 2023-04-09 14:08:48.516-07:00 State replayed                             topics="chaindag" blocks=0 slots=2 current=294895ca:5380512@5380544 ancestor=640ba7a9:5380542 target=640ba7a9:5380542@5380544 ancestorStateRoot=8ed5c7bd targetStateRoot=053e6b01 found=true assignDur=33ms555us218ns replayDur=437ms709us426ns
INF 2023-04-09 14:08:57.713-07:00 State replayed                             topics="chaindag" blocks=0 slots=32 current=640ba7a9:5380542@5380544 ancestor=640ba7a9:5380542@5380544 target=640ba7a9:5380542@5380576 ancestorStateRoot=053e6b01 targetStateRoot=90daed42 found=true assignDur=2us893ns replayDur=1s640ms752us963ns
INF 2023-04-09 14:08:58.228-07:00 Slot end                                   topics="beacnde" slot=5380544 nextActionWait=25s771ms79us297ns nextAttestationSlot=5380547 nextProposalSlot=-1 syncCommitteeDuties=none head=640ba7a9:5380542
INF 2023-04-09 14:08:58.240-07:00 Missed multiple heartbeats                 topics="libp2p gossipsub" heartbeat=GossipSub delay=1s277ms511us46ns hinterval=700ms
INF 2023-04-09 14:09:00.007-07:00 Slot start                                 topics="beacnde" slot=5380545 epoch=168142 sync=synced peers=160 head=640ba7a9:5380542 finalized=168139:8e390e2a delay=7ms408us720ns
NOT 2023-04-09 14:09:03.482-07:00 Attestation failed to match head           topics="chaindag" epoch=168140 validator=<redacted>
INF 2023-04-09 14:09:03.524-07:00 State replayed                             topics="chaindag" blocks=0 slots=2 current=640ba7a9:5380542@5380543 ancestor=640ba7a9:5380542@5380543 target=640ba7a9:5380542@5380545 ancestorStateRoot=ac7356de targetStateRoot=2e016ed6 found=true assignDur=3us154ns replayDur=949ms8us648ns
INF 2023-04-09 14:09:08.260-07:00 Slot end                                   topics="beacnde" slot=5380545 nextActionWait=15s739ms310us276ns nextAttestationSlot=5380547 nextProposalSlot=-1 syncCommitteeDuties=none head=8054a3dd:5380545
INF 2023-04-09 14:09:12.001-07:00 Slot start                                 topics="beacnde" slot=5380546 epoch=168142 sync=synced peers=160 head=8054a3dd:5380545 finalized=168140:ad2fed16 delay=991us809ns
INF 2023-04-09 14:09:20.106-07:00 Slot end                                   topics="beacnde" slot=5380546 nextActionWait=3s893ms243us122ns nextAttestationSlot=5380547 nextProposalSlot=-1 syncCommitteeDuties=none head=200a7e6b:5380546

(3): Besu block 8,798,617

2023-04-08 20:19:37.818-07:00 | vert.x-worker-thread-0 | INFO  | AbstractEngineNewPayload | Imported #8,798,616 / 133 tx / 16 ws / base fee 100.20 gwei / 8,857,883 (29.5%) gas / (0x8b63a3a8c508a0e556e419e168cae94664180b2e9a3bc482d5918aa5f2a28012) in 0.417s. Peers: 25
2023-04-08 20:19:51.794-07:00 | vert.x-worker-thread-0 | INFO  | AbstractEngineNewPayload | Imported #8,798,617 / 161 tx / 16 ws / base fee 95.07 gwei / 29,996,593 (100.0%) gas / (0x777fc03d53f438a318716557122ad7b927986f268f5eb3e6ef59ad29d037f07d) in 1.454s. Peers: 25
2023-04-08 20:19:53.364-07:00 | vert.x-worker-thread-0 | INFO  | AbstractEngineForkchoiceUpdated | VALID for fork-choice-update: head: 0x777fc03d53f438a318716557122ad7b927986f268f5eb3e6ef59ad29d037f07d, finalized: 0xd202d7ce53811b02ee2f6323a10611fb1c63feb656db46692945a7b8ad0311c6, safeBlockHash: 0x5938a234202a46bc467ac6c7eb878d1270ea33785caf94382ea3352a7f65e108
2023-04-08 20:21:01.603-07:00 | vert.x-worker-thread-0 | INFO  | BackwardSyncContext | Starting a new backward sync session
2023-04-08 20:21:02.238-07:00 | nioEventLoopGroup-3-8 | INFO  | BackwardSyncStep | Backward sync phase 1 of 2 completed, downloaded a total of 192 headers. Peers: 25
2023-04-08 20:21:03.432-07:00 | nioEventLoopGroup-3-8 | INFO  | BackwardSyncContext | Backward sync phase 2 of 2, 20.00% completed, imported 1 blocks of at least 5 (current head 8798618, target head 8798622). Peers: 25
2023-04-08 20:21:06.826-07:00 | nioEventLoopGroup-3-8 | INFO  | BackwardSyncContext | Backward sync phase 2 of 2 completed, imported a total of 5 blocks. Peers: 25
2023-04-08 20:21:06.829-07:00 | ForkJoinPool.commonPool-worker-5 | INFO  | BackwardSyncAlgorithm | Current backward sync session is done
2023-04-08 20:21:38.597-07:00 | vert.x-worker-thread-0 | INFO  | AbstractEngineNewPayload | Imported #8,798,623 / 247 tx / 16 ws / base fee 103.18 gwei / 29,998,288 (100.0%) gas / (0xa1ec4480243faaa12f8f30858aa81d6121fb37747020e3d343e5b5c73dd9769f) in 1.410s. Peers: 25
2023-04-08 20:21:50.176-07:00 | vert.x-worker-thread-0 | INFO  | AbstractEngineNewPayload | Imported #8,798,624 / 108 tx / 16 ws / base fee 116.08 gwei / 16,688,362 (55.6%) gas / (0x47dab1c059e324caaae6bf17b355d12a72632e94b17fb89206e2e52771026795) in 0.628s. Peers: 25
2023-04-08 20:22:02.319-07:00 | vert.x-worker-thread-0 | INFO  | AbstractEngineNewPayload | Imported #8,798,625 / 78 tx / 16 ws / base fee 117.71 gwei / 11,499,324 (38.3%) gas / (0xfcc3bd7f4d549bc62190aee3f75ad835f671215ea7ad2cec90bcc551ed248036) in 0.482s. Peers: 25

(3): Nimbus slot 5375199

INF 2023-04-08 20:19:36.000-07:00 Slot start                                 topics="beacnde" slot=5375198 epoch=167974 sync=synced peers=159 head=f947b406:5375197 finalized=167972:5ee2a2bd delay=283us502ns
INF 2023-04-08 20:19:44.106-07:00 Slot end                                   topics="beacnde" slot=5375198 nextActionWait=27s893ms671us768ns nextAttestationSlot=5375201 nextProposalSlot=-1 syncCommitteeDuties=none head=5da2ff83:5375198
INF 2023-04-08 20:19:48.000-07:00 Slot start                                 topics="beacnde" slot=5375199 epoch=167974 sync=synced peers=158 head=5da2ff83:5375198 finalized=167972:5ee2a2bd delay=894us991ns
WRN 2023-04-08 20:19:51.405-07:00 Failed to exchange transition configuration topics="elmon" url=http://127.0.0.1:8551 err=Timeout
INF 2023-04-08 20:19:52.655-07:00 State replayed                             topics="chaindag" blocks=0 slots=1 current=0bb13188:5375168@5375200 ancestor=7397aa62:5375199 target=7397aa62:5375199@5375200 ancestorStateRoot=2aa3ecf9 targetStateRoot=8bc483fd found=true assignDur=33ms528us672ns replayDur=407ms898us623ns
INF 2023-04-08 20:19:56.102-07:00 Slot end                                   topics="beacnde" slot=5375199 nextActionWait=15s897ms66us236ns nextAttestationSlot=5375201 nextProposalSlot=-1 syncCommitteeDuties=none head=7397aa62:5375199
INF 2023-04-08 20:20:00.000-07:00 Slot start                                 topics="beacnde" slot=5375200 epoch=167975 sync=synced peers=159 head=7397aa62:5375199 finalized=167972:5ee2a2bd delay=267us296ns
INF 2023-04-08 20:20:08.322-07:00 Slot end                                   topics="beacnde" slot=5375200 nextActionWait=3s677ms92us22ns nextAttestationSlot=5375201 nextProposalSlot=-1 syncCommitteeDuties=none head=abf39360:5375200
INF 2023-04-08 20:20:12.000-07:00 Slot start                                 topics="beacnde" slot=5375201 epoch=167975 sync=synced/opt peers=158 head=abf39360:5375200 finalized=167973:e2ce79bc delay=188us16ns
INF 2023-04-08 20:20:12.027-07:00 Execution client not in sync; skipping validator duties for now topics="beacval" slot=5375201 headSlot=5375200
INF 2023-04-08 20:20:12.091-07:00 Slot end                                   topics="beacnde" slot=5375201 nextActionWait=2m59s908ms810us448ns nextAttestationSlot=5375216 nextProposalSlot=-1 syncCommitteeDuties=none head=abf39360:5375200
INF 2023-04-08 20:20:24.001-07:00 Slot start                                 topics="beacnde" slot=5375202 epoch=167975 sync=synced/opt peers=159 head=abf39360:5375200 finalized=167973:e2ce79bc delay=1ms23us910ns
INF 2023-04-08 20:20:24.028-07:00 Execution client not in sync; skipping validator duties for now topics="beacval" slot=5375202 headSlot=5375200
INF 2023-04-08 20:20:24.091-07:00 Slot end                                   topics="beacnde" slot=5375202 nextActionWait=2m47s908ms946us775ns nextAttestationSlot=5375216 nextProposalSlot=-1 syncCommitteeDuties=none head=abf39360:5375200
non-fungible-nelson commented 1 year ago

Thanks for this report - we have noticed this on our own nodes on sepolia and will work to debug (or blame on nimbus ;0)

ibhagwan commented 1 year ago

@non-fungible-nelson, siince I upgraded my goerli node to v23.4.0 (just about when it came out) I didn't have a single "Starting a new backward sync session" message aside from the initial sync when starting the service - was anything changed in the new version that should've affected this issue?

As of now, this issue seems to be solved.

non-fungible-nelson commented 1 year ago

this is a problem with the way sync is handled between besu and certain CLs - I am going to keep this open so we can track against our new post-Merge sync strategy.

ibhagwan commented 1 year ago

this is a problem with the way sync is handled between besu and certain CLs - I am going to keep this open so we can track against our new post-Merge sync strategy.

Another data point for you @non-fungible-nelson, after seeing the new version works great on my Goerli node I migrated back my mainnet node from nethermind to besu and I'm experiencing the same great performance, hope I'm not jinxing it but it's been over 2 days so far and not a single "new backward sync" message :-)

From my end, this issue is definitely solved, attestation eff is also hovering between 97-99% which is great!