Closed howardrcc closed 1 year ago
I experienced same issue with 22.4.4. There are some fixes to mitigate this issue included in the 22.7.0-rc2 release of Besu. I discarded the database and retesting the sync from scratch using 22.7.0-rc2. Note when you see the 'World State Root does not match expected value' exception, your database is corrupted and cannot be recovered.
Hope you can do the same. The more testing the better.
Thanks for the tips. I tried 22.7.0-snapshot (latest) and the db got corrupted again.... Gonna try -rc2 now. funny thing it kind of synced (somewhere until 15.200.000) . Then I saw number of blocks behind going up again (in grafana). Restarted besu then the number of blocks starting to decrease again. 15 minutes later i get the world state root not match error. I think that at some point it switched from snap sync to full sync in the logs. Don't know if that's a clue but I'll guess we'll know more soon.
after this I'm switching to fastsync :-)
@howardrcc Did you delete your data fir and resync from empty with 22.7? That would be required.
I did delete the data dir and also deleted after trying the (non-rc2) 22.7.0 snapshot. It's still syncing for a day or two I think. I'll report back in a few days.
another fail/corrupted when database was more or less synced. Found out the db got corrupted while grafana reporting it was running about 12000 blocks behind. So it's seems to be happening when all blocks are imported and when new blocks/ the worldstate is being calculated or retrieved.
closing as fixed by #4409
I would like to reopen this issue. It cropped up today at around 12:29. Besu v23.10.2 running on a Debian machine, 32GB RAM, 536GB free out of 1.8T SSD, validating for about a year without problems. To free up space I have resynced with the checkpoint option (see the attached sync.log) sync.log
Everything was going fine until today when I got missed attestation messages and checked the PC. This is the log from today (with quite a few lines delete, since the file would otherwise be too big) besu.log
I will now reboot the machine and, if it does not work, will delete the Besu data and resync. Will keep you updated.
This is the besu.service
file:
[Unit]
Description=Besu Execution Client (Mainnet)
Wants=network-online.target
After=network-online.target
[Service]
User=eth1
Group=eth1
Type=simple
Restart=always
RestartSec=5
TimeoutStopSec=180
Environment="JAVA_OPTS=-Xmx8g"
ExecStart=/srv/besu/besu/bin/besu \
--network=mainnet \
--sync-mode=X_CHECKPOINT \
--data-path=/mnt/rpdata/besu_data \
--data-storage-format=BONSAI \
--engine-jwt-secret=/secrets/jwtsecret \
--rpc-http-enabled=true \
--rpc-http-port=8545 \
--max-peers=40 \
--metrics-enabled \
--Xplugin-rocksdb-high-spec-enabled
[Install]
WantedBy=multi-user.target
UPDATE1:
Reboot does not fix it. post_reboot.log
I also got this error at around 11:30 GMT today. Reboot did not fix
I forgot to mention that my localtime is CET, thus GMT+1. So apparently it happened more or less at the same time. Which was your last successfully synced block? Mine is apparently 18,947,892
Jan 06 12:29:13 rp besu[2257214]: 2024-01-06 12:29:13.134+01:00 | vert.x-worker-thread-0 | INFO | AbstractEngineNewPayload | Imported #18,947,891 / 125 tx / 16 ws / base fee 12.21 gwei / 11,071,834 (36.9%) gas / (0xa
9ddaf079c9b5ed5baece459d7b7e2866983566bbb08f92c3134798eb0e67597) in 0.333s. Peers: 40
Jan 06 12:29:26 rp besu[2257214]: 2024-01-06 12:29:26.246+01:00 | vert.x-worker-thread-0 | INFO | AbstractEngineNewPayload | Imported #18,947,892 / 187 tx / 16 ws / base fee 11.81 gwei / 17,562,696 (58.5%) gas / (0xc
1141d53490d3b8ba41c88ef6db4256fbddc77db8cbb03afaeb5b94e52ee42f3) in 0.496s. Peers: 40
Jan 06 12:29:36 rp besu[2257214]: 2024-01-06 12:29:36.735+01:00 | vert.x-worker-thread-0 | ERROR | AbstractBlockProcessor | failed persisting block
Jan 06 12:29:36 rp besu[2257214]: java.lang.RuntimeException: World State Root does not match expected value, header 0xf9029a6ce0a53e912643642e3458967dd2e38edd60d77e312156d8b1c432a433 calculated 0xf26bfa5c260e327582633c0c77d8dbe900a4877ab57e067ec814acd81d4b98ba
Same block for me. Devs in the discord say they are working on it.
Invalid block 18947893 (0xb41fc83658a61504771fa9904d67f89b57c531cb902335358ff94c0680f05f07): Optional[World State Root does not match expected value, header 0xf9029a6ce0a53e912643642e3458967dd2e38edd60d77e312156d8b1c432a433 calculated 0xf26bfa5c260e327582633c0c77d8dbe900a4877ab57e067ec814acd81d4b98ba]
I have the same issue and at the same time as @JanKalin . Just started this morning, but maybe it has been going on longer until I ran out of free credits on infura backup. I'm not as tech savvy, but I can post the same info JanKalin did above.
besu service file:
[Unit]
Description=Besu Execution Client (Mainnet)
Wants=network-online.target
After=network-online.target
[Service]
User=besu
Group=besu
Type=simple
Restart=always
RestartSec=5
ExecStart=/usr/local/bin/besu/bin/besu \
--network=mainnet \
--sync-mode=X_SNAP \
--data-path=/var/lib/besu \
--data-storage-format=BONSAI \
--engine-jwt-secret=/var/lib/jwtsecret/jwt.hex \
--metrics-enabled
[Install]
WantedBy=multi-user.target
And... I'll wait to supply more info since a lot of people seem to have the same problem.
More discussion of the recent issue in https://github.com/hyperledger/besu/issues/6357
Could you try to upgrade to the version 23.10.3 ?
I upgraded from 23.10.1 to pre-release 23.10.3 and am currently doing the backward sync. We'll see if it completes.
Backward sync worked! besu is up and running after the upgrade to 23.10.3
I recommend to follow this https://discord.com/channels/905194001349627914/938504958909747250/1193223408058630295 for people having the problem
I recommend to follow this https://discord.com/channels/905194001349627914/938504958909747250/1193223408058630295 for people having the problem
This link doesn't go anywhere, I assume because I'm not in the server it was posted? Can you share whatever is in there here in the Github discussion?
Found the solution I think in the Ethstaker Discord. (Link to message.)
Besu world state issue recap.
Around
2024-01-06T11:29:36 UTC
, Besu started reporting errors like this one:World State Root does not match expected value, header 0xf9029a6ce0a53e912643642e3458967dd2e38edd60d77e312156d8b1c432a433 calculated 0xf26bfa5c260e327582633c0c77d8dbe900a4877ab57e067ec814acd81d4b98ba followed by many Invalid new payload messages, with the effect that Besu is not in sync and the CL client is stuck too and not able to publish attestations or blocks.
The cause of this issue are still under investigation, but after collecting feedback from user and testing some options, there are some workarounds to recover you node, until a proper fix is released.
Recovering options: If you are still running Besu with version 23.10.2 or lower, then upgrade to 23.10.3 https://github.com/hyperledger/besu/releases/tag/untagged-8a32de7a7280bb003d79 If you are already on version 23.10.3, then keep Besu running, and operate on your CL client, removing its beacon db and restarting, this will trigger a backward sync in Besu that could help healing the worldstate. Instructions on how to delete the beacon db, depends on your client, for example for Teku you need to remove the
beacon
folder in the Teku data path, for other client refer to their documentation. If after the backward sync session the issue is still there then try point 3. If that fails, try the same process again, but force Besu to set its head back to a block prior to the problem block. This will require API access, and for this to work you have to enabled the DEBUG API:
--rpc-http-api=ETH,NET,WEB3,DEBUG
see https://besu.hyperledger.org/public-networks/reference/cli/options#rpc-http-api for more details, changes to that param will require Besu to be restarted.
Once Besu is up and you have access to DEBUG apis, you can shutdown your CL client and clear its DB. With Besu still up, set its head back to block 18947892 (0x1211F34 in hex) using the following API call:
curl -X POST --data '{"jsonrpc":"2.0","method":"debug_setHead","params":["0x1211F34"],"id":1}' http://localhost:8545
Then restart your CL. Your CL should optimistically sync the beacon chain using a checkpoint from the network, and this will trigger a backward sync in Besu.If after the backward sync session the issue is still there then try point 4.
If previous options have not worked then, you can try to resync only the world state, it can takes some hours, but it is faster from a resync from scratch
curl -X POST --data '{"jsonrpc":"2.0","method":"debug_resyncWorldState","params":[],"id":1}' http://localhost:8545
We have just created a hotfix. Even if your node works well it is better to update Besu to avoid future issues. If your node already works well, no need to do anything else, just an update and use hotfix
Thanks to devs for quick action! BTW, I can attest (pun intended) that a full resync of the chain works with v23.10.3, but will update to hotfix version.
Description
Hi there. I successfully synced to mainnet a few day using snapsync. I am getting Errors now. Here is a snippet of the log (I'm not seeing other usefull messages in TRACE/DEBUG). I'm not sure what is happing. I tried restart and upgrading besu as suggested in other Issues but thats not working for me.
Any suggestions?
Versions (Add all that apply)
Software version: [
besu --version [22.4.4] and tried 22.7.0 snapshot
]Java version: [
java -version openjdk 11
]OS Name & Version: [
NixOS 22.11
]Kernel Version: [
Linux home 5.15.49
]Docker Version: [
docker version 20.10.17
]Additional Information (Add any of the following or anything else that may be relevant)
Besu setup info - Mainnet/Snapsync Datastorage: Bonsai, using docker
System info - 16GB, I5-3570K