Open tmeinlschmidt opened 6 months ago
Identical problem on mine.
Feb 21 19:03:26 erigon[16423]: [INFO] [02-21|19:03:26.435] RPC Daemon notified of new headers from=19279473 to=19279474 hash=0xb5aa8822e5f62e54e2d9b110fe360e358711ead6346b05504c08d521ce4a19b9 header sending=15.913µs log sending=257ns
Feb 21 19:03:26 erigon[16423]: [INFO] [02-21|19:03:26.436] head updated hash=0xb5aa8822e5f62e54e2d9b110fe360e358711ead6346b05504c08d521ce4a19b9 number=19279474
Feb 21 19:03:26 erigon[16423]: [INFO] [02-21|19:03:26.485] [snapshots] Prune Blocks to=19150000 limit=100
Feb 21 19:03:26 erigon[16423]: [WARN] [02-21|19:03:26.486] [snapshots] retire blocks err="DumpBlocks: DumpBodies: header missed in db: block_num=19150000, hash=9c87e585613f7ea1ebd89ec92a7bdc7e2ceaa5f8843c4237aaa8b44030532f19"
Feb 21 19:03:37 erigon[16423]: [INFO] [02-21|19:03:37.591] [NewPayload] Handling new payload height=19279475 hash=0xf9df32dda0808cdbd94652bd036887463d3d1d6c2b41540a92b01d9b1694c94b
Feb 21 19:03:41 erigon[16423]: [INFO] [02-21|19:03:41.866] [updateForkchoice] Fork choice update: flushing in-memory state (built by previous newPayload)
Feb 21 19:03:42 erigon[16423]: [INFO] [02-21|19:03:42.201] RPC Daemon notified of new headers from=19279474 to=19279475 hash=0xf9df32dda0808cdbd94652bd036887463d3d1d6c2b41540a92b01d9b1694c94b header sending=16.059µs log sending=493ns
Feb 21 19:03:42 erigon[16423]: [INFO] [02-21|19:03:42.202] head updated hash=0xf9df32dda0808cdbd94652bd036887463d3d1d6c2b41540a92b01d9b1694c94b number=19279475
Feb 21 19:03:42 erigon[16423]: [INFO] [02-21|19:03:42.332] [snapshots] Prune Blocks to=19150000 limit=100
Feb 21 19:03:42 erigon[16423]: [WARN] [02-21|19:03:42.333] [snapshots] retire blocks err="DumpBlocks: DumpBodies: header missed in db: block_num=19150000, hash=9c87e585613f7ea1ebd89ec92a7bdc7e2ceaa5f8843c4237aaa8b44030532f19"
Feb 21 19:03:48 erigon[16423]: [INFO] [02-21|19:03:48.996] [NewPayload] Handling new payload height=19279476 hash=0xe88d3373a340cbb6c951f1f9d75441fc136aa7d61429854d69871607146e4f98
Feb 21 19:03:50 erigon[16423]: [INFO] [02-21|19:03:50.574] [updateForkchoice] Fork choice update: flushing in-memory state (built by previous newPayload)
Feb 21 19:03:50 erigon[16423]: [INFO] [02-21|19:03:50.806] RPC Daemon notified of new headers from=19279475 to=19279476 hash=0xe88d3373a340cbb6c951f1f9d75441fc136aa7d61429854d69871607146e4f98 header sending=16.048µs log sending=211ns
Feb 21 19:03:50 erigon[16423]: [INFO] [02-21|19:03:50.806] head updated hash=0xe88d3373a340cbb6c951f1f9d75441fc136aa7d61429854d69871607146e4f98 number=19279476
Feb 21 19:03:50 erigon[16423]: [INFO] [02-21|19:03:50.884] [snapshots] Prune Blocks to=19150000 limit=100
Feb 21 19:03:50 erigon[16423]: [WARN] [02-21|19:03:50.885] [snapshots] retire blocks err="DumpBlocks: DumpBodies: header missed in db: block_num=19150000, hash=9c87e585613f7ea1ebd89ec92a7bdc7e2ceaa5f8843c4237aaa8b44030532f19"
Feb 21 19:04:01 erigon[16423]: [INFO] [02-21|19:04:01.428] [NewPayload] Handling new payload height=19279477 hash=0x8469b4deb0bf718ceae415fcc9557ab14cf744ef98c68d383721adda7cd02593
Erigon version: erigon version 2.58.0-9731f716
temporary fix: --snap.stop permanent fix: integration stage_header --reset
interesting is, we have two nodes and this affects only one one of them. Both the same data, the same version. Weird.
same problem in my node also i have "can't build missed indices" error.
integration stage_header --reset doesn't fix problem.
Erigon Version: 2.58.1
Logs:
INFO[02-23|00:04:40.528] [1/12 Snapshots] Requesting downloads
INFO[02-23|00:04:42.225] [snapshots:download] Blocks Stat blocks=19150k indices=19150k alloc=4.7GB sys=5.6GB
INFO[02-23|00:04:42.225] [snapshots:missed-idx] Blocks Stat blocks=19150k indices=19150k alloc=4.7GB sys=5.6GB
EROR[02-23|00:04:42.782] Could not start execution service err="[1/12 Snapshots] can't build missed indices: TransactionsIdx: at=19150000-19160000, pre index building, expect: 1494586, got 7297745"
INFO[02-23|00:05:03.608] [Sentinel] Sentinel started app=caplin enr=enr:-Ly4QEczJkMibOYHW1rG2JP6WMCQT2l4kUoogbB55cIOseQSB3kgQ6ZSQpaMO_j8-yAR5qgAjaaUmQGt4HT0rcA6JIMBh2F0dG5ldHOIAAAAAAAAAACEZXRoMpC7pNqWBAAAAAAAAAAABB0AgmlkgnY0gmlwhH8AAAGJc2VjcDI1NmsxoQNaDSrBk_V6BSTZn7J1_ZV8-pzmeV-33HgUthW0umdJtohzeW5jbmV0cwCDdGNwgg-hg3VkcIIPoA
INFO[02-23|00:05:03.608] Static peers len=0
INFO[02-23|00:05:03.608] Started Ethereum 2.0 Gossip Service app=caplin
INFO[02-23|00:05:28.806] [Caplin] starting clstages loop app=caplin
INFO[02-23|00:05:28.806] Starting downloading History app=caplin stage=DownloadHistoricalBlocks from=8485120
INFO[02-23|00:05:30.033] Backfilling finished
INFO[02-23|00:05:33.811] Ready to insert history, waiting for sync cycle to finish app=caplin stage=DownloadHistoricalBlocks
INFO[02-23|00:05:33.815] [Caplin] Forward Sync app=caplin stage=ForwardSync from=8485120 to=8485225
INFO[02-23|00:05:48.839] Adding light client update period=1035
INFO[02-23|00:06:03.608] P2P app=caplin peers=30
INFO[02-23|00:06:06.055] [4/12 Execution] Completed on block=19286611
INFO[02-23|00:06:18.251] RPC Daemon notified of new headers from=19286596 to=19286611 hash=0xccda8961b36855dc8354cbdfa796bde43a9664a3f46f71facac2a6a51a6a019e header sending=32.13µs log sending=220ns
INFO[02-23|00:06:18.251] head updated hash=0xccda8961b36855dc8354cbdfa796bde43a9664a3f46f71facac2a6a51a6a019e number=19286611
INFO[02-23|00:06:18.251] [snapshots] Prune Blocks to=19150000 limit=100
WARN[02-23|00:06:18.252] [snapshots] retire blocks err="DumpBlocks: DumpBodies: header missed in db: block_num=19150000, hash=9c87e585613f7ea1ebd89ec92a7bdc7e2ceaa5f8843c4237aaa8b44030532f19"
INFO[02-23|00:06:43.830] [4/12 Execution] Completed on block=19286614
INFO[02-23|00:06:46.528] RPC Daemon notified of new headers from=19286611 to=19286614 hash=0xcd98e735fce6d47e68c96986c67e01968fefac9815d3b129c04fd19a14d4f7ae header sending=38.012µs log sending=221ns
INFO[02-23|00:06:46.528] head updated hash=0xcd98e735fce6d47e68c96986c67e01968fefac9815d3b129c04fd19a14d4f7ae number=19286614
INFO[02-23|00:06:46.529] [snapshots] Prune Blocks to=19150000 limit=100
WARN[02-23|00:06:46.530] [snapshots] retire blocks err="DumpBlocks: DumpBodies: header missed in db: block_num=19150000, hash=9c87e585613f7ea1ebd89ec92a7bdc7e2ceaa5f8843c4237aaa8b44030532f19"
INFO[02-23|00:07:03.609] P2P app=caplin peers=14
INFO[02-23|00:07:25.947] [p2p] GoodPeers eth68=3 eth66=2 eth67=6
INFO[02-23|00:07:36.540] [txpool] stat pending=729 baseFee=506 queued=2631 alloc=10.1GB sys=12.3GB
INFO[02-23|00:08:03.614] P2P app=caplin peers=17
INFO[02-23|00:09:03.608] P2P app=caplin peers=22
INFO[02-23|00:09:10.881] [4/12 Execution] Completed on block=19286626
I've noticed we get this issue pretty consistently on Goerli when stopping, starting a node. For example on stopping we see this WARN
:
Feb 24 05:19:28 ip-10-0-65-9 erigon[9174]: [WARN] [02-24|05:19:28.340] [snapshots] retire blocks err="DumpBlocks: compress: context canceled"
If we see that warning, then the next startup we always see:
Feb 24 05:38:21 ip-10-0-65-9 erigon[57376]: [WARN] [02-24|05:38:21.672] [1/12 Snapshots] Some blocks are not in snapshots and not in db max_in_snapshots=8499999 min_in_db=10446000
Feb 24 05:38:21 ip-10-0-65-9 erigon[57376]: [INFO] [02-24|05:38:21.672] [snapshots:download] Blocks Stat blocks=8500k indices=8500k alloc=1.1GB sys=1.7GB
Feb 24 05:38:21 ip-10-0-65-9 erigon[57376]: [INFO] [02-24|05:38:21.678] [snapshots] Prune Blocks to=8500000 limit=100
Feb 24 05:38:21 ip-10-0-65-9 erigon[57376]: [WARN] [02-24|05:38:21.679] [snapshots] retire blocks err="DumpBlocks: DumpBodies: header missed in db: block_num=8500000>
Permanent fix:
integration stage_header --reset
From https://github.com/ledgerwatch/erigon/issues/9479#issuecomment-1958530855
This indeed worked for me. It took way longer than I expected—about 5 days, on a non-networked (local) NVMe SSD—but it does appear to have succeeded.
found 1 reason of it https://github.com/ledgerwatch/erigon/pull/9555 If there is a gap in snapshots (for example if file manually deleted): erigon will ignore all snapshots after gap and will try to create snapshots starting from block "before gap" and of-course doesn't have such blocks (erigon already created older snapshots in the past and removed such block). So, there is no bug in prune/retirement code of blocks, but just gap in snap files. I will print it to debug logs now and maybe we will do something else about it in future.
I had the same issue.
Using the -snap.stop
temporary fix seemed to work.
Thank you @AskAlexSharov for your suggestions and also a great project 🙌
System information
Erigon version:
erigon version 2.58.0-9731f716
OS & Version: Linux
Commit hash: 9731f716
Erigon Command (with flags/config):
Consensus Layer: prysm
Consensus Layer Command (with flags/config):
Chain/Network: mainnet
Expected behaviour
no errors in log
Actual behaviour
DumpBlocks: DumpBodies: header missed in db: block_num=19000000