erigontech / erigon

Ethereum implementation on the efficiency frontier https://erigon.gitbook.io

Panic while syncing a new sepolia node #11742

Closed wmitsuda closed 2 months ago

wmitsuda commented 2 months ago

commit info:

[INFO] [08-24|18:21:31.365] Build info                               git_branch=main git_tag=v3.0.0-alpha2-140-g85c35dc315 git_commit=85c35dc31524b812e5fce9361f69a0a2b5e40dde

I started a new sync yesterday; it panicked as soon as the download finished:

[INFO] [08-25|11:07:31.018] [mem] memory stats                       Rss=27.0GB Size=0B Pss=27.0GB SharedClean=2.7MB SharedDirty=0B PrivateClean=22.3GB PrivateDirty=4.7GB Referenced=12.7GB Anonymous=4.6GB Swap=505.1MB alloc=3.2GB sys=8.8GB
[INFO] [08-25|11:07:33.087] P2P                                      app=caplin peers=6
[INFO] [08-25|11:07:50.785] [1/6 OtterSync] downloading header-chain progress="(438/438 files) 99.99% 457.8GB/457.8GB" download-rate=13.9MB/s time-left=0hrs:0m total-time=15h3m20s flush=14.0MB/s hash=13.9MB/s complete=14.0MB/s upload=0B/s peers=0 files=438 no-metadata=0 connections=0 alloc=3.3GB sys=8.8GB
[INFO] [08-25|11:08:10.785] [1/6 OtterSync] header-chain download finished time=15h3m40.000626839s
[INFO] [08-25|11:08:23.034] [Caplin-Blocks] Flushed buffer file      name=erigon-sortable-buf-3019815522
[INFO] [08-25|11:08:33.088] P2P                                      app=caplin peers=5
[INFO] [08-25|11:09:33.087] P2P                                      app=caplin peers=6
[EROR] [08-25|11:10:13.053] [txpool] process batch remote txs        err="panic: runtime error: index out of range [9297754086660752] with length 1\n[pool.go:519 panic.go:785 panic.go:121 elias_fano.go:171 elias_fano.go:187 btree_index.go:926 bps_tree.go:385 btree_index.go:1016 domain.go:764 domain.go:1447 domain.go:1649 aggregator.go:2037 kv_temporal.go:208 dummy.go:44 dummy.go:69 pool.go:2420 pool.go:936 pool.go:1102 pool.go:553 pool.go:1798 asm_amd64.s:1700]"
[EROR] [08-25|11:10:13.144] [txpool] process batch remote txs        err="panic: runtime error: index out of range [9297754086660752] with length 1\n[pool.go:519 panic.go:785 panic.go:121 elias_fano.go:171 elias_fano.go:187 btree_index.go:926 bps_tree.go:385 btree_index.go:1016 domain.go:764 domain.go:1447 domain.go:1649 aggregator.go:2037 kv_temporal.go:208 dummy.go:44 dummy.go:69 pool.go:2420 pool.go:936 pool.go:1102 pool.go:553 pool.go:1798 asm_amd64.s:1700]"
[EROR] [08-25|11:10:13.244] [txpool] process batch remote txs        err="panic: runtime error: index out of range [9297754086660752] with length 1\n[pool.go:519 panic.go:785 panic.go:121 elias_fano.go:171 elias_fano.go:187 btree_index.go:926 bps_tree.go:385 btree_index.go:1016 domain.go:764 domain.go:1447 domain.go:1649 aggregator.go:2037 kv_temporal.go:208 dummy.go:44 dummy.go:69 pool.go:2420 pool.go:936 pool.go:1102 pool.go:553 pool.go:1798 asm_amd64.s:1700]"

That same error repeated thousands of times in the log:

[EROR] [08-25|11:54:52.344] [txpool] process batch remote txs        err="panic: runtime error: index out of range [9297754086660752] with length 1\n[pool.go:519 panic.go:785 panic.go:121 elias_fano.go:171 elias_fano.go:187 btree_index.go:926 bps_tree.go:385 btree_index.go:1016 domain.go:764 domain.go:1447 domain.go:1649 aggregator.go:2037 kv_temporal.go:208 dummy.go:44 dummy.go:69 pool.go:2420 pool.go:936 pool.go:1102 pool.go:553 pool.go:1798 asm_amd64.s:1700]"
[EROR] [08-25|11:54:52.444] [txpool] process batch remote txs        err="panic: runtime error: index out of range [9297754086660752] with length 1\n[pool.go:519 panic.go:785 panic.go:121 elias_fano.go:171 elias_fano.go:187 btree_index.go:926 bps_tree.go:385 btree_index.go:1016 domain.go:764 domain.go:1447 domain.go:1649 aggregator.go:2037 kv_temporal.go:208 dummy.go:44 dummy.go:69 pool.go:2420 pool.go:936 pool.go:1102 pool.go:553 pool.go:1798 asm_amd64.s:1700]"
[INFO] [08-25|11:54:52.464] [snapshots:download] Stat                blocks=6.52M indices=6.52M alloc=2.3GB sys=9.4GB
[INFO] [08-25|11:54:52.481] [snapshots:history] Stat                 blocks=6.52M txs=290.62M txNum2blockNum="128=5623K,160=6107K,176=6359K,184=6490K,186=6524K" first_history_idx_in_db=0 last_comitment_block=6524376 last_comitment_tx_num=290625000 alloc=2.3GB sys=9.4GB
[INFO] [08-25|11:54:52.483] [1/6 OtterSync] DONE                     in=15h50m22.295721381s block=6527999
[INFO] [08-25|11:54:52.487] [4/6 Execution] starting                 from=6524376 to=6527999 fromTxNum=290624956 offsetFromBlockBeginning=43 initialCycle=true useExternalTx=false
[INFO] [08-25|11:54:52.504] [4/6 Execution] Done                     blk=0 blks=18446744073703027241 blk/s=1268685661638210093056.0 txs=3 tx/s=206 gas/s=9.50M buf=560B/512.0MB stepsInDB=0.00 step=186.0 alloc=2.3GB sys=9.4GB
panic: runtime error: index out of range [269203635499850097] with length 1

goroutine 12899 [running]:
github.com/erigontech/erigon-lib/recsplit/eliasfano32.(*EliasFano).get(0x0?, 0x4afb01d81c0d?)
        github.com/erigontech/erigon-lib@v0.0.0-00010101000000-000000000000/recsplit/eliasfano32/elias_fano.go:171 +0x378
github.com/erigontech/erigon-lib/recsplit/eliasfano32.(*EliasFano).Get(...)
        github.com/erigontech/erigon-lib@v0.0.0-00010101000000-000000000000/recsplit/eliasfano32/elias_fano.go:187
github.com/erigontech/erigon-lib/state.(*BtIndex).keyCmp(0xc000b21b80, {0xc00d3ce600, 0x34, 0x34}, 0x0, {0x3279930, 0xc02118e970})
        github.com/erigontech/erigon-lib@v0.0.0-00010101000000-000000000000/state/btree_index.go:926 +0x19f
github.com/erigontech/erigon-lib/state.(*BpsTree).Get(0xc00561a440, {0x3279930, 0xc02118e970}, {0xc00d3ce600, 0x34, 0x34})
        github.com/erigontech/erigon-lib@v0.0.0-00010101000000-000000000000/state/bps_tree.go:385 +0x4dd
github.com/erigontech/erigon-lib/state.(*BtIndex).Get(0xc000b21b80, {0xc00d3ce600?, 0x10?, 0x491705?}, {0x3279930, 0xc02118e970})
        github.com/erigontech/erigon-lib@v0.0.0-00010101000000-000000000000/state/btree_index.go:1016 +0xa9
github.com/erigontech/erigon-lib/state.(*DomainRoTx).getFromFile(0xc18e468b40, 0x1, {0xc00d3ce600, 0x34, 0x34})
        github.com/erigontech/erigon-lib@v0.0.0-00010101000000-000000000000/state/domain.go:764 +0x518
github.com/erigontech/erigon-lib/state.(*DomainRoTx).getFromFiles(0xc18e468b40, {0xc00d3ce600, 0x34, 0x34})
        github.com/erigontech/erigon-lib@v0.0.0-00010101000000-000000000000/state/domain.go:1447 +0x3e6
github.com/erigontech/erigon-lib/state.(*DomainRoTx).GetLatest(0xc18e468b40, {0xc00d3ce600?, 0xc02a20b110?, 0x1139816?}, {0x0?, 0x4929fd?, 0xc00d3ce600?}, {0x76380a043298?, 0xc196b20080?})
        github.com/erigontech/erigon-lib@v0.0.0-00010101000000-000000000000/state/domain.go:1649 +0x46a
github.com/erigontech/erigon-lib/state.(*AggregatorRoTx).GetLatest(0xc00d3ce600?, 0x34?, {0xc00d3ce600?, 0x10?, 0x10?}, {0x0?, 0x437b45?, 0xc1a2f865c0?}, {0x76380a043298, 0xc196b20080})
        github.com/erigontech/erigon-lib@v0.0.0-00010101000000-000000000000/state/aggregator.go:2037 +0x4e
github.com/erigontech/erigon-lib/state.(*SharedDomains).DomainGet(0xc174b411e0, 0x1, {0xc00d3ce600?, 0x120?, 0xc02a20b288?}, {0x0?, 0x120?, 0x7638584d7f18?})
        github.com/erigontech/erigon-lib@v0.0.0-00010101000000-000000000000/state/domain_shared.go:904 +0x14a
github.com/erigontech/erigon/core/state.(*ReaderV3).ReadAccountStorage(0xc196b20c00, {0xf6, 0x9, 0x8e, 0xcd, 0x23, 0xf3, 0xe6, 0xeb, 0xef, ...}, ...)
        github.com/erigontech/erigon/core/state/rw_v3.go:617 +0x153
github.com/erigontech/erigon/core/state.(*stateObject).GetCommittedState(0xc0421491e0, 0xc15f433910, 0xc0b4ec0e80)
        github.com/erigontech/erigon/core/state/state_object.go:187 +0xc4
github.com/erigontech/erigon/core/state.(*stateObject).GetState(0xc0421491e0, 0xc15f433910, 0xc0b4ec0e80)
        github.com/erigontech/erigon/core/state/state_object.go:164 +0x8f
github.com/erigontech/erigon/core/state.(*IntraBlockState).GetState(0x50caa320a93537cc?, {0xf6, 0x9, 0x8e, 0xcd, 0x23, 0xf3, 0xe6, 0xeb, 0xef, ...}, ...)
        github.com/erigontech/erigon/core/state/intra_block_state.go:290 +0x4d
github.com/erigontech/erigon/core/vm.opSload(0x1?, 0x0?, 0x0?)
        github.com/erigontech/erigon/core/vm/instructions.go:557 +0xf7
github.com/erigontech/erigon/core/vm.(*EVMInterpreter).Run(0xc170930f18, 0xc174b41790, {0xc19c158bd0, 0x84, 0x84}, 0x0)
        github.com/erigontech/erigon/core/vm/interpreter.go:317 +0x944
github.com/erigontech/erigon/core/vm.run(...)
        github.com/erigontech/erigon/core/vm/evm.go:64
github.com/erigontech/erigon/core/vm.(*EVM).call(0xc01309d340, 0xf1, {0x324fba0, 0xc01d49fe48}, {0xf6, 0x9, 0x8e, 0xcd, 0x23, 0xf3, ...}, ...)
        github.com/erigontech/erigon/core/vm/evm.go:283 +0x1345
github.com/erigontech/erigon/core/vm.(*EVM).Call(...)
        github.com/erigontech/erigon/core/vm/evm.go:306
github.com/erigontech/erigon/core.(*StateTransition).TransitionDb(0xc15f4339e0, 0x1, 0x0)
        github.com/erigontech/erigon/core/state_transition.go:449 +0xd7c
github.com/erigontech/erigon/core.ApplyMessage(0x29b5e60?, {0x328e2b0?, 0xc0388948c0?}, 0x0?, 0x1, 0x0)
        github.com/erigontech/erigon/core/state_transition.go:160 +0x36
github.com/erigontech/erigon/cmd/state/exec3.(*Worker).RunTxTaskNoLock(0xc007dff688, 0xc16d97b508, 0x0)
        github.com/erigontech/erigon/cmd/state/exec3/state.go:272 +0x109f
github.com/erigontech/erigon/eth/stagedsync.ExecV3({_, _}, _, {_, _}, _, {{0x3284b30, 0xc005494030}, 0x20000000, {0x1, ...}, ...}, ...)
        github.com/erigontech/erigon/eth/stagedsync/exec3.go:836 +0x31b1
github.com/erigontech/erigon/eth/stagedsync.ExecBlockV3(_, {_, _}, {{_, _}, {_, _}, _}, _, {0x326cf30, ...}, ...)
        github.com/erigontech/erigon/eth/stagedsync/stage_execute.go:159 +0x1f9
github.com/erigontech/erigon/eth/stagedsync.SpawnExecuteBlocksStage(_, {_, _}, {{_, _}, {_, _}, _}, _, {0x326cf30, ...}, ...)
        github.com/erigontech/erigon/eth/stagedsync/stage_execute.go:250 +0x10e
github.com/erigontech/erigon/eth/stagedsync.PipelineStages.func10(0x9?, 0x0?, {0x3266680?, 0xc01309d600?}, {{0x0, 0x0}, {0x0, 0x0}, 0x0}, {0x3283728, ...})
        github.com/erigontech/erigon/eth/stagedsync/default_stages.go:238 +0xf0
github.com/erigontech/erigon/eth/stagedsync.(*Sync).runStage(0xc01309d600, 0xc01953bc70, {0x3284b30, 0xc005494030}, {{0x0, 0x0}, {0x0, 0x0}, 0x0}, 0x1, ...)
        github.com/erigontech/erigon/eth/stagedsync/sync.go:531 +0x190
github.com/erigontech/erigon/eth/stagedsync.(*Sync).Run(0xc01309d600, {0x3284b30, 0xc005494030}, {{0x0, 0x0}, {0x0, 0x0}, 0x0}, 0xe0?, 0x1)
        github.com/erigontech/erigon/eth/stagedsync/sync.go:410 +0x2ad
github.com/erigontech/erigon/turbo/stages.ProcessFrozenBlocks({0x326cf30, 0xc000953130}, {0x3284b30, 0xc005494030}, {0x32aa938, 0xc0015a86d0}, 0xc01309d600, 0x0)
        github.com/erigontech/erigon/turbo/stages/stageloop.go:151 +0x15b
github.com/erigontech/erigon/turbo/execution/eth1.(*EthereumExecutionModule).Start(0xc01856a500, {0x326cf30, 0xc000953130})
        github.com/erigontech/erigon/turbo/execution/eth1/ethereum_execution.go:337 +0x9f
created by github.com/erigontech/erigon/eth.(*Ethereum).Start in goroutine 1
        github.com/erigontech/erigon/eth/backend.go:1523 +0xddf
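
A side note on the "[4/6 Execution] Done" line above: the value blks=18446744073703027241 equals 2^64 - 6524375, which is what an unsigned 64-bit subtraction produces when blk=0 is combined with from=6524376. The sketch below only reproduces that arithmetic. The formula `current - from + 1` and the name `blocksDone` are assumptions made for illustration, not taken from the Erigon source.

```go
package main

import "fmt"

// blocksDone mimics a progress counter computed in unsigned arithmetic.
// The formula (current - from + 1) is an assumption for illustration only.
func blocksDone(current, from uint64) uint64 {
	// uint64 arithmetic wraps modulo 2^64 when current < from.
	return current - from + 1
}

func main() {
	// Values taken from the log lines: blk=0, from=6524376.
	fmt.Println(blocksDone(0, 6524376)) // prints 18446744073703027241
}
```

If that reading is right, the implausible blk/s figure in the same line would follow from the wrapped block count rather than from real progress.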
wmitsuda commented 2 months ago

Interestingly, after a restart with the same version, it was able to continue syncing. I'm not sure whether it skipped whatever check panicked in the previous run, though.

AskAlexSharov commented 2 months ago

Most likely the downloader made a "wrong estimation" that the download was done.

Giulio2002 commented 2 months ago

I think a restart should help here

Giulio2002 commented 2 months ago

but will check myself

Giulio2002 commented 2 months ago

actually this is very sus

Giulio2002 commented 2 months ago

your node needs to be nuked btw