ledgerwatch / erigon

Ethereum implementation on the efficiency frontier
GNU Lesser General Public License v3.0

ethmainnet: restart in the middle of merge - caused segfault on next merge #10733

Closed AskAlexSharov closed 2 days ago

AskAlexSharov commented 3 weeks ago

Restarted ethmainnet in the middle of a merge:

[INFO] [06-13|07:00:46.016] [4/12 Execution] starting                from=19898000 to=19917999 fromTxNum=2413664843 offsetFromBlockBeginning=0 initialCycle=true useExternalTx=true
[WARN] [06-13|07:01:25.228] lookupFileByItsRange: file not found     stepFrom=1536 stepTo=1538 domain=StorageKeys files=0-1024;1024-1536;1536-1540;1540-1542;1542-1543; _visibleFiles=0-1024;1024-1536;1536-1540;1540-1542;1542-1543; visibleFilesCount=5 filesCount=5
unexpected fault address 0x63266ea0b07f
fatal error: fault
[signal SIGSEGV: segmentation violation code=0x1 addr=0x63266ea0b07f pc=0xadc311]

goroutine 13304846 gp=0xc1d8277c00 m=27 mp=0xc2196fc008 [running, locked to thread]:
runtime.throw({0x29bdb1b?, 0xdd1175cc?})
    runtime/panic.go:1023 +0x5c fp=0xc26dd0add8 sp=0xc26dd0ada8 pc=0x45febc
runtime.sigpanic()
    runtime/signal_unix.go:895 +0x285 fp=0xc26dd0ae38 sp=0xc26dd0add8 pc=0x478ca5
github.com/ledgerwatch/erigon-lib/seg.(*Getter).nextPos(0xc0012822a0?, 0x14?)
    github.com/ledgerwatch/erigon-lib@v1.0.0/seg/decompress.go:563 +0xb1 fp=0xc26dd0ae58 sp=0xc26dd0ae38 pc=0xadc311
github.com/ledgerwatch/erigon-lib/seg.(*Getter).NextUncompressed(0xc231f05090)
    github.com/ledgerwatch/erigon-lib@v1.0.0/seg/decompress.go:744 +0x6b fp=0xc26dd0aec0 sp=0xc26dd0ae58 pc=0xadd06b
github.com/ledgerwatch/erigon-lib/state.(*getter).Next(0xc217576920?, {0x0?, 0x100c26dd0b0e8?, 0x3e14dbf?})
    github.com/ledgerwatch/erigon-lib@v1.0.0/state/archive.go:62 +0x4c fp=0xc26dd0aef0 sp=0xc26dd0aec0 pc=0x11ff74c
github.com/ledgerwatch/erigon-lib/state.(*BtIndex).keyCmp(0xc0251f3d80, {0xc1e55382f8, 0x4, 0x4}, 0x81, {0x323e850, 0xc4d8f188f0})
    github.com/ledgerwatch/erigon-lib@v1.0.0/state/btree_index.go:881 +0x1ef fp=0xc26dd0afe8 sp=0xc26dd0aef0 pc=0x120814f
github.com/ledgerwatch/erigon-lib/state.(*BtIndex).keyCmp-fm({0xc1e55382f8?, 0xc1e55382f8?, 0x4?}, 0x4?, {0x323e850?, 0xc4d8f188f0?})
    <autogenerated>:1 +0x45 fp=0xc26dd0b030 sp=0xc26dd0afe8 pc=0x12678a5
github.com/ledgerwatch/erigon-lib/state.(*BpsTree).Get(0xc0e6f38a00, {0x323e850, 0xc4d8f188f0}, {0xc1e55382f8, 0x4, 0x4})
    github.com/ledgerwatch/erigon-lib@v1.0.0/state/bps_tree.go:285 +0x4f8 fp=0xc26dd0b218 sp=0xc26dd0b030 pc=0x12018f8
github.com/ledgerwatch/erigon-lib/state.(*BtIndex).Get(0xc0251f3d80, {0xc1e55382f8?, 0xc26dd0b2f0?, 0xabe925?}, {0x323e850, 0xc4d8f188f0})
    github.com/ledgerwatch/erigon-lib@v1.0.0/state/btree_index.go:961 +0xa9 fp=0xc26dd0b2b0 sp=0xc26dd0b218 pc=0x1208889
github.com/ledgerwatch/erigon-lib/state.(*DomainRoTx).getFromFile(0xc22c309080, 0x5, {0xc1e55382f8, 0x4, 0x4})
    github.com/ledgerwatch/erigon-lib@v1.0.0/state/domain.go:651 +0x3c6 fp=0xc26dd0b340 sp=0xc26dd0b2b0 pc=0x120e166
github.com/ledgerwatch/erigon-lib/state.(*DomainRoTx).getFromFiles(0xc22c309080, {0xc1e55382f8, 0x4, 0x4})
    github.com/ledgerwatch/erigon-lib@v1.0.0/state/domain.go:1364 +0x29e fp=0xc26dd0b500 sp=0xc26dd0b340 pc=0x121527e
github.com/ledgerwatch/erigon-lib/state.(*SharedDomains).LatestCommitment(0xc53df10dd0, {0xc1e55382f8, 0x4, 0x4})
    github.com/ledgerwatch/erigon-lib@v1.0.0/state/domain_shared.go:349 +0xca fp=0xc26dd0b5b0 sp=0xc26dd0b500 pc=0x12243ca
github.com/ledgerwatch/erigon-lib/state.(*SharedDomainsCommitmentContext).GetBranch(0xc2b368f680, {0xc1e55382f8, 0x4, 0x4})
    github.com/ledgerwatch/erigon-lib@v1.0.0/state/domain_shared.go:1012 +0x90 fp=0xc26dd0b650 sp=0xc26dd0b5b0 pc=0x122a830
github.com/ledgerwatch/erigon-lib/commitment.(*HexPatriciaHashed).unfoldBranchNode(0xc302f8e000, 0x6, 0x0, 0x7)
    github.com/ledgerwatch/erigon-lib@v1.0.0/commitment/hex_patricia_hashed.go:821 +0xc3 fp=0xc26dd0b910 sp=0xc26dd0b650 pc=0x11d3703
github.com/ledgerwatch/erigon-lib/commitment.(*HexPatriciaHashed).unfold(0xc302f8e000, {0xc246615c00?, 0x40?, 0x80?}, 0x1)
    github.com/ledgerwatch/erigon-lib@v1.0.0/commitment/hex_patricia_hashed.go:929 +0x605 fp=0xc26dd0baa0 sp=0xc26dd0b910 pc=0x11d4a85
github.com/ledgerwatch/erigon-lib/commitment.(*HexPatriciaHashed).ProcessTree.func1({0xc246615c00, 0x40, 0x80}, {0xc24661f500, 0x14, 0x34})
    github.com/ledgerwatch/erigon-lib@v1.0.0/commitment/hex_patricia_hashed.go:1306 +0x5e5 fp=0xc26dd0bcf0 sp=0xc26dd0baa0 pc=0x11d7de5
github.com/ledgerwatch/erigon-lib/commitment.(*UpdateTree).HashSort.func1({0xc246615c00?, 0x0?, 0xc26dd0bd78?}, {0xc24661f500?, 0xc26dd0bd78?, 0xac6e93?}, {0xc26dd0beb8?, 0x0?}, 0x1?)
    github.com/ledgerwatch/erigon-lib@v1.0.0/commitment/commitment.go:946 +0x23 fp=0xc26dd0bd30 sp=0xc26dd0bcf0 pc=0x11ce683
github.com/ledgerwatch/erigon-lib/etl.(*Collector).Load.func2({0xc246615c00?, 0xc24661f540?, 0x0?}, {0xc24661f500?, 0xc24661cdf8?, 0x0?})
    github.com/ledgerwatch/erigon-lib@v1.0.0/etl/collector.go:259 +0x31 fp=0xc26dd0bd88 sp=0xc26dd0bd30 pc=0xac2751
    github.com/ledgerwatch/erigon-lib@v1.0.0/etl/collector.go:345 +0x86e fp=0xc26dd0bf70 sp=0xc26dd0bd88 pc=0xac3bce
github.com/ledgerwatch/erigon-lib/etl.(*Collector).Load(0xc233fc0200, {0x0, 0x0}, {0x0, 0x0}, 0xc3ccfb6530, {0xc0026df8c0, 0x0, 0x0, {0x0, ...}, ...})
    github.com/ledgerwatch/erigon-lib@v1.0.0/etl/collector.go:261 +0x666 fp=0xc26dd0c100 sp=0xc26dd0bf70 pc=0xac24c6
github.com/ledgerwatch/erigon-lib/commitment.(*UpdateTree).HashSort(0xc383dab100, {0x3231ce8, 0xc00122a230}, 0xc234390140)
    github.com/ledgerwatch/erigon-lib@v1.0.0/commitment/commitment.go:945 +0x59c fp=0xc26dd0c2d0 sp=0xc26dd0c100 pc=0x11ce4dc
github.com/ledgerwatch/erigon-lib/commitment.(*HexPatriciaHashed).ProcessTree(0xc302f8e000, {0x3231ce8, 0xc00122a230}, 0xc383dab100, {0xc2343a7d40, 0x12})
    github.com/ledgerwatch/erigon-lib@v1.0.0/commitment/hex_patricia_hashed.go:1286 +0x25c fp=0xc26dd0c408 sp=0xc26dd0c2d0 pc=0x11d723c
github.com/ledgerwatch/erigon-lib/state.(*SharedDomainsCommitmentContext).ComputeCommitment(0xc2b368f680, {0x3231ce8, 0xc00122a230}, 0x1, 0x12fa7e5, {0xc2343a7d40, 0x12})
    github.com/ledgerwatch/erigon-lib@v1.0.0/state/domain_shared.go:1146 +0x5f3 fp=0xc26dd0c650 sp=0xc26dd0c408 pc=0x122b9f3
github.com/ledgerwatch/erigon-lib/state.(*SharedDomains).ComputeCommitment(0xc53df10dd0, {0x3231ce8?, 0xc00122a230?}, 0x1?, 0x1?, {0xc2343a7d40?, 0x0?})
    github.com/ledgerwatch/erigon-lib@v1.0.0/state/domain_shared.go:612 +0x59 fp=0xc26dd0c698 sp=0xc26dd0c650 pc=0x1226fd9
github.com/ledgerwatch/erigon/core/state.(*StateV3).ApplyState4(0xc27ac52e00, {0x3231ce8, 0xc00122a230}, 0xc2343ce708)
    github.com/ledgerwatch/erigon/core/state/rw_v3.go:185 +0x1d9 fp=0xc26dd0c6f0 sp=0xc26dd0c698 pc=0x1273779
github.com/ledgerwatch/erigon/eth/stagedsync.ExecV3({_, _}, _, {_, _}, _, {{0x32493d0, 0xc08317d950}, 0x40000000, {0x1, ...}, ...}, ...)
    github.com/ledgerwatch/erigon/eth/stagedsync/exec3.go:844 +0x2d67 fp=0xc26dd0d608 sp=0xc26dd0c6f0 pc=0x21bf687
github.com/ledgerwatch/erigon/eth/stagedsync.ExecBlockV3(_, {_, _}, {{_, _}, {_, _}, _}, _, {0x3231ce8, ...}, ...)
    github.com/ledgerwatch/erigon/eth/stagedsync/stage_execute.go:297 +0x205 fp=0xc26dd0d918 sp=0xc26dd0d608 pc=0x21d8045
github.com/ledgerwatch/erigon/eth/stagedsync.SpawnExecuteBlocksStage(_, {_, _}, {{_, _}, {_, _}, _}, _, {0x3231ce8, ...}, ...)
    github.com/ledgerwatch/erigon/eth/stagedsync/stage_execute.go:428 +0x1c5 fp=0xc26dd0dc00 sp=0xc26dd0d918 pc=0x21d9865
github.com/ledgerwatch/erigon/eth/stagedsync.PipelineStages.func10(0x9?, 0x70af87100080?, {0x322bc70?, 0xc040088420?}, {{0x32697d0, 0xc3cea492f0}, {0x0, 0x0}, 0x0}, {0x3247fc8, ...})
    github.com/ledgerwatch/erigon/eth/stagedsync/default_stages.go:320 +0xee fp=0xc26dd0dec0 sp=0xc26dd0dc00 pc=0x21b398e
github.com/ledgerwatch/erigon/eth/stagedsync.(*Sync).runStage(0xc040088420, 0xc0400778b0, {0x32493d0, 0xc08317d950}, {{0x32697d0, 0xc3cea492f0}, {0x0, 0x0}, 0x0}, 0x1, ...)
    github.com/ledgerwatch/erigon/eth/stagedsync/sync.go:513 +0x190 fp=0xc26dd0dfc0 sp=0xc26dd0dec0 pc=0x22292f0
github.com/ledgerwatch/erigon/eth/stagedsync.(*Sync).Run(0xc040088420, {0x32493d0, 0xc08317d950}, {{0x32697d0, 0xc3cea492f0}, {0x0, 0x0}, 0x0}, 0x90?, 0x0)
    github.com/ledgerwatch/erigon/eth/stagedsync/sync.go:383 +0x2c7 fp=0xc26dd0e080 sp=0xc26dd0dfc0 pc=0x22282a7
github.com/ledgerwatch/erigon/turbo/execution/eth1.(*EthereumExecutionModule).updateForkChoice(0xc000b7edc0, {0x3231ce8, 0xc00122a230}, {0xc7, 0x3a, 0x2d, 0x8d, 0x32, 0x9b, 0x26, ...}, ...)
    github.com/ledgerwatch/erigon/turbo/execution/eth1/forkchoice.go:375 +0x10a5 fp=0xc26dd0ff50 sp=0xc26dd0e080 pc=0x225ace5
github.com/ledgerwatch/erigon/turbo/execution/eth1.(*EthereumExecutionModule).UpdateForkChoice.gowrap1()
    github.com/ledgerwatch/erigon/turbo/execution/eth1/forkchoice.go:96 +0x69 fp=0xc26dd0ffe0 sp=0xc26dd0ff50 pc=0x2259949
runtime.goexit({})
    runtime/asm_amd64.s:1695 +0x1 fp=0xc26dd0ffe8 sp=0xc26dd0ffe0 pc=0x49b221
created by github.com/ledgerwatch/erigon/turbo/execution/eth1.(*EthereumExecutionModule).UpdateForkChoice in goroutine 8049699
    github.com/ledgerwatch/erigon/turbo/execution/eth1/forkchoice.go:96 +0x30a
ls /erigon-data/snapshots/domain/
v1-accounts.0-1024.bt          v1-accounts.1542-1543.kv    v1-code.1540-1542.kvei           v1-commitment.1538-1539.bt    v1-storage.1536-1537.kv
v1-accounts.0-1024.kv          v1-accounts.1542-1543.kvei  v1-code.1541-1542.bt             v1-commitment.1538-1539.kv    v1-storage.1536-1537.kvei
v1-accounts.0-1024.kv.torrent  v1-accounts.1543-1544.bt    v1-code.1541-1542.kv             v1-commitment.1538-1539.kvei  v1-storage.1536-1540.bt
v1-accounts.0-1024.kvei        v1-accounts.1543-1544.kv    v1-code.1541-1542.kvei           v1-commitment.1539-1540.bt    v1-storage.1536-1540.kv
v1-accounts.1024-1536.bt       v1-accounts.1543-1544.kvei  v1-code.1542-1543.bt             v1-commitment.1539-1540.kv    v1-storage.1536-1540.kvei
v1-accounts.1024-1536.kv       v1-code.0-1024.bt           v1-code.1542-1543.kv             v1-commitment.1539-1540.kvei  v1-storage.1537-1538.bt
v1-accounts.1024-1536.kvei     v1-code.0-1024.kv           v1-code.1542-1543.kvei           v1-commitment.1540-1541.bt    v1-storage.1537-1538.kv
v1-accounts.1536-1537.bt       v1-code.0-1024.kv.torrent   v1-code.1543-1544.bt             v1-commitment.1540-1541.kv    v1-storage.1537-1538.kvei
v1-accounts.1536-1537.kv       v1-code.0-1024.kvei         v1-code.1543-1544.kv             v1-commitment.1540-1541.kvei  v1-storage.1540-1541.bt
v1-accounts.1536-1537.kvei     v1-code.1024-1536.bt        v1-code.1543-1544.kvei           v1-commitment.1541-1542.bt    v1-storage.1540-1541.kv
v1-accounts.1536-1540.bt       v1-code.1024-1536.kv        v1-commitment.0-1024.bt          v1-commitment.1541-1542.kv    v1-storage.1540-1541.kvei
v1-accounts.1536-1540.kv       v1-code.1024-1536.kvei      v1-commitment.0-1024.kv          v1-commitment.1541-1542.kvei  v1-storage.1540-1542.bt
v1-accounts.1536-1540.kvei     v1-code.1536-1537.bt        v1-commitment.0-1024.kv.torrent  v1-commitment.1542-1543.bt    v1-storage.1540-1542.kv
v1-accounts.1537-1538.bt       v1-code.1536-1537.kv        v1-commitment.0-1024.kvei        v1-commitment.1542-1543.kv    v1-storage.1540-1542.kvei
v1-accounts.1537-1538.kv       v1-code.1536-1537.kvei      v1-commitment.1024-1536.bt       v1-commitment.1542-1543.kvei  v1-storage.1541-1542.bt
v1-accounts.1537-1538.kvei     v1-code.1536-1540.bt        v1-commitment.1024-1536.kv       v1-commitment.1543-1544.bt    v1-storage.1541-1542.kv
v1-accounts.1540-1541.bt       v1-code.1536-1540.kv        v1-commitment.1024-1536.kvei     v1-commitment.1543-1544.kv    v1-storage.1541-1542.kvei
v1-accounts.1540-1541.kv       v1-code.1536-1540.kvei      v1-commitment.1536-1537.bt       v1-commitment.1543-1544.kvei  v1-storage.1542-1543.bt
v1-accounts.1540-1541.kvei     v1-code.1537-1538.bt        v1-commitment.1536-1537.kv       v1-storage.0-1024.bt          v1-storage.1542-1543.kv
v1-accounts.1540-1542.bt       v1-code.1537-1538.kv        v1-commitment.1536-1537.kvei     v1-storage.0-1024.kv          v1-storage.1542-1543.kvei
v1-accounts.1540-1542.kv       v1-code.1537-1538.kvei      v1-commitment.1536-1538.bt       v1-storage.0-1024.kv.torrent  v1-storage.1543-1544.bt
v1-accounts.1540-1542.kvei     v1-code.1540-1541.bt        v1-commitment.1536-1538.kv       v1-storage.0-1024.kvei        v1-storage.1543-1544.kv
v1-accounts.1541-1542.bt       v1-code.1540-1541.kv        v1-commitment.1536-1538.kvei     v1-storage.1024-1536.bt       v1-storage.1543-1544.kvei
v1-accounts.1541-1542.kv       v1-code.1540-1541.kvei      v1-commitment.1537-1538.bt       v1-storage.1024-1536.kv
v1-accounts.1541-1542.kvei     v1-code.1540-1542.bt        v1-commitment.1537-1538.kv       v1-storage.1024-1536.kvei
v1-accounts.1542-1543.bt       v1-code.1540-1542.kv        v1-commitment.1537-1538.kvei     v1-storage.1536-1537.bt
AskAlexSharov commented 3 weeks ago

Then I ran: erigon snapshots rm-state-snapshots --step=1536-99999 --datadir="/erigon-data" and got this error at startup:

[CRIT] [06-13|08:24:07.981] lookupByShortenedKey panics              err="file: v1-accounts.1024-1536.kv, runtime error: slice bounds out of range [48990140:48990139], [decompress.go:741 panic.go:770 panic.go:160 decompress.go:760 archive.go:62 domain_committed.go:217 domain_shared.go:419 commitment.go:368 domain_shared.go:399 domain_shared.go:359 domain_shared.go:1012 hex_patricia_hashed.go:821 hex_patricia_hashed.go:929 hex_patricia_hashed.go:1306 commitment.go:946 collector.go:259 collector.go:345 collector.go:261 commitment.go:945 hex_patricia_hashed.go:1286 domain_shared.go:1146 domain_shared.go:612 rw_v3.go:185 exec3.go:844 stage_execute.go:297 stage_execute.go:428 default_stages.go:320 sync.go:513 sync.go:383 stageloop.go:126 ethereum_execution.go:313 asm_amd64.s:1695]" domain=AccountKeys offset=48990139 short=bb8fae17 cleanFilesCount=2 dirtyFilesCount=2 file=v1-accounts.1024-1536.kv
blxdyx commented 1 week ago

Same problem here. Is there a temporary fix?

awskii commented 2 days ago

@blxdyx you have to remove the most recently merged files whose range is bigger than 2 steps (e.g. a file spanning 4 steps). Also pull main, because the issue seems to be solved.

To ensure the data files are correct, it would be easier to just remove some file range with:
$ ./build/bin/erigon snapshots rm-state-snapshots --datadir <datadir> --step FROM-9999
$ rm -rf <datadir>/chaindata

Any discrepancy will be harder to fix later on, so I advise you to pull main and do the cleanup before restarting.

awskii commented 2 days ago

Closed by #11011