kroma-network / kroma-up

Set up Kroma using Docker

pebble: inconsistent batch count: 3548743553 vs 3580277235 #47

Open xinzhongyoumeng opened 3 weeks ago

xinzhongyoumeng commented 3 weeks ago

Issue type

Bug

OS platform and distribution

Ubuntu 22.04

Current behavior?

Restarting the geth client panics when pebble is used as the backing database. The log is below:

INFO [08-21|19:11:46.649] Chain head was updated                   number=15,179,128 hash=81f1e4..dfe321 root=2460e5..f45bbd elapsed="415.821µs"  age=1m3s
INFO [08-21|19:11:46.661] Starting work on payload                 id=0xf3b0df4b5afad187
INFO [08-21|19:11:46.673] Imported new potential chain segment     number=15,179,129 hash=b529f0..bfcb2e blocks=1          txs=2          mgas=0.243  elapsed=9.982ms      mgasps=24.387  age=1m1s      triedirty=7.94GiB
INFO [08-21|19:11:46.673] Chain head was updated                   number=15,179,129 hash=b529f0..bfcb2e root=17632d..ba7618 elapsed="400.061µs"  age=1m1s
INFO [08-21|19:11:46.687] Starting work on payload                 id=0xb2344649ebe53db6
INFO [08-21|19:11:46.700] Imported new potential chain segment     number=15,179,130 hash=f3f9d8..dd819b blocks=1          txs=4          mgas=0.224  elapsed=12.045ms     mgasps=18.575  triedirty=7.94GiB
INFO [08-21|19:11:46.701] Chain head was updated                   number=15,179,130 hash=f3f9d8..dd819b root=0710d2..df9adb elapsed="408.351µs"
INFO [08-21|19:11:47.817] Starting work on payload                 id=0xb62de5f6065bd0fc
INFO [08-21|19:11:48.010] Imported new potential chain segment     number=15,179,131 hash=34c5f1..594e8e blocks=1          txs=2          mgas=11.460 elapsed=189.355ms    mgasps=60.520  triedirty=7.94GiB
INFO [08-21|19:11:48.012] Chain head was updated                   number=15,179,131 hash=34c5f1..594e8e root=171d3e..461ee6 elapsed=1.166792ms
INFO [08-21|19:17:59.855] Got interrupt, shutting down...
INFO [08-21|19:17:59.855] HTTP server stopped                      endpoint=[::]:8545
INFO [08-21|19:17:59.855] HTTP server stopped                      endpoint=[::]:8546
INFO [08-21|19:17:59.855] HTTP server stopped                      endpoint=127.0.0.1:8551
INFO [08-21|19:17:59.855] IPC endpoint closed                      url=/mnt/kromain/node/geth/geth.ipc
INFO [08-21|19:17:59.856] Ethereum protocol stopped
INFO [08-21|19:17:59.856] Transaction pool stopped
INFO [08-21|19:17:59.856] Writing cached state to disk             block=15,179,131 hash=34c5f1..594e8e root=171d3e..461ee6
[2024-08-21 19:17:59] [geth_command.sh] receive the service exit signal
[2024-08-21 19:17:59] [geth_command.sh] get service pid [164020] by command [/opt/kromain/core/geth]
[2024-08-21 19:17:59] [geth_command.sh] exec command [kill -15 164020], try count [1]
WARN [08-21|19:17:59.863] Already shutting down, interrupt more to panic. times=9
[2024-08-21 19:18:09] [geth_command.sh] get service pid [164020] by command [/opt/kromain/core/geth]
[2024-08-21 19:18:09] [geth_command.sh] exec command [kill -15 164020], try count [2]
WARN [08-21|19:18:09.870] Already shutting down, interrupt more to panic. times=8
INFO [08-21|19:18:09.898] Looking for peers                        peercount=0 tried=117 static=0
[2024-08-21 19:18:19] [geth_command.sh] get service pid [164020] by command [/opt/kromain/core/geth]
[2024-08-21 19:18:19] [geth_command.sh] exec command [kill -15 164020], try count [3]
WARN [08-21|19:18:19.878] Already shutting down, interrupt more to panic. times=7
INFO [08-21|19:18:19.913] Looking for peers                        peercount=0 tried=122 static=0
[2024-08-21 19:18:29] [geth_command.sh] get service pid [164020] by command [/opt/kromain/core/geth]
[2024-08-21 19:18:29] [geth_command.sh] exec command [kill -15 164020], try count [4]
WARN [08-21|19:18:29.885] Already shutting down, interrupt more to panic. times=6
INFO [08-21|19:18:30.217] Looking for peers                        peercount=0 tried=110 static=0
[2024-08-21 19:18:39] [geth_command.sh] get service pid [164020] by command [/opt/kromain/core/geth]
[2024-08-21 19:18:39] [geth_command.sh] exec command [kill -15 164020], try count [5]
WARN [08-21|19:18:39.892] Already shutting down, interrupt more to panic. times=5
INFO [08-21|19:18:40.235] Looking for peers                        peercount=0 tried=123 static=0
[2024-08-21 19:18:49] [geth_command.sh] get service pid [164020] by command [/opt/kromain/core/geth]
[2024-08-21 19:18:49] [geth_command.sh] exec command [kill -15 164020], try count [6]
WARN [08-21|19:18:49.899] Already shutting down, interrupt more to panic. times=4
INFO [08-21|19:18:50.241] Looking for peers                        peercount=0 tried=83  static=0
INFO [08-21|19:18:58.078] Persisted trie from memory database      nodes=31,533,993 size=3.89GiB time=58.221857392s livenodes=0 livesize=4.03GiB
INFO [08-21|19:18:58.078] Writing cached state to disk             block=15,179,130 hash=f3f9d8..dd819b root=0710d2..df9adb
INFO [08-21|19:18:58.078] Persisted trie from memory database      nodes=0          size=0.00B   time="3.09µs"      livenodes=0 livesize=4.03GiB
INFO [08-21|19:18:58.078] Writing cached state to disk             block=15,179,004 hash=9143dd..0e36eb root=00934e..efa94f
INFO [08-21|19:18:58.078] Persisted trie from memory database      nodes=0          size=0.00B   time=310ns         livenodes=0 livesize=4.03GiB
ERROR[08-21|19:18:58.078] Dangling trie nodes after full cleanup
INFO [08-21|19:18:58.078] Blockchain stopped
[2024-08-21 19:19:02] [geth_command.sh] exec command [/opt/kromain/core/geth --networkid=255 --datadir=/mnt/kromain/node/geth --gcmode=full --syncmode=full --db.engine=pebble --state.scheme=hash --snapshot=false --http --http.addr=0.0.0.0 --http.port=8545 --http.vhosts=* --http.corsdomain=* --http.api=admin,eth,net,web3,personal,debug,txpool,engine,kroma --ws --ws.addr=0.0.0.0 --ws.port=8546 --ws.origins=* --ws.api=admin,eth,net,web3,personal,debug,txpool,engine,kroma --rpc.gascap=0 --rpc.txfeecap=0 --txpool.globalslots=10000 --txpool.globalqueue=5000 --authrpc.addr=127.0.0.1 --authrpc.port=8551 --authrpc.vhosts=* --authrpc.jwtsecret=/mnt/kromain/node/jwt.hex --maxpeers=200 --port=30303 --discovery.port=30303 --bootnodes=enode://acf4af39e9d9f3bbdb700315fde6e909efc714bd9a012b0f6943c4f9110008e3da2cb6c2ac9d3b6d98184a5ead57c4409be4ae7b19da1f554ceee7ba86c1fc2e@p2p-0.kroma.network:30304?discport=30303,enode://a3b6a7087b3399eefdb1ce5870aa0e58a60bfeccf3f7f7c02f5e142b1a544d19c171f7688d08db786d5e9b3ba734482bf8cf29dab7d46917b9da7f61656fde39@p2p-1.kroma.network:30304?discport=30303]
INFO [08-21|19:19:02.446] Maximum peer count                       ETH=200 total=200
INFO [08-21|19:19:02.447] Smartcard socket not found, disabling    err="stat /run/pcscd/pcscd.comm: no such file or directory"
INFO [08-21|19:19:02.447] Using pebble as db engine
INFO [08-21|19:19:02.449] Global gas cap disabled
INFO [08-21|19:19:02.449] Initializing the KZG library             backend=gokzg
INFO [08-21|19:19:02.475] Allocated trie memory caches             clean=256.00MiB dirty=256.00MiB
INFO [08-21|19:19:02.558] Using pebble as the backing database
INFO [08-21|19:19:02.558] Allocated cache and file handles         database=/mnt/kromain/node/geth/geth/chaindata cache=512.00MiB handles=524,288
Fatal: Failed to register the Ethereum service: pebble: inconsistent batch count: 3548743553 vs 3580277235
[2024-08-21 19:19:04] [geth_command.sh] exec command [/opt/kromain/core/geth --networkid=255 --datadir=/mnt/kromain/node/geth --gcmode=full --syncmode=full --db.engine=pebble --state.scheme=hash --snapshot=false --http --http.addr=0.0.0.0 --http.port=8545 --http.vhosts=* --http.corsdomain=* --http.api=admin,eth,net,web3,personal,debug,txpool,engine,kroma --ws --ws.addr=0.0.0.0 --ws.port=8546 --ws.origins=* --ws.api=admin,eth,net,web3,personal,debug,txpool,engine,kroma --rpc.gascap=0 --rpc.txfeecap=0 --txpool.globalslots=10000 --txpool.globalqueue=5000 --authrpc.addr=127.0.0.1 --authrpc.port=8551 --authrpc.vhosts=* --authrpc.jwtsecret=/mnt/kromain/node/jwt.hex --maxpeers=200 --port=30303 --discovery.port=30303 --bootnodes=enode://acf4af39e9d9f3bbdb700315fde6e909efc714bd9a012b0f6943c4f9110008e3da2cb6c2ac9d3b6d98184a5ead57c4409be4ae7b19da1f554ceee7ba86c1fc2e@p2p-0.kroma.network:30304?discport=30303,enode://a3b6a7087b3399eefdb1ce5870aa0e58a60bfeccf3f7f7c02f5e142b1a544d19c171f7688d08db786d5e9b3ba734482bf8cf29dab7d46917b9da7f61656fde39@p2p-1.kroma.network:30304?discport=30303]
INFO [08-21|19:19:04.859] Maximum peer count                       ETH=200 total=200
INFO [08-21|19:19:04.860] Smartcard socket not found, disabling    err="stat /run/pcscd/pcscd.comm: no such file or directory"
INFO [08-21|19:19:04.860] Using pebble as db engine
INFO [08-21|19:19:04.862] Global gas cap disabled
INFO [08-21|19:19:04.862] Initializing the KZG library             backend=gokzg
INFO [08-21|19:19:04.888] Allocated trie memory caches             clean=256.00MiB dirty=256.00MiB
INFO [08-21|19:19:04.973] Using pebble as the backing database
INFO [08-21|19:19:04.974] Allocated cache and file handles         database=/mnt/kromain/node/geth/geth/chaindata cache=512.00MiB handles=524,288
Fatal: Failed to register the Ethereum service: pebble: inconsistent batch count: 3548743553 vs 3580277235
...

geth keeps restarting in a loop. As you can see, the ERROR is Dangling trie nodes after full cleanup. What does that mean?

Expected Behavior?

The geth client should restart normally.

Standalone code or description to reproduce the issue

Restart the geth client while using pebble as the db engine.
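
For completeness, a sketch of the restart that reproduces it (the binary path, datadir, and flags are the ones from the log above; the remaining flags are elided here):

# stop geth with a single SIGTERM and wait for it to exit
pkill -TERM -x geth
while pgrep -x geth > /dev/null; do sleep 1; done

# start it again with pebble as the backing database
/opt/kromain/core/geth --networkid=255 --datadir=/mnt/kromain/node/geth \
  --gcmode=full --syncmode=full --db.engine=pebble --state.scheme=hash \
  --snapshot=false   # ...remaining flags exactly as in the log above
# startup then fails with:
# Fatal: Failed to register the Ethereum service: pebble: inconsistent batch count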

Additional context

Because the snapshot uses pebble, we also use pebble.

xinzhongyoumeng commented 3 weeks ago

I found that the reason is the in-memory cache was not fully flushed to the underlying database during shutdown (https://docs.goquorum.consensys.io/reference/logging-and-errors). Can the kroma geth client do something about this error?
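
In case it helps, a sketch of a less aggressive shutdown than the 10-second retry loop in our geth_command.sh: send SIGTERM once and wait until geth has actually flushed its cached state and exited (the trie flush in the log above took about a minute). The binary path comes from the log; the 300-second timeout is an assumption:

#!/usr/bin/env bash
# graceful_stop.sh - a sketch, not part of kroma-up
set -euo pipefail

GETH_BIN=/opt/kromain/core/geth   # path taken from the log above
STOP_TIMEOUT=300                  # assumption: generous window for the trie flush

PID=$(pgrep -f "$GETH_BIN" || true)
[ -z "$PID" ] && { echo "geth is not running"; exit 0; }

# one SIGTERM only; repeated signals make geth count down toward a forced panic
kill -15 "$PID"

for _ in $(seq "$STOP_TIMEOUT"); do
  kill -0 "$PID" 2>/dev/null || { echo "geth exited cleanly"; exit 0; }
  sleep 1
done

echo "geth did not exit within ${STOP_TIMEOUT}s" >&2
exit 1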

Pangssu commented 3 weeks ago

I'm not sure what is causing the error you are getting from pebble DB, but it could be database corruption due to an unclean shutdown. If it still happens after rebooting the system and trying again, you may need to delete the database and sync again from scratch.
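
If you do end up resyncing, a sketch of what wiping the chain database could look like (the chaindata path is from the "Allocated cache and file handles" line in your log; note this discards all synced state, so the node will sync from genesis or from a fresh snapshot again):

# stop geth first, then remove only the chain database
rm -rf /mnt/kromain/node/geth/geth/chaindata

# go-ethereum based clients usually also ship a removedb subcommand for this
/opt/kromain/core/geth removedb --datadir /mnt/kromain/node/geth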