pocketnetteam / pocketnet.core

Decentralized social network based on the blockchain
https://pocketnet.app
Apache License 2.0
114 stars 28 forks source link

Node crashes sometimes on rollback: ERROR: UndoReadFromDisk: Deserialize or I/O error 2 (23) #67

Closed dmaltsiniotis closed 2 years ago

dmaltsiniotis commented 2 years ago

Describe the bug There appears to be what I assume is a bug with the block rollback part of the pocketcoind process. Every so often, all nodes I manage crash simultaneously after an "RIDB rollback" event. The nodes are located in geographically separated areas on different servers, so the issue is not hardware related, it seems to be network/protocol related or a bug in pocketcoind.

The actual exception seems to be: ERROR: UndoReadFromDisk: Deserialize or I/O error 2 (23) - CAutoFile::read: end of file: iostream error

To Reproduce Steps to reproduce the behavior: Run a staking node for a period of 3-5 days, the error usually occurs with that frequency.

Expected behavior The Node should stay up and not auto-shutdown to due a crash.

Screenshots Logs from Node 01:

2021-10-19T16:23:57Z ERROR: Failed find reindexer transaction part (cc6da0f88ad233091dc2c2589ee1005625648cb12d3236f8891b4dc6f142a8c7)
2021-10-19T16:23:57Z ERROR: ConnectTip: ConnectBlock 91f3a37e5c1f2d650ecaf8c7349f24432efee80461c73ab28fc401ee00e4c56f failed, failed-find-rtransaction (code 96)
2021-10-19T16:23:57Z Available coins count 3 BestHeader: 1402233 dd3a47ab57af27324a54d975b6850967dfffe9d0c0f56dc769465958c6494ae3
2021-10-19T16:24:04Z RIDB rollback to block height 1402232 success!
2021-10-19T16:24:14Z 700004: Transaction RI data Not Found (c211abea871010c61a2e851464ccc396b0a280d3181bd570aa68ad9f1722a994)
2021-10-19T16:24:14Z ERROR: Failed find reindexer transaction part (c211abea871010c61a2e851464ccc396b0a280d3181bd570aa68ad9f1722a994)
2021-10-19T16:24:14Z ERROR: ConnectTip: ConnectBlock 06a410b6460c589e18f1ecfd46c07eac96fa494e9153630c4be69cc1d7e228fb failed, failed-find-rtransaction (code 96)
2021-10-19T16:24:15Z Available coins count 3 BestHeader: 1402233 b1f2a67cb8395aaddfbdc7fa9d96c0d690d78bd502b79e5c50bb256709f8d03d
2021-10-19T16:24:16Z socket recv error Connection reset by peer (104)
2021-10-19T16:24:20Z BlockUntilSyncedToCurrentChain: txindex is catching up on block notifications
2021-10-19T16:24:22Z RIDB rollback to block height 1402234 success!
2021-10-19T16:24:23Z ERROR: UndoReadFromDisk: Deserialize or I/O error 2 (23) - CAutoFile::read: end of file: iostream error
2021-10-19T16:24:23Z ERROR: DisconnectBlock(): failure reading undo data
2021-10-19T16:24:23Z ERROR: DisconnectTip(): DisconnectBlock 06a410b6460c589e18f1ecfd46c07eac96fa494e9153630c4be69cc1d7e228fb failed
2021-10-19T16:24:23Z *** Failed to disconnect block; see debug.log for details
2021-10-19T16:24:23Z Error: Error: A fatal internal error occurred, see debug.log for details
2021-10-19T16:24:23Z ERROR: ProcessNewBlock: ActivateBestChain failed (Failed to disconnect block; see debug.log for details (code 0))
2021-10-19T16:24:23Z tor: Thread interrupt
2021-10-19T16:24:23Z Shutdown: In progress...
2021-10-19T16:24:23Z torcontrol thread exit
2021-10-19T16:24:23Z addcon thread exit
2021-10-19T16:24:23Z opencon thread exit
2021-10-19T16:24:23Z net thread exit
2021-10-19T16:24:23Z msghand thread exit
2021-10-19T16:24:23Z Available coins count 3 BestHeader: 1402234 06a410b6460c589e18f1ecfd46c07eac96fa494e9153630c4be69cc1d7e228fb
2021-10-19T16:24:23Z ~HTTPRequest: Unhandled request
2021-10-19T16:24:23Z ~HTTPRequest: Unhandled request
2021-10-19T16:24:23Z ~HTTPRequest: Unhandled request
2021-10-19T16:24:23Z ~HTTPRequest: Unhandled request
2021-10-19T16:24:23Z ~HTTPRequest: Unhandled request
2021-10-19T16:24:23Z Pocketcoin Staker terminated
2021-10-19T16:24:23Z scheduler thread interrupt
2021-10-19T16:24:23Z [Staking Wallet] keypool return 1
2021-10-19T16:24:23Z Dumped mempool: 0.000159s to copy, 0.009723s to dump
2021-10-19T16:24:52Z Close reindexer DB
2021-10-19T16:24:52Z [Staking Wallet] Releasing wallet
2021-10-19T16:24:52Z Shutdown: done

Logs from Node 02:

2021-10-19T16:23:40Z BlockUntilSyncedToCurrentChain: txindex is catching up on block notifications
2021-10-19T16:23:40Z socket recv error Connection reset by peer (104)
2021-10-19T16:23:44Z Available coins count 0 BestHeader: 1402234 23dce4a1eebab67a38464bf846325e428075f4d1ca42f6639f27408b4e5aeaff
2021-10-19T16:23:48Z RIDB rollback to block height 1402233 success!
2021-10-19T16:23:48Z ERROR: UndoReadFromDisk: Deserialize or I/O error 2 (23) - CAutoFile::read: end of file: iostream error
2021-10-19T16:23:48Z ERROR: DisconnectBlock(): failure reading undo data
2021-10-19T16:23:48Z ERROR: DisconnectTip(): DisconnectBlock dd3a47ab57af27324a54d975b6850967dfffe9d0c0f56dc769465958c6494ae3 failed
2021-10-19T16:23:48Z *** Failed to disconnect block; see debug.log for details
2021-10-19T16:23:48Z Error: Error: A fatal internal error occurred, see debug.log for details
2021-10-19T16:23:48Z ERROR: ProcessNewBlock: ActivateBestChain failed (Failed to disconnect block; see debug.log for details (code 0))
2021-10-19T16:23:48Z ERROR: UndoReadFromDisk: Deserialize or I/O error 2 (23) - CAutoFile::read: end of file: iostream error
2021-10-19T16:23:48Z ERROR: DisconnectBlock(): failure reading undo data
2021-10-19T16:23:48Z ERROR: DisconnectTip(): DisconnectBlock dd3a47ab57af27324a54d975b6850967dfffe9d0c0f56dc769465958c6494ae3 failed
2021-10-19T16:23:48Z *** Failed to disconnect block; see debug.log for details
2021-10-19T16:23:48Z Error: Error: A fatal internal error occurred, see debug.log for details
2021-10-19T16:23:48Z ERROR: ProcessNewBlock: ActivateBestChain failed (Failed to disconnect block; see debug.log for details (code 0))
2021-10-19T16:23:48Z tor: Thread interrupt
2021-10-19T16:23:48Z Shutdown: In progress...
2021-10-19T16:23:48Z torcontrol thread exit
2021-10-19T16:23:48Z opencon thread exit
2021-10-19T16:23:48Z addcon thread exit
2021-10-19T16:23:48Z net thread exit
2021-10-19T16:23:49Z msghand thread exit
2021-10-19T16:23:49Z Pocketcoin Staker terminated
2021-10-19T16:23:49Z [default wallet] keypool return 1
2021-10-19T16:23:49Z scheduler thread interrupt
2021-10-19T16:23:49Z Dumped mempool: 8.7e-05s to copy, 0.004335s to dump
2021-10-19T16:24:14Z Close reindexer DB
2021-10-19T16:24:14Z [default wallet] Releasing wallet
2021-10-19T16:24:14Z Shutdown: done

Node 01:

Additional context Recovering from these events are very painful, it seems that many nodes in the network are all affected by this error simultaneously. My objective with this bug report is to bring awareness to the issue, and gather as much information regarding the problem as possible so it can be addressed.

walkjivefly commented 2 years ago

I had one of these a few days ago but didn't realise until Github notified me of this issue,

2021-10-15T18:24:43Z RIDB rollback to block height 1396634 success!
2021-10-15T18:24:43Z RIDB rollback to block height 1396633 success!
2021-10-15T18:24:43Z 700004: Transaction RI data Not Found (0635c3962e95e660854163837a26afc3e7e5677f47b3b4ddc7f4daca1a5417f8)
2021-10-15T18:24:43Z ERROR: Failed find reindexer transaction part (0635c3962e95e660854163837a26afc3e7e5677f47b3b4ddc7f4daca1a5417f8)
2021-10-15T18:24:43Z ERROR: ConnectTip: ConnectBlock efa10d2c114a072f9e8f2ccba3fe9a8dbd436619430cf275fe99ce0ba0eedd33 failed, failed-find-rtransaction (code 96)
2021-10-15T18:24:44Z 700004: Transaction RI data Not Found (0635c3962e95e660854163837a26afc3e7e5677f47b3b4ddc7f4daca1a5417f8)
2021-10-15T18:24:44Z ERROR: Failed find reindexer transaction part (0635c3962e95e660854163837a26afc3e7e5677f47b3b4ddc7f4daca1a5417f8)
2021-10-15T18:24:44Z ERROR: ConnectTip: ConnectBlock b15d65f324bf35d8c05c76a26a481a7d79a57d5350c1167fb37ed8c735f900f3 failed, failed-find-rtransaction (code 96)
2021-10-15T18:24:44Z Warning! Block generate (CheckInputs): 2fe7af072e6841e61ac2e28aff7b7aee304c00e3c68bca30cc991a73dcbe5820
2021-10-15T18:24:44Z Warning! Block generate (CheckInputs): 1aeb9543793410272cfe73a0da3f9e8627d5732d663892e3a598da9acef8e4a5
2021-10-15T18:24:44Z Warning! Block generate (CheckInputs): 58785d9a0f2466c7a1ab9451326b6186d89d993e8500a15dce34267979a9017e
2021-10-15T18:24:50Z 700004: Transaction RI data Not Found (106fe028500783535b9ad66d2beea61c93c83395e9ff9164860c57e6d811f832)
2021-10-15T18:24:50Z ERROR: Failed find reindexer transaction part (106fe028500783535b9ad66d2beea61c93c83395e9ff9164860c57e6d811f832)
2021-10-15T18:24:50Z ERROR: ConnectTip: ConnectBlock 5a1d4a7b8f21b27b9ea8f9a0cbfeb05c7ae4c1ef5d6d0899017c1b260c7b1c24 failed, failed-find-rtransaction (code 96)
2021-10-15T18:24:50Z Warning! Block generate (CheckInputs): 2fe7af072e6841e61ac2e28aff7b7aee304c00e3c68bca30cc991a73dcbe5820
2021-10-15T18:24:50Z Warning! Block generate (CheckInputs): 1aeb9543793410272cfe73a0da3f9e8627d5732d663892e3a598da9acef8e4a5
2021-10-15T18:24:50Z Warning! Block generate (CheckInputs): 58785d9a0f2466c7a1ab9451326b6186d89d993e8500a15dce34267979a9017e
2021-10-15T18:24:50Z Available coins count 2 BestHeader: 1396634 b15d65f324bf35d8c05c76a26a481a7d79a57d5350c1167fb37ed8c735f900f3
2021-10-15T18:24:50Z ERROR: UndoReadFromDisk: Deserialize or I/O error 2 (23) - ReadVarInt(): size too large: iostream error
2021-10-15T18:24:50Z ERROR: DisconnectBlock(): failure reading undo data
2021-10-15T18:24:50Z ERROR: DisconnectTip(): DisconnectBlock b15d65f324bf35d8c05c76a26a481a7d79a57d5350c1167fb37ed8c735f900f3 failed
2021-10-15T18:24:50Z *** Failed to disconnect block; see debug.log for details
2021-10-15T18:24:50Z Error: Error: A fatal internal error occurred, see debug.log for details
2021-10-15T18:24:50Z ERROR: ProcessNewBlock: ActivateBestChain failed (Failed to disconnect block; see debug.log for details (code 0))
2021-10-15T18:24:50Z tor: Thread interrupt
2021-10-15T18:24:50Z RemoveLocal(lu4l4cocmwelhble.onion:37070)
2021-10-15T18:24:50Z addcon thread exit
2021-10-15T18:24:50Z msghand thread exit
2021-10-15T18:24:50Z torcontrol thread exit
2021-10-15T18:24:50Z opencon thread exit
2021-10-15T18:24:50Z Shutdown: In progress...
2021-10-15T18:24:50Z net thread exit
2021-10-15T18:24:50Z Pocketcoin Staker terminated
2021-10-15T18:24:50Z scheduler thread interrupt
2021-10-15T18:24:50Z [default wallet] keypool return 1
2021-10-15T18:24:50Z Dumped mempool: 0.00051s to copy, 0.003707s to dump
2021-10-15T18:26:32Z Close reindexer DB
2021-10-15T18:26:32Z [default wallet] Releasing wallet
2021-10-15T18:26:32Z Shutdown: done

There's a VERY small chance it was associated with an attempted staking event (node collateral is too low). Recovery was indeed painful since upon restart

...
2021-10-19T22:31:01Z Cache configuration:
2021-10-19T22:31:01Z * Using 2.0MiB for block index database
2021-10-19T22:31:01Z * Using 56.0MiB for transaction index database
2021-10-19T22:31:01Z * Using 8.0MiB for chain state database
2021-10-19T22:31:01Z * Using 139.0MiB for in-memory UTXO set (plus up to 286.1MiB of unused mempool space)
2021-10-19T22:31:01Z init message: Loading block index...
2021-10-19T22:31:01Z Opening LevelDB in /home/masternode/.pocketcoin/blocks/index
2021-10-19T22:31:01Z Opened LevelDB successfully
2021-10-19T22:31:01Z Using obfuscation key for /home/masternode/.pocketcoin/blocks/index: 0000000000000000
2021-10-19T22:31:33Z LoadBlockIndexDB: last block file = 23
2021-10-19T22:31:33Z LoadBlockIndexDB: last block file info: CBlockFileInfo(blocks=1520, size=17594613, heights=1395259...1396636, time=2021-10-14...2021-10-15)
2021-10-19T22:31:33Z Checking all blk files are present...
2021-10-19T22:31:34Z Opening LevelDB in /home/masternode/.pocketcoin/chainstate
2021-10-19T22:31:34Z Opened LevelDB successfully
2021-10-19T22:31:34Z Using obfuscation key for /home/masternode/.pocketcoin/chainstate: b0b61894e8e44e42
2021-10-19T22:31:35Z Loaded best chain: hashBestChain=b15d65f324bf35d8c05c76a26a481a7d79a57d5350c1167fb37ed8c735f900f3 height=1396634 date=2021-10-15T18:20:32Z progress=0.989512
2021-10-19T22:31:35Z init message: Rewinding blocks...
2021-10-19T22:31:49Z init message: Verifying blocks...
2021-10-19T22:31:49Z Check Reindexer DB... success.
2021-10-19T22:32:42Z Verifying last 6 blocks at level 3
2021-10-19T22:32:42Z [0%]...ERROR: UndoReadFromDisk: Deserialize or I/O error 2 (23) - ReadVarInt(): size too large: iostream error
2021-10-19T22:32:42Z ERROR: VerifyDB(): *** found bad undo data at 1396634, hash=b15d65f324bf35d8c05c76a26a481a7d79a57d5350c1167fb37ed8c735f900f3

2021-10-19T22:32:42Z : Corrupted block database detected.
Please restart with -reindex or -reindex-chainstate to recover.
2021-10-19T22:32:42Z Aborted block database rebuild. Exiting.
2021-10-19T22:32:42Z Shutdown: In progress...
2021-10-19T22:32:42Z scheduler thread interrupt
2021-10-19T22:34:10Z Close reindexer DB
2021-10-19T22:34:10Z Shutdown: done

OS: Ubuntu 18.04 uname -a: 4.15.0-135-generic #139-Ubuntu SMP Mon Jan 18 17:38:24 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux vCPUs: 8 RAM: 32GB

dmaltsiniotis commented 2 years ago

It happened again today on Node 02 for no apparent reason:

2021-10-21T20:56:33Z Available coins count 0 BestHeader: 1405357 5cd5f28bf7c2bf817dd7301cc964d5cb48abb73c3b6a7264bdf62842cb3d9810
2021-10-21T20:56:49Z Available coins count 0 BestHeader: 1405357 5cd5f28bf7c2bf817dd7301cc964d5cb48abb73c3b6a7264bdf62842cb3d9810
2021-10-21T20:57:05Z Available coins count 0 BestHeader: 1405357 5cd5f28bf7c2bf817dd7301cc964d5cb48abb73c3b6a7264bdf62842cb3d9810
2021-10-21T20:57:36Z Available coins count 0 BestHeader: 1405358 2d80a32fa073bd47462ad6e4d7fe7928b846bb9f9dfe3813344fbc7ccf99b5fb
2021-10-21T20:57:37Z Available coins count 0 BestHeader: 1405358 2d80a32fa073bd47462ad6e4d7fe7928b846bb9f9dfe3813344fbc7ccf99b5fb
2021-10-21T20:57:40Z RIDB rollback to block height 1405357 success!
2021-10-21T20:57:40Z RIDB rollback to block height 1405356 success!
2021-10-21T20:57:40Z 700004: Transaction RI data Not Found (a4a122d4a8b41e64681cbba7df9867e29dde33430d841b2551e3e9df8ed07376)
2021-10-21T20:57:40Z ERROR: Failed find reindexer transaction part (a4a122d4a8b41e64681cbba7df9867e29dde33430d841b2551e3e9df8ed07376)
2021-10-21T20:57:40Z ERROR: ConnectTip: ConnectBlock 00cf5ffd3c6e3a141cbee755180b10e438c185678726631a38bd7b562a276a45 failed, failed-find-rtransaction (code 96)
2021-10-21T20:57:40Z 700004: Transaction RI data Not Found (a4a122d4a8b41e64681cbba7df9867e29dde33430d841b2551e3e9df8ed07376)
2021-10-21T20:57:40Z ERROR: Failed find reindexer transaction part (a4a122d4a8b41e64681cbba7df9867e29dde33430d841b2551e3e9df8ed07376)
2021-10-21T20:57:40Z ERROR: ConnectTip: ConnectBlock 5cd5f28bf7c2bf817dd7301cc964d5cb48abb73c3b6a7264bdf62842cb3d9810 failed, failed-find-rtransaction (code 96)
2021-10-21T20:57:50Z 700004: Transaction RI data Not Found (4ea6a6d643df0fd40ae011b35f1c11ccd8460a35bc96d49cc790a0824454b743)
2021-10-21T20:57:50Z ERROR: Failed find reindexer transaction part (4ea6a6d643df0fd40ae011b35f1c11ccd8460a35bc96d49cc790a0824454b743)
2021-10-21T20:57:50Z ERROR: ConnectTip: ConnectBlock 2d80a32fa073bd47462ad6e4d7fe7928b846bb9f9dfe3813344fbc7ccf99b5fb failed, failed-find-rtransaction (code 96)
2021-10-21T20:58:13Z socket recv error Connection reset by peer (104)
2021-10-21T20:58:22Z Available coins count 0 BestHeader: 1405358 2d80a32fa073bd47462ad6e4d7fe7928b846bb9f9dfe3813344fbc7ccf99b5fb
2021-10-21T20:58:22Z ERROR: UndoReadFromDisk: Deserialize or I/O error 2 (24) - CAutoFile::read: end of file: iostream error
2021-10-21T20:58:22Z ERROR: DisconnectBlock(): failure reading undo data
2021-10-21T20:58:22Z ERROR: DisconnectTip(): DisconnectBlock 2d80a32fa073bd47462ad6e4d7fe7928b846bb9f9dfe3813344fbc7ccf99b5fb failed
2021-10-21T20:58:22Z *** Failed to disconnect block; see debug.log for details
2021-10-21T20:58:22Z Error: Error: A fatal internal error occurred, see debug.log for details
2021-10-21T20:58:22Z ERROR: ProcessNewBlock: ActivateBestChain failed (Failed to disconnect block; see debug.log for details (code 0))
2021-10-21T20:58:22Z tor: Thread interrupt
2021-10-21T20:58:22Z torcontrol thread exit
2021-10-21T20:58:22Z Shutdown: In progress...
2021-10-21T20:58:22Z opencon thread exit
2021-10-21T20:58:22Z addcon thread exit
2021-10-21T20:58:22Z net thread exit
2021-10-21T20:58:23Z msghand thread exit
2021-10-21T20:58:23Z scheduler thread interrupt
2021-10-21T20:58:23Z Pocketcoin Staker terminated
2021-10-21T20:58:23Z [default wallet] keypool return 1
2021-10-21T20:58:23Z Dumped mempool: 8e-05s to copy, 0.002018s to dump
2021-10-21T20:58:47Z Close reindexer DB
2021-10-21T20:58:47Z [default wallet] Releasing wallet
2021-10-21T20:58:47Z Shutdown: done

When attempting to restart the node, the block data is corrupted:

2021-10-21T23:25:04Z Pocketnet Core version v0.19.16.0-1f9039e (release build)
2021-10-21T23:25:04Z InitParameterInteraction: parameter interaction: -whitelistforcerelay=1 -> setting -whitelistrelay=1
2021-10-21T23:25:04Z Assuming ancestors of block 54434b6405c35c83f37637f3f62ab89e87552ed7e6041e486840c3bb4aa2b2a1 have valid signatures.
2021-10-21T23:25:04Z Setting nMinimumChainWork=00000000000000000000000000000000000000000000002c32faaca6b2922c10
2021-10-21T23:25:04Z Using the 'sse4(1way),sse41(4way),avx2(8way)' SHA256 implementation
2021-10-21T23:25:04Z Using RdRand as an additional entropy source
2021-10-21T23:25:04Z Default data directory /home/darwin/.pocketcoin
2021-10-21T23:25:04Z Using data directory /home/darwin/pocketdata
2021-10-21T23:25:04Z Config file: /home/darwin/pocketcoin.conf
2021-10-21T23:25:04Z Using at most 125 automatic connections (1024 file descriptors available)
2021-10-21T23:25:04Z Using 16 MiB out of 32/2 requested for signature cache, able to store 524288 elements
2021-10-21T23:25:04Z Using 16 MiB out of 32/2 requested for script execution cache, able to store 524288 elements
2021-10-21T23:25:04Z Using 8 threads for script verification
2021-10-21T23:25:04Z scheduler thread start
2021-10-21T23:25:04Z HTTP: creating work queue of depth 1
2021-10-21T23:25:04Z HTTP: creating work queue of depth 100
2021-10-21T23:25:04Z No rpcpassword set - using random cookie authentication.
2021-10-21T23:25:04Z Generated RPC authentication cookie /home/darwin/pocketdata/.cookie
2021-10-21T23:25:04Z HTTP: starting 1 Main worker threads
2021-10-21T23:25:04Z HTTP: starting 8 Public worker threads
2021-10-21T23:25:04Z init message: Loading Reindexer DB...
2021-10-21T23:26:50Z Loaded Reindexer DB (/home/darwin/pocketdata/pocketdb)
2021-10-21T23:26:50Z Using wallet directory /home/darwin/pocketwallets
2021-10-21T23:26:50Z init message: Verifying wallet(s)...
2021-10-21T23:26:50Z Using BerkeleyDB version Berkeley DB 4.8.30: (April  9, 2010)
2021-10-21T23:26:50Z Using wallet wallet.dat
2021-10-21T23:26:50Z BerkeleyEnvironment::Open: LogDir=/home/darwin/pocketwallets/database ErrorFile=/home/darwin/pocketwallets/db.log
2021-10-21T23:26:50Z Cache configuration:
2021-10-21T23:26:50Z * Using 2.0MiB for block index database
2021-10-21T23:26:50Z * Using 8.0MiB for chain state database
2021-10-21T23:26:50Z * Using 160.0MiB for in-memory UTXO set (plus up to 286.1MiB of unused mempool space)
2021-10-21T23:26:50Z init message: Loading block index...
2021-10-21T23:26:50Z Opening LevelDB in /home/darwin/pocketdata/blocks/index
2021-10-21T23:26:50Z Opened LevelDB successfully
2021-10-21T23:26:50Z Using obfuscation key for /home/darwin/pocketdata/blocks/index: 0000000000000000
2021-10-21T23:27:00Z LoadBlockIndexDB: last block file = 24
2021-10-21T23:27:00Z LoadBlockIndexDB: last block file info: CBlockFileInfo(blocks=337, size=6246400, heights=1405055...1405359, time=2021-10-21...2021-10-21)
2021-10-21T23:27:00Z Checking all blk files are present...
2021-10-21T23:27:00Z Opening LevelDB in /home/darwin/pocketdata/chainstate
2021-10-21T23:27:00Z Opened LevelDB successfully
2021-10-21T23:27:00Z Using obfuscation key for /home/darwin/pocketdata/chainstate: b0b61894e8e44e42
2021-10-21T23:27:01Z Loaded best chain: hashBestChain=2d80a32fa073bd47462ad6e4d7fe7928b846bb9f9dfe3813344fbc7ccf99b5fb height=1405358 date=2021-10-21T20:56:48Z progress=0.999746
2021-10-21T23:27:01Z init message: Rewinding blocks...
2021-10-21T23:27:06Z init message: Verifying blocks...
2021-10-21T23:27:06Z Check Reindexer DB... success.
2021-10-21T23:27:15Z Verifying last 6 blocks at level 3
2021-10-21T23:27:15Z [0%]...ERROR: UndoReadFromDisk: Deserialize or I/O error 2 (24) - CAutoFile::read: end of file: iostream error
2021-10-21T23:27:15Z ERROR: VerifyDB(): *** found bad undo data at 1405358, hash=2d80a32fa073bd47462ad6e4d7fe7928b846bb9f9dfe3813344fbc7ccf99b5fb

2021-10-21T23:27:15Z : Corrupted block database detected.
Please restart with -reindex or -reindex-chainstate to recover.
2021-10-21T23:27:15Z Aborted block database rebuild. Exiting.
2021-10-21T23:27:15Z Shutdown: In progress...
2021-10-21T23:27:15Z scheduler thread interrupt
2021-10-21T23:27:43Z Close reindexer DB
2021-10-21T23:27:43Z Shutdown: done

So I guess it's restore from checkpoint time again for Node 02...

tawmaz commented 2 years ago

I am investigating this issue and hope to have a fix ready soon. The bug appears to happen during serialization of transaction data to disk, and the crash will not be hit until the node attempts to switch to a longer chain and load the block transaction undo data from disk. We are currently testing this branch on testnet which ports over the transaction serialization code from bitcoin: https://github.com/pocketnetteam/pocketnet.core/commits/fix/serialization This may fix the issue however we will need to test on mainnet for ~2 weeks to be able to confirm.

dmaltsiniotis commented 2 years ago

I am investigating this issue and hope to have a fix ready soon. The bug appears to happen during serialization of transaction data to disk, and the crash will not be hit until the node attempts to switch to a longer chain and load the block transaction undo data from disk. We are currently testing this branch on testnet which ports over the transaction serialization code from bitcoin: https://github.com/pocketnetteam/pocketnet.core/commits/fix/serialization This may fix the issue however we will need to test on mainnet for ~2 weeks to be able to confirm.

This is great to hear, thank you.

walkjivefly commented 2 years ago

I haven't managed to resync my node yet. reindex-chainstate got about 12/14 done and lost connection with/banned all of it's peers. I'm trying again with a reindex but in light of today's announcement I'm wondering if it's better to simply wait for the new release on Sunday and try a clean sync from genesis?

andyoknen commented 2 years ago

We prepare the database checkpoint after full synchronization on the new code. It will take a little time and therefore we will postpone the release until the evening. We want to make sure that this does not cause new problems, so we continue to test on development nodes.

walkjivefly commented 2 years ago

Had another occurrence about 30 hours after recovering from the snapshot. Really looking forward to that new release!

2021-10-24T21:27:13Z RIDB rollback to block height 1409688 success!
2021-10-24T21:27:14Z 700004: Transaction RI data Not Found (0023e30efdcf1e2e46b1acbb4658abf33389fa6d8ad53a94399bd5fdbf202038)
2021-10-24T21:27:14Z ERROR: Failed find reindexer transaction part (0023e30efdcf1e2e46b1acbb4658abf33389fa6d8ad53a94399bd5fdbf202038)
2021-10-24T21:27:14Z ERROR: ConnectTip: ConnectBlock 0ebe3ac92b1ef3694605146242b39a3d0c8078153c60cacf4fe48f40a8978975 failed, failed-find-rtransaction (code 96)
2021-10-24T21:27:14Z ERROR: UndoReadFromDisk: Deserialize or I/O error 2 (24) - CAutoFile::read: end of file: iostream error
2021-10-24T21:27:14Z ERROR: DisconnectBlock(): failure reading undo data
2021-10-24T21:27:14Z ERROR: DisconnectTip(): DisconnectBlock 0cb54cb731e89ea66032720b3210afb0a8383fba86aa80f58d58cdc26dffedf9 failed
2021-10-24T21:27:14Z *** Failed to disconnect block; see debug.log for details
2021-10-24T21:27:14Z Error: Error: A fatal internal error occurred, see debug.log for details
2021-10-24T21:27:14Z ERROR: ProcessNewBlock: ActivateBestChain failed (Failed to disconnect block; see debug.log for details (code 0))
2021-10-24T21:27:14Z tor: Thread interrupt
2021-10-24T21:27:14Z RemoveLocal(lu4l4cocmwelhble.onion:37070)
2021-10-24T21:27:14Z torcontrol thread exit
2021-10-24T21:27:14Z opencon thread exit
2021-10-24T21:27:14Z addcon thread exit
2021-10-24T21:27:14Z Shutdown: In progress...
2021-10-24T21:27:14Z net thread exit
2021-10-24T21:27:17Z msghand thread exit
2021-10-24T21:27:17Z Pocketcoin Staker terminated
2021-10-24T21:27:17Z scheduler thread interrupt
2021-10-24T21:27:17Z [default wallet] keypool return 1
2021-10-24T21:27:17Z Dumped mempool: 0.000252s to copy, 0.014344s to dump
2021-10-24T21:28:59Z Close reindexer DB
2021-10-24T21:28:59Z [default wallet] Releasing wallet
2021-10-24T21:28:59Z Shutdown: done
andyoknen commented 2 years ago

Fixed in v0.19.17