Closed walkjivefly closed 2 years ago
@walkjivefly,
I also had this exact same error today (about 6 hours ago), and not only that but at the exact same time and block!
2021-10-29T10:20:02Z Available coins count 6 BestHeader: 1416182 c16ca39fbe6f0c67271cd2e253206fe2fbbdd011c68c86e5fac46b0c22b599e9
2021-10-29T10:20:17Z Available coins count 6 BestHeader: 1416182 c16ca39fbe6f0c67271cd2e253206fe2fbbdd011c68c86e5fac46b0c22b599e9
2021-10-29T10:20:45Z +++ Block connected to chain: 1416183 BH: 2952546ac4399c66a654b5795e0df91be638cbb1d1a39dace6a3b6eb6e7bbdc5 CC: 0 (0)
2021-10-29T10:20:45Z Available coins count 6 BestHeader: 1416183 2952546ac4399c66a654b5795e0df91be638cbb1d1a39dace6a3b6eb6e7bbdc5
2021-10-29T10:20:45Z WARNING! Receive transaction, antibot check: 1 fb3cba13c3d67c480c2e36c885e410770baeafab10e666351a4c8bd4b7775ef0
2021-10-29T10:20:45Z WARNING! Receive transaction, antibot check: 1 d129afe9008d56e5f59e1d44ee7ecbccf2417c1b9b49d49917d9def68dd87c78
2021-10-29T10:20:45Z WARNING! Receive transaction, antibot check: 1 d444b9b1d8960927b58ca91e5abac8ea89a047cda8b168ec63958bd8adc99457
2021-10-29T10:20:45Z WARNING! Receive transaction, antibot check: 1 2b701ab8f513c6c79b8eb23eaf4580ff20823ce2d095bcd05b0087f15b8233c8
2021-10-29T10:20:45Z WARNING! Receive transaction, antibot check: 1 607c3e55dee08e5fd4b3048f8488b8732c9313bbcbb2cdfe60e20780a3b5ae17
2021-10-29T10:20:49Z Available coins count 6 BestHeader: 1416183 2952546ac4399c66a654b5795e0df91be638cbb1d1a39dace6a3b6eb6e7bbdc5
2021-10-29T10:21:05Z Available coins count 6 BestHeader: 1416183 2952546ac4399c66a654b5795e0df91be638cbb1d1a39dace6a3b6eb6e7bbdc5
2021-10-29T10:21:12Z RIDB rollback to block height 1416182 success!
2021-10-29T10:21:43Z +++ Block connected to chain: 1416183 BH: 23d65f9a8ecec9dba9cecfd4f4408c02d9e73b0aed2e95c7b319a2b01710da16 CC: 0 (0)
2021-10-29T10:21:43Z +++ Block connected to chain: 1416184 BH: 69619826941fb15f7bf381b1114dd6f4d14e4849b6b0b14773309968ce2b1a16 CC: 0 (0)
2021-10-29T10:21:46Z Available coins count 6 BestHeader: 1416184 69619826941fb15f7bf381b1114dd6f4d14e4849b6b0b14773309968ce2b1a16
2021-10-29T10:21:47Z RIDB rollback to block height 1416183 success!
2021-10-29T10:21:57Z RIDB rollback to block height 1416182 success!
2021-10-29T10:21:57Z 700004: Transaction RI data Not Found (9770aecfa203504b2d6fc81dc18f093ee5742e96cd01c5f8695c0365bb34f038)
2021-10-29T10:21:57Z ERROR: Failed find reindexer transaction part (9770aecfa203504b2d6fc81dc18f093ee5742e96cd01c5f8695c0365bb34f038)
2021-10-29T10:21:57Z ERROR: ConnectTip: ConnectBlock 2952546ac4399c66a654b5795e0df91be638cbb1d1a39dace6a3b6eb6e7bbdc5 failed, failed-find-rtransaction (code 96)
2021-10-29T10:22:29Z +++ Block connected to chain: 1416183 BH: 23d65f9a8ecec9dba9cecfd4f4408c02d9e73b0aed2e95c7b319a2b01710da16 CC: 0 (0)
2021-10-29T10:22:29Z 700004: Transaction RI data Not Found (66c56281d3eaa1330005a2b44105705d3380017fce3862dec39850bd628e2d90)
2021-10-29T10:22:29Z ERROR: Failed find reindexer transaction part (66c56281d3eaa1330005a2b44105705d3380017fce3862dec39850bd628e2d90)
2021-10-29T10:22:29Z ERROR: ConnectTip: ConnectBlock 69619826941fb15f7bf381b1114dd6f4d14e4849b6b0b14773309968ce2b1a16 failed, failed-find-rtransaction (code 96)
2021-10-29T10:22:29Z Available coins count 6 BestHeader: 1416183 23d65f9a8ecec9dba9cecfd4f4408c02d9e73b0aed2e95c7b319a2b01710da16
2021-10-29T10:22:29Z RIDB rollback to block height 1416182 success!
2021-10-29T10:22:54Z +++ Block connected to chain: 1416183 BH: 2952546ac4399c66a654b5795e0df91be638cbb1d1a39dace6a3b6eb6e7bbdc5 CC: 0 (0)
2021-10-29T10:23:05Z +++ Block connected to chain: 1416184 BH: 1e15d989cb170419a2eb91581ab146d4c0b0195386b6f15abfcc9adf4fc93dd6 CC: 0 (0)
2021-10-29T10:23:06Z Available coins count 6 BestHeader: 1416184 1e15d989cb170419a2eb91581ab146d4c0b0195386b6f15abfcc9adf4fc93dd6
2021-10-29T10:23:07Z WARNING! Receive transaction, antibot check: 28 aee2d88ba1126608dedf31f449cde888fa3836f62400c0ce6a4fef4115b38c7f
2021-10-29T10:23:12Z RIDB rollback to block height 1416183 success!
2021-10-29T10:23:12Z ERROR: UndoReadFromDisk: Deserialize or I/O error 2 (24) - ReadVarInt(): size too large: iostream error
2021-10-29T10:23:12Z ERROR: DisconnectBlock(): failure reading undo data
2021-10-29T10:23:12Z ERROR: DisconnectTip(): DisconnectBlock 2952546ac4399c66a654b5795e0df91be638cbb1d1a39dace6a3b6eb6e7bbdc5 failed
2021-10-29T10:23:12Z *** Failed to disconnect block; see debug.log for details
2021-10-29T10:23:12Z Error: Error: A fatal internal error occurred, see debug.log for details
2021-10-29T10:23:12Z ERROR: ProcessNewBlock: ActivateBestChain failed (Failed to disconnect block; see debug.log for details (code 0))
2021-10-29T10:23:12Z tor: Thread interrupt
2021-10-29T10:23:12Z Shutdown: In progress...
2021-10-29T10:23:12Z torcontrol thread exit
2021-10-29T10:23:12Z opencon thread exit
2021-10-29T10:23:12Z addcon thread exit
2021-10-29T10:23:13Z net thread exit
2021-10-29T10:23:14Z Available coins count 6 BestHeader: 1416183 2952546ac4399c66a654b5795e0df91be638cbb1d1a39dace6a3b6eb6e7bbdc5
2021-10-29T10:23:14Z msghand thread exit
2021-10-29T10:23:14Z scheduler thread interrupt
2021-10-29T10:23:14Z Pocketcoin Staker terminated
2021-10-29T10:23:14Z [Staking Wallet] keypool return 1
2021-10-29T10:23:45Z Close reindexer DB
2021-10-29T10:23:45Z [Staking Wallet] Releasing wallet
2021-10-29T10:23:45Z Shutdown: done
Oh joy! Restart from snapshot time - again.
2021-10-29T16:24:39Z [0%]...ERROR: UndoReadFromDisk: Deserialize or I/O error 2 (24) - ReadVarInt(): size too large: iostream error
2021-10-29T16:24:39Z ERROR: VerifyDB(): *** found bad undo data at 1416183, hash=2952546ac4399c66a654b5795e0df91be638cbb1d1a39dace6a3b6eb6e7bbdc5
2021-10-29T16:24:39Z : Corrupted block database detected.
Please restart with -reindex or -reindex-chainstate to recover.
2021-10-29T16:24:39Z Aborted block database rebuild. Exiting.
2021-10-29T16:24:39Z Shutdown: In progress...
2021-10-29T16:24:39Z scheduler thread interrupt
2021-10-29T16:26:26Z Close reindexer DB
2021-10-29T16:26:26Z Shutdown: done
This is the same signature as the DisconnectTip/BlockUndo issue we have seen in previous versions. In 0.19.17 we ported over the recent serialization code from Bitcoin core which replaced some suspect serialization code used in previous versions. Since the issue is still occurring the problem must lie elsewhere in the code.
I am writing a check in UndoWriteToDisk to verify the CBlockUndo data can be deserialized correctly immediately after it is written to disk, and throw a trap/int 3 if they do not match so we can break in with gdb. The original CBlockUndo data will then be present on the stack for comparison. Since the check is on the write path we can verify the block immediately when it is written rather than wait for the node to change tips. Reproduction time should be much less.
@tawmaz I can run a node in gdb with these developments, if necessary.
I ran a node over the weekend with GDB attached but did not hit the UndoReadFromDisk error. I think it may be necessary for the node to be actively staking to reproduce this, so I will send some coins to it and let it continue. I will plan to post a debug build this evening with the code to catch the UndoWriteToDiskError.
@tawmaz I think that may be an incorrect hypothesis. My node has a balance of less than 12 PKOIN - still hoping/waiting for the 70 PKOIN perseverance reward. It hasn't been staking for weeks because with an expected time to win in excess of 20 million there's really no point in me remembering to unlock the wallet on each restart. It absolutely was not staking when it hit the problem which prompted me to open this issue.
@walkjivefly Thanks for the datapoint regarding staking
Here is the debug version with validation of CBlockUndo: https://github.com/tawmaz/pocketnet.core/releases/tag/v0.19.17blockundodbg It will raise SIGTRAP if it encounters an issue so gdb will need to be attached.
I've hit this several times now. I'm currently running 19.19, would it be helpful if I ran the debug version? I might need help with gdb since I'm new to linux. I am more than happy to do/learn whatever I can for the community
Here are the logs to show what I've encountered.
2021-11-06T08:50:32Z Warning! Block generate (CheckInputs): 59b18a5a2099cebc6f9eda8d593696fab90a975741c8c6b2d0dfb3ad91d3f390 2021-11-06T08:50:33Z RIDB rollback to block height 1427528 success! 2021-11-06T08:50:33Z RIDB rollback to block height 1427527 success! 2021-11-06T08:50:33Z 700004: Transaction RI data Not Found (1ce5b2bb23b548293741f3f63946ed3b7a7a9c9e2f8b2b21c774c8f8575ea3be) 2021-11-06T08:50:33Z ERROR: Failed find reindexer transaction part (1ce5b2bb23b548293741f3f63946ed3b7a7a9c9e2f8b2b21c774c8f8575ea3be) 2021-11-06T08:50:33Z ERROR: ConnectTip: ConnectBlock b573fc243549e24d10a5f810eaaa33235a6518ee0be348544467656ba20e62e0 failed, failed-find-rtransaction (code 96) 2021-11-06T08:50:33Z 700004: Transaction RI data Not Found (1ce5b2bb23b548293741f3f63946ed3b7a7a9c9e2f8b2b21c774c8f8575ea3be) 2021-11-06T08:50:33Z ERROR: Failed find reindexer transaction part (1ce5b2bb23b548293741f3f63946ed3b7a7a9c9e2f8b2b21c774c8f8575ea3be) 2021-11-06T08:50:33Z ERROR: ConnectTip: ConnectBlock 5e13ec73d550758a7183f295bbefd8e67104487bf7a054bdb7521bde96c65ce2 failed, failed-find-rtransaction (code 96) 2021-11-06T08:50:33Z Warning! Block generate (CheckInputs): 572f483d6052cba4648c9e4d6b81855822b5f8b09fe6b036580f11eb04cb8024 2021-11-06T08:50:33Z Warning! Block generate (CheckInputs): 59b18a5a2099cebc6f9eda8d593696fab90a975741c8c6b2d0dfb3ad91d3f390 2021-11-06T08:50:33Z Warning! Block generate (CheckInputs): 3c4d0196529cc046d3d0c7d193f4b87c3393f9503572b2c9c2f4f8f5dbdbb9ea 2021-11-06T08:50:33Z Warning! Block generate (CheckInputs): 93250f2c4bcc8747fba7d811716f10c3c992665fed48cc7d35c5ccf04da882be 2021-11-06T08:50:33Z Warning! Block generate (CheckInputs): ad69bdcb09d6e6ac4467750187a2ea26ac07c2c30fdce8badb655a325c2b2cd9 2021-11-06T08:50:35Z Warning! Block generate (CheckInputs): 572f483d6052cba4648c9e4d6b81855822b5f8b09fe6b036580f11eb04cb8024 2021-11-06T08:50:35Z Warning! Block generate (CheckInputs): 59b18a5a2099cebc6f9eda8d593696fab90a975741c8c6b2d0dfb3ad91d3f390 2021-11-06T08:50:35Z Warning! Block generate (CheckInputs): 3c4d0196529cc046d3d0c7d193f4b87c3393f9503572b2c9c2f4f8f5dbdbb9ea 2021-11-06T08:50:35Z Warning! Block generate (CheckInputs): 93250f2c4bcc8747fba7d811716f10c3c992665fed48cc7d35c5ccf04da882be 2021-11-06T08:50:35Z Warning! Block generate (CheckInputs): ad69bdcb09d6e6ac4467750187a2ea26ac07c2c30fdce8badb655a325c2b2cd9 2021-11-06T08:50:35Z Warning! Block generate (CheckInputs): 572f483d6052cba4648c9e4d6b81855822b5f8b09fe6b036580f11eb04cb8024 2021-11-06T08:50:35Z Warning! Block generate (CheckInputs): 59b18a5a2099cebc6f9eda8d593696fab90a975741c8c6b2d0dfb3ad91d3f390 2021-11-06T08:50:35Z Warning! Block generate (CheckInputs): 3c4d0196529cc046d3d0c7d193f4b87c3393f9503572b2c9c2f4f8f5dbdbb9ea 2021-11-06T08:50:35Z Warning! Block generate (CheckInputs): 93250f2c4bcc8747fba7d811716f10c3c992665fed48cc7d35c5ccf04da882be 2021-11-06T08:50:35Z Warning! Block generate (CheckInputs): ad69bdcb09d6e6ac4467750187a2ea26ac07c2c30fdce8badb655a325c2b2cd9 2021-11-06T08:50:43Z Warning! Block generate (CheckInputs): 59b18a5a2099cebc6f9eda8d593696fab90a975741c8c6b2d0dfb3ad91d3f390 2021-11-06T08:50:43Z Available coins count 16 BestHeader: 1427529 5e13ec73d550758a7183f295bbefd8e67104487bf7a054bdb7521bde96c65ce2 2021-11-06T08:50:43Z ERROR: UndoReadFromDisk: Deserialize or I/O error 2 (25) - CAutoFile::read: end of file: iostream error 2021-11-06T08:50:43Z ERROR: DisconnectBlock(): failure reading undo data 2021-11-06T08:50:43Z ERROR: DisconnectTip(): DisconnectBlock 5e13ec73d550758a7183f295bbefd8e67104487bf7a054bdb7521bde96c65ce2 failed 2021-11-06T08:50:43Z Failed to disconnect block; see debug.log for details 2021-11-06T08:50:43Z Error: Error: A fatal internal error occurred, see debug.log for details 2021-11-06T08:50:43Z ERROR: ProcessNewBlock: ActivateBestChain failed (Failed to disconnect block; see debug.log for details (code 0)) 2021-11-06T08:50:43Z ERROR: UndoReadFromDisk: Deserialize or I/O error 2 (25) - CAutoFile::read: end of file: iostream error 2021-11-06T08:50:43Z ERROR: DisconnectBlock(): failure reading undo data 2021-11-06T08:50:43Z ERROR: DisconnectTip(): DisconnectBlock 5e13ec73d550758a7183f295bbefd8e67104487bf7a054bdb7521bde96c65ce2 failed 2021-11-06T08:50:43Z Failed to disconnect block; see debug.log for details 2021-11-06T08:50:43Z Error: Error: A fatal internal error occurred, see debug.log for details 2021-11-06T08:50:43Z ERROR: ProcessNewBlock: ActivateBestChain failed (Failed to disconnect block; see debug.log for details (code 0)) 2021-11-06T08:50:43Z tor: Thread interrupt
I attempted to catch this issue on serialization of the BlockUndo data last week, but instead hit the Reindexer segfault described in issue 80. I will try creating binary which verifies unserialize during serialize and running it to see if we can catch the issue when the Block object is still available. I also saved the blocks from a node which crashed this morning which I will investigate to see if I find anything new.
In both cases I have observed of this issue the error occurs on decompression and deserialization of the script code. For the incident that occurred this morning at block 1427557, the nSize value here was deserialized with a value of 69095430 (69MB)! This causes CHashVerifier::ignore() to read past the end of the file. Previously I attempted to run this patch to attempt to catch to problem on undo file write, but it was not able to catch this error over four days. Will revisit the patch and try again.
Relevant Locals: pindex->nHeight=1427557 pindex->nFile=25 nSize = 69095430
Call stack:
CHashVerifier<CAutoFile>::ignore(CHashVerifier<CAutoFile> * const this, size_t nSize) (/src/pocketnet.core.reindexer/src/hash.h:171)
ScriptCompression::Unser<CHashVerifier<CAutoFile> >(ScriptCompression * const this, CHashVerifier<CAutoFile> & s, CScript & script) (/src/pocketnet.core.reindexer/src/compressor.h:91)
Wrapper<ScriptCompression, CScript&>::Unserialize<CHashVerifier<CAutoFile> >(Wrapper<ScriptCompression, CScript&> * const this, CHashVerifier<CAutoFile> & s) (src/pocketnet.core.reindexer/src/serialize.h:454)
Unserialize<CHashVerifier<CAutoFile>, Wrapper<ScriptCompression, CScript&>&>(CHashVerifier<CAutoFile> & is, Wrapper<ScriptCompression, CScript&> & a) (src/pocketnet.core.reindexer/src/serialize.h:704)
UnserializeMany<CHashVerifier<CAutoFile>, Wrapper<ScriptCompression, CScript&>&>(CHashVerifier<CAutoFile> & s, Wrapper<ScriptCompression, CScript&> & arg) (src/pocketnet.core.reindexer/src/serialize.h:1054)
UnserializeMany<CHashVerifier<CAutoFile>, Wrapper<AmountCompression, long&>&, Wrapper<ScriptCompression, CScript&>&>(CHashVerifier<CAutoFile> & s, Wrapper<AmountCompression, long&> & arg) (/src/pocketnet.core.reindexer/src/serialize.h:1055)
SerReadWriteMany<CHashVerifier<CAutoFile>, Wrapper<AmountCompression, long&>, Wrapper<ScriptCompression, CScript&> >(CHashVerifier<CAutoFile> & s, CSerActionUnserialize ser_action) (/src/pocketnet.core.reindexer/src/serialize.h:1067)
TxOutCompression::SerializationOps<CHashVerifier<CAutoFile>, CTxOut, CSerActionUnserialize>(CTxOut & obj, CHashVerifier<CAutoFile> & s, CSerActionUnserialize ser_action) (/src/pocketnet.core.reindexer/src/compressor.h:116)
TxOutCompression::Unser<CHashVerifier<CAutoFile> >(CHashVerifier<CAutoFile> & s, CTxOut & obj) (/src/pocketnet.core.reindexer/src/compressor.h:116)
Wrapper<TxOutCompression, CTxOut&>::Unserialize<CHashVerifier<CAutoFile> >(Wrapper<TxOutCompression, CTxOut&> * const this, CHashVerifier<CAutoFile> & s) (/src/pocketnet.core.reindexer/src/serialize.h:454)
Unserialize<CHashVerifier<CAutoFile>, Wrapper<TxOutCompression, CTxOut&> >(CHashVerifier<CAutoFile> & is, Wrapper<TxOutCompression, CTxOut&> && a) (/src/pocketnet.core.reindexer/src/serialize.h:704)
TxInUndoFormatter::Unser<CHashVerifier<CAutoFile> >(TxInUndoFormatter * const this, CHashVerifier<CAutoFile> & s, Coin & txout) (/src/pocketnet.core.reindexer/src/undo.h:47)
VectorFormatter<TxInUndoFormatter>::Unser<CHashVerifier<CAutoFile>, std::vector<Coin, std::allocator<Coin> > >(VectorFormatter<TxInUndoFormatter> * const this, CHashVerifier<CAutoFile> & s, std::vector<Coin, std::allocator<Coin> > & v) (/src/pocketnet.core.reindexer/src/serialize.h:621)
Wrapper<VectorFormatter<TxInUndoFormatter>, std::vector<Coin, std::allocator<Coin> >&>::Unserialize<CHashVerifier<CAutoFile> >(Wrapper<VectorFormatter<TxInUndoFormatter>, std::vector<Coin, std::allocator<Coin> >&> * const this, CHashVerifier<CAutoFile> & s) (/src/pocketnet.core.reindexer/src/serialize.h:454)
Unserialize<CHashVerifier<CAutoFile>, Wrapper<VectorFormatter<TxInUndoFormatter>, std::vector<Coin, std::allocator<Coin> >&>&>(CHashVerifier<CAutoFile> & is, Wrapper<VectorFormatter<TxInUndoFormatter>, std::vector<Coin, std::allocator<Coin> >&> & a) (/src/pocketnet.core.reindexer/src/serialize.h:704)
UnserializeMany<CHashVerifier<CAutoFile>, Wrapper<VectorFormatter<TxInUndoFormatter>, std::vector<Coin, std::allocator<Coin> >&>&>(CHashVerifier<CAutoFile> & s, Wrapper<VectorFormatter<TxInUndoFormatter>, std::vector<Coin, std::allocator<Coin> >&> & arg) (/src/pocketnet.core.reindexer/src/serialize.h:1054)
SerReadWriteMany<CHashVerifier<CAutoFile>, Wrapper<VectorFormatter<TxInUndoFormatter>, std::vector<Coin, std::allocator<Coin> >&> >(CHashVerifier<CAutoFile> & s, CSerActionUnserialize ser_action) (/src/pocketnet.core.reindexer/src/serialize.h:1067)
CTxUndo::SerializationOps<CHashVerifier<CAutoFile>, CTxUndo, CSerActionUnserialize>(CTxUndo & obj, CHashVerifier<CAutoFile> & s, CSerActionUnserialize ser_action) (/src/pocketnet.core.reindexer/src/undo.h:61)
CTxUndo::Unser<CHashVerifier<CAutoFile> >(CHashVerifier<CAutoFile> & s, CTxUndo & obj) (/src/pocketnet.core.reindexer/src/undo.h:61)
CTxUndo::Unserialize<CHashVerifier<CAutoFile> >(CTxUndo * const this, CHashVerifier<CAutoFile> & s) (/src/pocketnet.core.reindexer/src/undo.h:61)
The synchronization issue has been fixed in version 0.20.*, which is in pre-release state and is being prepared for release.
Problem v0.19.17 daemon shut itself down with the following candidate errors in debug.log:
To Reproduce Unknown
Expected behavior daemon should run without uncommanded shutdowns or fatal errors.
Desktop (please complete the following information):
Additional context There was an RIDB error but it didn't claim to be fatal and didn't cause a crash. A few seconds later there was a successful RIDB rollback immediately followed by a fatal DisconnectTip() error, and then another fatal DisconnectTip() for the same block. A few lines later the node started an orderly shutdown.
I was running with only net,rpc,reindex logging enabled so there may have been something other than these candidate events which caused the shutdown.
Full log is available if required.