pocketnetteam / pocketnet.core

Decentralized social network based on the blockchain
https://pocketnet.app
Apache License 2.0

EXCEPTION: St13runtime_error after upgrading 22.4 to 22.5 #723

Closed americanpatriotdave closed 2 months ago

americanpatriotdave commented 2 months ago

Describe the bug: Pocketnet Core on Ubuntu 22.04 LTS, headless. The only tweak to the OS is setting ulimit to 65535.

Had a good solid node running 0.22.4 for several weeks without issues at all. Node is staked.

Tested 22.5 on a peer node that does not have RPC opened through the firewall; validator only. Worked for several days and still is working.

Did the upgrade on the other node after:

apt -y dist-upgrade ; reboot

pocketcoind -daemon

All seemed fine, then about 5 hours later: "EXCEPTION: St13runtime_error Transaction::List reconstruct failed - no return data for eea0356da7be5......2a6a5 tx"

Restarted OS and restarted daemon without issues.

To Reproduce: Unknown

Expected behavior: Node to run as well as 22.4

Additional context: One other node operator experienced the exact same error, but the "tx" id is different. The "tx" is not a known transaction and gives an "Error 500" on the block explorer written by Demitri. The "Bastyon" block explorer just spins and spins when trying to search for the "offending" transaction ID.

tintin-1929 commented 2 months ago

I've encountered this same issue within the first 24 hours of upgrading from 22.4 --> 22.5

This is a bad bug: the node doesn't recover gracefully, and user intervention is required to restart it.

From my log:


EXCEPTION: St13runtime_error
Transaction::List reconstruct failed - no return data for eea0356da7be5bbcc69e10839df04f131d291a78afe786f4bf184d0b7d42a6a5 tx
pocketcoin in msghand
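
To see how often this exception occurs and which transaction IDs it names, a log can be scanned with a few lines of Python. This is only a minimal sketch: it assumes the message text appears exactly as quoted in this thread, and the helper name `find_failed_txids` and the sample string are hypothetical, not part of Pocketnet Core.

```python
import re
from collections import Counter

# Message text as quoted in this thread; the 64-hex-digit txid is captured.
RECONSTRUCT_FAILED = re.compile(
    r"Transaction::List reconstruct failed - no return data for ([0-9a-fA-F]{64}) tx"
)


def find_failed_txids(log_text: str) -> Counter:
    """Count how often each txid appears in 'reconstruct failed' messages."""
    return Counter(RECONSTRUCT_FAILED.findall(log_text))


if __name__ == "__main__":
    # Sample built from the log excerpt above; a real run would read debug.log.
    sample = (
        "EXCEPTION: St13runtime_error\n"
        "Transaction::List reconstruct failed - no return data for "
        "eea0356da7be5bbcc69e10839df04f131d291a78afe786f4bf184d0b7d42a6a5 tx\n"
        "pocketcoin in msghand\n"
    )
    for txid, count in find_failed_txids(sample).items():
        print(txid, count)
```

Each txid found this way can then be passed to pocketcoin-cli gettransaction to check whether the wallet knows about it.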

tintin-1929 commented 2 months ago

I've hit this defect twice since my initial report two days ago. This is looking like a serious issue to me; I have to manually restart after each crash.

andyoknen commented 2 months ago

Hi! I think this may be related to rejected wallet stake transactions; so far this is just a guess. Since the specified TX is not on the network, could you try to find it on your nodes through the CLI?

pocketcoin-cli gettransaction

andyoknen commented 2 months ago

Also try restarting the node with the -rescan argument. This should remove unnecessary transactions from the wallet, and if that's the problem, then the errors will go away.

the-real-vortex-v commented 2 months ago

Just a suggestion: if this does fix the error, then perhaps the next release could add a fix that sets a "dirty transaction flag", which causes a rescan on the next restart of the node, and that also mentions this in the error message. For example: "EXCEPTION: St13runtime_error Transaction::List reconstruct failed - no return data for eea0356da7be5bbcc69e10839df04f131d291a78afe786f4bf184d0b7d42a6a5 tx - This may be due to a rejected stake transaction. Setting TransactionsDirtyRescan=1. Transactions will be scanned next time pocketcoind is restarted." or something similar.

You could have a special file called "$config_on_restart.conf" that gets added to the init files via the "includeconf" command and is checked on startup. That would be a slightly cleaner way to handle special cases like this.

tintin-1929 commented 2 months ago

Thank you for the comment, I tried this but I'm not sure if I'm doing it correctly. Let me know if I'm not doing something right.

pocketcoin-cli gettransaction eea0356da7be5bbcc69e10839df04f131d291a78afe786f4bf184d0b7d42a6a5
error code: -5
error message: Invalid or non-wallet transaction id

I'm going to restart the node with the -rescan flag now. I'll keep you posted.

tintin-1929 commented 2 months ago

I restarted with the -rescan flag and watched the logs while it worked. I got a ton of "conflicts with wallet transaction" messages in my log where one transaction apparently conflicted with another. Is this normal? After it finished, it looks like the node finished starting up and is now running normally.

andyoknen commented 2 months ago

Can you give examples of messages about conflicting transactions?

tintin-1929 commented 2 months ago

I redacted the tx id since I wasn't sure about it, here is an example:

2024-07-25T11:48:48Z [default wallet] Transaction 5e2b ... 6588 (in block 75dbc14aba526dd50ddb73982108b76923584074dab807b632b44b5e9f697b86) conflicts with wallet transaction 3bef ... a343 (both spend 209d ... 1387:2)

This might be unrelated but when I do something like

pocketcoin-cli listtransactions "*" 200 | jq -r '.[] | [.category, .confirmations, .txid, .time] | "\(.[0]) \t\(.[1])\t\(.[2])\t\(.[3])"'

I get a bunch of these lines in the log:

2024-07-25T12:09:38Z [default wallet] CWalletTx::GetAmounts: Unknown transaction type found, txid 444a ... c36a

I'm not sure if that's normal, pocketcoin is the only blockchain I've ever used extensively
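
When a rescan produces many "conflicts with wallet transaction" lines, it can help to pull the fields out of them so they can be grouped or counted. The following Python sketch assumes the log line format shown in the example above (including the redacted " ... " IDs); the function name `parse_conflict` is hypothetical.

```python
import re
from typing import Optional

# Pattern modelled on the single example line in this thread; other
# builds may word the message differently.
CONFLICT = re.compile(
    r"Transaction (?P<tx>.+?) \(in block (?P<block>[0-9a-f]+)\) "
    r"conflicts with wallet transaction (?P<wallet_tx>.+?) "
    r"\(both spend (?P<outpoint>.+?)\)"
)


def parse_conflict(line: str) -> Optional[dict]:
    """Return the tx, block, wallet_tx and outpoint fields, or None if no match."""
    match = CONFLICT.search(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    example = (
        "2024-07-25T11:48:48Z [default wallet] Transaction 5e2b ... 6588 "
        "(in block 75dbc14aba526dd50ddb73982108b76923584074dab807b632b44b5e9f697b86) "
        "conflicts with wallet transaction 3bef ... a343 (both spend 209d ... 1387:2)"
    )
    print(parse_conflict(example))
```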

andyoknen commented 2 months ago

Is your balance correct? Is the information from getwalletinfo correct? If so, then my assumption was correct - these are junk transactions in the wallet that should be deleted. rescan is doing just that.

tintin-1929 commented 2 months ago

My node made it over night without a crash.

Just checked my balances, each address in my wallet matches Block Explorer and the totals all match up between pocketcoin-cli getinfo, Block Explorer and the offline ledger that I keep.

Thanks so much for the help Andy!!

tintin-1929 commented 2 months ago

My node made it through another night. Let's see if it makes it one more night and I'll call it good.

andyoknen commented 2 months ago

I caught a similar problem on one of my servers. I think there are unresolved problems in the work of the mempool - I'm trying to debug and fix them.

tintin-1929 commented 2 months ago

As an update, I just checked my node and it is still running. It's been up 3 days 5 hours 30 minutes and 47 seconds at this point. Thank you Andy. Keep us posted.

the-real-vortex-v commented 2 months ago

I just got this on windows 10:

2024-07-31T11:55:29Z +++ Block connected to chain: 2871576 BH: d4e31c4f0a27105c5b7f0686479e436c9135bc1bd21c5c9ae1eb7a6a003af3de
2024-07-31T11:55:44Z New outbound peer connected: version=/Satoshi:0.22.4/, blocks=2871576, peer=35, peeraddr=188.190.156.10:37070 (block-relay)
2024-07-31T11:55:51Z


EXCEPTION: St13runtime_error
Transaction::List reconstruct failed - no return data for cf1b2cd2d2ba79e238df11796665c9c79c6816c922086af8ada63afb92394e96 tx
D:\Coins\Pocketnet\PocketnetCore\pocketcoin-qt.exe in msghand

2024-07-31T11:57:07Z GUI: Check updates result: error: QNetworkReply::NetworkError(ProtocolUnknownError)
2024-07-31T12:06:43Z Potential stale tip detected, will try using extra outbound peer (last tip update: 674 seconds ago)
2024-07-31T12:15:17Z socket sending timeout: 1201s
2024-07-31T12:15:44Z socket sending timeout: 1201s
2024-07-31T12:15:45Z socket sending timeout: 1201s
2024-07-31T12:15:46Z socket sending timeout: 1201s
2024-07-31T12:15:46Z socket sending timeout: 1201s
2024-07-31T12:15:47Z socket sending timeout: 1201s
2024-07-31T12:15:47Z socket sending timeout: 1201s
2024-07-31T12:15:48Z socket sending timeout: 1201s
2024-07-31T12:15:48Z socket sending timeout: 1201s
2024-07-31T12:15:49Z socket sending timeout: 1201s
2024-07-31T12:17:13Z Potential stale tip detected, will try using extra outbound peer (last tip update: 1304 seconds ago)
2024-07-31T12:27:43Z Potential stale tip detected, will try using extra outbound peer (last tip update: 1934 seconds ago)
2024-07-31T12:38:13Z Potential stale tip detected, will try using extra outbound peer (last tip update: 2564 seconds ago)
2024-07-31T12:48:43Z Potential stale tip detected, will try using extra outbound peer (last tip update: 3194 seconds ago)
2024-07-31T12:57:07Z GUI: Check updates result: error: QNetworkReply::NetworkError(ProtocolUnknownError)
2024-07-31T12:59:13Z Potential stale tip detected, will try using extra outbound peer (last tip update: 3824 seconds ago)
2024-07-31T13:09:43Z Potential stale tip detected, will try using extra outbound peer (last tip update: 4454 seconds ago)
2024-07-31T13:20:13Z Potential stale tip detected, will try using extra outbound peer (last tip update: 5084 seconds ago)

Also popup:

[image: screenshot of the popup]

I am going to -rescan and see if that helps.

the-real-vortex-v commented 2 months ago

> Also try restarting the node with the -rescan argument. This should remove unnecessary transactions from the wallet, and if that's the problem, then the errors will go away.

Rescan no longer removes "unnecessary" transactions and hasn't done so for a number of versions now. That is, it no longer removes cancelled transactions or failed stakes.

In fact it's added a bunch of junk like this one (one example I found):

Status: 0/unconfirmed, not in memory pool
Date: 06-Nov-23 02:56
From: unknown
To: PPrS5aFnHW7929LVx3UQKVj7Ek1Rdbgv6P (own address, label: Join)
Credit: 2.43770018 PKOIN
Net amount: +2.43770018 PKOIN
Transaction ID: b148025e886664f6b3fe197e479ab177ab721b12ce3e530bdadeb6f8df488815
Transaction total size: 294 bytes
Transaction virtual size: 294 bytes
Output index: 0

Debug information

Transaction: CoinstakeCTransaction(hash=b148025e88, nTime=1699214176, ver=2, vin.size=1, vout.size=5, nLockTime=0)

and another:

Status: conflicted with a transaction with 373801 confirmations
Date: 06-Nov-23 17:48
From: unknown
To: PPrS5aFnHW7929LVx3UQKVj7Ek1Rdbgv6P (own address, label: Join)
Credit: 2.43790098 PKOIN
Net amount: +2.43790098 PKOIN
Transaction ID: 401019d87afa9477247e3cb2030dd6ffe835d9a7022a499817041bce122e7ada
Transaction total size: 294 bytes
Transaction virtual size: 294 bytes
Output index: 0

I have about 200 or more transactions that are ALL dated the 31st of July. It seems like they are mostly the 5 PKOIN stake wins from when 5 PKOIN was the normal amount.

Also I just got this:

Status: 0/unconfirmed, not in memory pool, abandoned
Date: 01-Aug-24 02:03
From: unknown
To: PPrS5aFnHW7929LVx3UQKVj7Ek1Rdbgv6P (own address, label: Join)
Credit: 2.50020140 PKOIN
Net amount: +2.50020140 PKOIN
Transaction ID: a1a94cf06b47a37a77f2487eea11c6b98baaaf64c0517ee0b4570e9b38cdfe16
Transaction total size: 224 bytes
Transaction virtual size: 224 bytes
Output index: 0

A stake win that is "not in memory pool" ... I'm uninstalling this version and going back to 22.4. There are too many issues with this version right now.
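
To list wallet entries like the ones above without clicking through the GUI, the JSON printed by pocketcoin-cli listtransactions can be filtered. This is a rough Python sketch under stated assumptions: the fields `category`, `confirmations`, `txid` and `time` are the ones used in the jq command earlier in this thread, while treating zero or negative confirmations as "unconfirmed or conflicted" is the Bitcoin Core convention and is only assumed to hold for Pocketnet Core. The function name `suspicious_entries` is hypothetical.

```python
import json


def suspicious_entries(listtransactions_json: str) -> list:
    """Return wallet entries whose confirmation count is zero or negative.

    Assumption: 0 means unconfirmed and a negative value means conflicted,
    as in Bitcoin Core. Entries without the field are treated as 0.
    """
    entries = json.loads(listtransactions_json)
    return [e for e in entries if e.get("confirmations", 0) <= 0]


if __name__ == "__main__":
    # Made-up sample data, only to show the shape of the input.
    sample = json.dumps([
        {"category": "generate", "confirmations": 120, "txid": "aa", "time": 1},
        {"category": "generate", "confirmations": 0, "txid": "bb", "time": 2},
        {"category": "receive", "confirmations": -5, "txid": "cc", "time": 3},
    ])
    for entry in suspicious_entries(sample):
        print(entry["category"], entry["confirmations"], entry["txid"])
```

The output only identifies candidates for inspection; it does not decide whether a transaction is safe to abandon.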

tintin-1929 commented 2 months ago

My node crashed again FYI

2024-08-01T08:18:54Z +++ Block connected to chain: 2872804 BH: d6603460691c10b915635059a1445ed53361bb72af8e0efb2ba8ee3d37c7f8cd
2024-08-01T08:18:54Z


EXCEPTION: St13runtime_error
Transaction::List reconstruct failed - no return data for 7755daabf1b0c6f31e65cb39c35c572e0f609f32cd68875519a605bab4df5bd4 tx
pocketcoin in msghand

andyoknen commented 2 months ago

So, now we know the reason for the failure: these are old abandoned transactions. Thanks to @the-real-vortex-v for the help. I'm thinking about how to solve this problem. It's a shame that I can't reproduce it myself; it seems that whether the node stops with this exception depends on the OS.

andyoknen commented 2 months ago

@the-real-vortex-v I have another request while I'm running a node on Windows 10 with a wallet, old transactions, and staking. Could you run 0.22.5 with disablewallet=0? If the node is stable, the problem is in the staking.

tintin-1929 commented 2 months ago

I'm running Debian 11 (Bullseye) for the record.

americanpatriotdave commented 2 months ago

Would creating a new wallet and recovering from the blockchain (hdseed) cure this in the meantime?

andyoknen commented 2 months ago

I think it will help, but I would like to solve the problem first. I've found one vulnerability, and we're testing it now.

https://github.com/pocketnetteam/pocketnet.core/pull/727

https://github.com/pocketnetteam/pocketnet.core/actions/runs/10213997547

the-real-vortex-v commented 2 months ago

> @the-real-vortex-v I have another request while I'm running a node on Windows 10 with a wallet, old transactions, and staking. Could you run 0.22.5 with disablewallet=0? If the node is stable, the problem is in the staking.

How long would you expect me to run the node this way? The crash seems rather random. I have been meaning to get a second box going (Linux on an old AMD 2700X) for my node stuff. When I do that I can run the node on Windows.

andyoknen commented 2 months ago

My nodes are running stably. The error really is random, and I do not know how to catch it. I have to guess for now.

tintin-1929 commented 2 months ago

I just hit it again. Is there any debug information I can gather? I would prefer not having to revert to 0.22.4

andyoknen commented 2 months ago

@tintin-1929 @the-real-vortex-v

At the moment, I am aware of two problems: an unhandled exception when requesting data for a missing transaction, and a potential problem due to the change of SQLite temporary storage from memory to disk. I have tried to fix these problems here: https://github.com/pocketnetteam/pocketnet.core/pull/727. I also published ready-made installation packages here: https://dev.pocketnet.app/binaries/core/

I would appreciate any feedback from running the test version.

tintin-1929 commented 2 months ago

I don't have time to install at the moment but I'll install it later today.

the-real-vortex-v commented 2 months ago

I've installed and run the beta build. We'll see what happens.

As an aside: I have disconnectold="1" set in the config file (my debug.log shows it being loaded), but I still have v0.22.4 nodes connecting to my node. Which versions are blocked? Perhaps a little debug info should be printed when this is enabled, like "Disconnecting old nodes is enabled. Node will now auto disconnect and ban nodes of version XYZ and lower".

tintin-1929 commented 2 months ago

I hope I don't regret it, but I've installed the private build. I will report in the morning (US Eastern Time).

tintin-1929 commented 2 months ago

My node seems to have run fine overnight.

Another strange thing happened: on 8/5 I had my first day with no stake reward in 79 days, and then another day with no stake reward on 8/8. I don't know what it means, but I thought I'd mention it because I consider it an anomaly at this point.

the-real-vortex-v commented 2 months ago

Well so far no crashing. It's been up since:

2024-08-07T15:17:53Z Pocketnet Core version 0.22.5-f77cf58a (release build)

It is now: 2024-08-09T19:49:59Z +++ Block connected to chain: 2885065 BH: 105b13946b77d6aeb8ed55fc11b9537f7822abf7fdd29c97b3e659ea7144b300

No crashing so far.

tintin-1929 commented 2 months ago

Checked on my node this morning, still running. "pocketcoin-cli uptime" reports 133219 which is about 1 day 13 hours
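
The conversion from the raw uptime figure to days and hours can be checked with a short Python sketch. It assumes, as the comment above implies, that "pocketcoin-cli uptime" reports whole seconds; the function name `format_uptime` is hypothetical.

```python
def format_uptime(seconds: int) -> str:
    """Break a number of seconds into days, hours, minutes and seconds."""
    days, rem = divmod(seconds, 86_400)   # 86,400 seconds per day
    hours, rem = divmod(rem, 3_600)       # 3,600 seconds per hour
    minutes, secs = divmod(rem, 60)
    return f"{days}d {hours}h {minutes}m {secs}s"


if __name__ == "__main__":
    print(format_uptime(133219))  # → 1d 13h 0m 19s
```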

tintin-1929 commented 2 months ago

My node is still up, things look normal. Thanks Andy.

andyoknen commented 2 months ago

Great! Thanks @americanpatriotdave @tintin-1929 @the-real-vortex-v for your help! The official release of 0.22.6 will be out this week, once I finish a couple of other tasks.

the-real-vortex-v commented 2 months ago

Just a minor update. Still no crashing since the last post.

uptime is: 405972

the-real-vortex-v commented 2 months ago

Just a quick question. Will the 0.22.6 update include any additional changes?

andyoknen commented 2 months ago

Yes, version 0.22.6 includes a fix from the test build that I suggested to install.