Thv/fix rollback issue - Githubissues

Sword-Smith commented 5 months ago

~~A draft PR since there are still two tests failing.~~

All tests pass.

This closes #143

Sword-Smith commented 4 months ago

It looks good to me overall.

I ran my stress test, which consists of commenting the sleep calls in run-multiple-instances-from-genesis.sh and then running it. This causes both mining nodes to start at the same time. They usually mine blocks 1 and 2 right away independently, which forces a reorg by one of them when they connect. Sometimes they don't find any peer for some reason, in which case I just shutdown and re-run. Anyway, in 3 successive runs both nodes were synced up by block 3, and once by block 2. That wasn't happening before, so this definitely seems to fix it.

That said, I am still seeing some warnings in the log of the peer that does the reorgs. I'm unsure if these are indicating real problems, or are just normal for a re-org.

Here they are:
$ grep WARN /tmp/integration_test_from_genesis-2.log 
2024-05-08T01:15:56.324546469Z  WARN ThreadId(01) neptune_core::models::state::wallet::wallet_state: Unable to find valid membership proof for UTXO with digest 08806702505947110800,10280140293633150857,05397434293386042537,13540323760646287415,06445496058825839468. UTXO was received at block height 1. Current block height is 1
2024-05-08T01:15:56.356890146Z  WARN ThreadId(01) neptune_core::models::state::wallet::wallet_state: Unable to find valid membership proof for UTXO with digest 08806702505947110800,10280140293633150857,05397434293386042537,13540323760646287415,06445496058825839468. UTXO was received at block height 1. Current block height is 2
2024-05-08T01:16:44.889471372Z  WARN ThreadId(01) neptune_core::models::state: Could not recover MSMP as transaction appears to be on an abandoned chain
2024-05-08T01:17:43.902038027Z  WARN ThreadId(01) neptune_core::models::state: Could not recover MSMP as transaction appears to be on an abandoned chain
2024-05-08T01:18:42.909408732Z  WARN ThreadId(01) neptune_core::models::state: Could not recover MSMP as transaction appears to be on an abandoned chain
2024-05-08T01:19:41.91564508Z  WARN ThreadId(01) neptune_core::models::state: Could not recover MSMP as transaction appears to be on an abandoned chain
I'd like to get feedback on these warnings before approving the PR.

These warnings point to the fact that the client has UTXOs on a branch that is no longer canonical. I think this behavior is correct and shouldn't be fixed. These abandoned UTXOs is why we have the CLI endpoint prune-abandoned-monitored-utxos which the user can run to get rid of these warnigs.

Sword-Smith commented 4 months ago

I ran my stress test, which consists of commenting the sleep calls in run-multiple-instances-from-genesis.sh and then running it. This causes both mining nodes to start at the same time. They usually mine blocks 1 and 2 right away independently, which forces a reorg by one of them when they connect.

Just wanted to add that I really appreciate that you have a way of testing fork resolution.

With this PR, more tests go through the GlobalState's tip-updater method to verify that forks are resolved. Previously only the individual components were tested for fork resolution, not the main state-updater method. That's why this problem wasn't caught by unit tests.

Neptune-Crypto / neptune-core

Thv/fix rollback issue #147