Replay optimization to avoid adding to de-duplication index could leave index in inconsistent state

arhag commented 1 year ago

I am referring to the replay optimization that is trigger by the status of the following boolean: https://github.com/AntelopeIO/leap/blob/a8d48d406082cc6ec5ef18dddd30b5589b3e870a/libraries/chain/controller.cpp#L1531

It assumes that replay of irreversible blocks will complete up to and including a block with a timestamp that is greater than or equal to replay_head_time. This assumption holds if one starts nodeos with a blocks.log file and allows it to finish the replay.

However, we allow the user to interrupt the replay without leaving the chainbase database in a dirty state. The intention is that they can then continue the replay later. But nothing prevents them from truncating the blocks.log and finishing the replay to a block that has a timestamp strictly less than the original replay_head_time timestamp.

This can leave nodeos available for syncing and responding to API requests in a state in which its transaction de-duplication index is not in the same state (even after ignoring the entries that should have expired as of that head block time; see https://github.com/AntelopeIO/leap/issues/290).

This shouldn't be a practical safety issue because, assuming no finality violations, the only way the node will reach a block containing potentially malicious duplicate transactions that should be rejected is if they first replay through the branch of the blockchain (containing the irreversible and already known to be good blocks) that it would have originally replayed had it not been interrupted by the operator. And by the time it gets through those blocks, the state of the de-duplication index should correct itself in time to reject the block with the duplicate transaction.

Nevertheless, to provide protection to the nodes even under this usual sequence of events and against an attacker who could violate the finality of blocks, it is best to prevent a correct node from getting to a state where the chainbase database is not dirty and yet has consensus state that differs from other correct nodes in the network that are at the same head block. This also solves a more practical matter that even without finality violation a node that gets into this state could then be used to generate portable snapshots which differ from the ones generated by other nodes for the same block.

So resolving this issue along with https://github.com/AntelopeIO/leap/issues/290 also fixes the additional ways that are currently known to create an inconsistent portable snapshot beyond the ones that were tracked by https://github.com/eosnetworkfoundation/mandel/issues/301 and fixed by https://github.com/AntelopeIO/leap/pull/273.

One very simple way to fix the undesired behavior of the replay optimization is to simply get rid of that replay optimization by hard-coding skip_record to false. https://github.com/AntelopeIO/leap/blob/a8d48d406082cc6ec5ef18dddd30b5589b3e870a/libraries/chain/controller.cpp#L1531

But we may not wish to completely remove the replay optimization. We could instead resolve the issue with the replay optimization while some its performance benefits by keeping an ephemeral transaction de-duplication store outside of chainbase during the replay of irreversible blocks. This ephemeral store would be modified as blocks were replayed to record entries as transactions were processed and also to remove any expired entries from the ephemeral store as blocks were processed. And since there is no issue with duplicate transactions when replaying already validated blocks, the store would not even need an index to quickly look up entries by transaction ID; it would just need an index on the expiration time to make it quick to remove expired entries during the replay to prevent the size of the store from growing too large. Then when exiting the replay of irreversible blocks (whether because it is done or because it exited early due to a signal provided to a process), nodeos would ensure that it mutates the chainbase database state to capture within its transaction de-duplication index all the remaining entries within the ephemeral store. After that point, the ephemeral store can be discarded and nodeos can continue forward with the next step in the process, whether that is replaying reversible blocks or just safely exiting.

[x] #297
[ ] Modify replay optimization to leave index in a consistent state
[x] Performance benchmark between original optimization and modified optimization

heifner commented 1 year ago

clear_expired_input_transactions already uses a fc::time_point::maximum() for deadline during replay, so that work is not needed.

For the ephemeral transaction de-duplication store this approach could be used: https://github.com/AntelopeIO/leap/issues/279

It is likely worth determining how much difference this optimization makes. Another approach is to allow the ctrl-c stop of replay but put a dirty-like flag in chainbase that only allows continuation of replay but no other startup.

arhag commented 1 year ago

@heifner: Ah, great that it already uses maximum during replay. I edited the description above to simplify given that it already does that.

heifner commented 1 year ago

Since #279 shows no difference in replay performance for EOS mainnet test range of blocks, will just go with skip_record = false approach.

AntelopeIO / leap

Replay optimization to avoid adding to de-duplication index could leave index in inconsistent state #291