2163 makes the ChainWorkerActor be reused across requests to the WorkerState. This improves performance because the ChainStateView does not need to be reloaded for every request. However, the PR was causing the test_open_chain_node_service test would fail, and the normal logs would show that the validators rejected a block because messages were out of order.
The trace log however, would show that first the validators would reject a block because it was missing the incoming messages (this is normal). After the cross-chain messages had been propagated, reproposing the block failed because the validator complained that those messages were already received.
The cause was that the chain state was not being properly discarded on error conditions. The ChainWorkerActor owns a ChainWorkerState, which has the ChainStateView instance. When the ChainWorkerState returns an error, it did not rollback the changes to the ChainStateView, so they were kept in memory. When the next request was handled, the state would start off with invalid data.
Proposal
Force rollback to be done automatically with some helper types. ChainWorkerStateWithTemporaryChanges always rolls back all changes when it is dropped. ChainWorkerStateWithAttemptedChanges has a save method, which persists the changes. If the changes aren't persisted (e.g., when an error occurs) then the ChainStateView has its changes rolled back when the wrapper type is dropped.
Both types are wrappers around &mut ChainWorkerState, and simply represent the "lifetime" of a transaction of changes to the chain state.
Test Plan
After this PR and with #2163, the test_open_chain_node_service stopped failing. Without this PR, it fails consistently.
Release Plan
This is an internal refactor, so nothing is needed.
Motivation
2163 makes the
ChainWorkerActor
be reused across requests to theWorkerState
. This improves performance because theChainStateView
does not need to be reloaded for every request. However, the PR was causing thetest_open_chain_node_service
test would fail, and the normal logs would show that the validators rejected a block because messages were out of order.The trace log however, would show that first the validators would reject a block because it was missing the incoming messages (this is normal). After the cross-chain messages had been propagated, reproposing the block failed because the validator complained that those messages were already received.
The cause was that the chain state was not being properly discarded on error conditions. The
ChainWorkerActor
owns aChainWorkerState
, which has theChainStateView
instance. When theChainWorkerState
returns an error, it did not rollback the changes to theChainStateView
, so they were kept in memory. When the next request was handled, the state would start off with invalid data.Proposal
Force rollback to be done automatically with some helper types.
ChainWorkerStateWithTemporaryChanges
always rolls back all changes when it is dropped.ChainWorkerStateWithAttemptedChanges
has asave
method, which persists the changes. If the changes aren't persisted (e.g., when an error occurs) then theChainStateView
has its changes rolled back when the wrapper type is dropped.Both types are wrappers around
&mut ChainWorkerState
, and simply represent the "lifetime" of a transaction of changes to the chain state.Test Plan
After this PR and with #2163, the
test_open_chain_node_service
stopped failing. Without this PR, it fails consistently.Release Plan
This is an internal refactor, so nothing is needed.
Links