linera-io / linera-protocol

Main repository for the Linera protocol
Apache License 2.0
112 stars 94 forks source link

Fix handling of chain state changes in `ChainWorkerState` #2180

Closed jvff closed 1 week ago

jvff commented 1 week ago

Motivation

2163 makes the ChainWorkerActor be reused across requests to the WorkerState. This improves performance because the ChainStateView does not need to be reloaded for every request. However, the PR was causing the test_open_chain_node_service test would fail, and the normal logs would show that the validators rejected a block because messages were out of order.

The trace log however, would show that first the validators would reject a block because it was missing the incoming messages (this is normal). After the cross-chain messages had been propagated, reproposing the block failed because the validator complained that those messages were already received.

The cause was that the chain state was not being properly discarded on error conditions. The ChainWorkerActor owns a ChainWorkerState, which has the ChainStateView instance. When the ChainWorkerState returns an error, it did not rollback the changes to the ChainStateView, so they were kept in memory. When the next request was handled, the state would start off with invalid data.

Proposal

Force rollback to be done automatically with some helper types. ChainWorkerStateWithTemporaryChanges always rolls back all changes when it is dropped. ChainWorkerStateWithAttemptedChanges has a save method, which persists the changes. If the changes aren't persisted (e.g., when an error occurs) then the ChainStateView has its changes rolled back when the wrapper type is dropped.

Both types are wrappers around &mut ChainWorkerState, and simply represent the "lifetime" of a transaction of changes to the chain state.

Test Plan

After this PR and with #2163, the test_open_chain_node_service stopped failing. Without this PR, it fails consistently.

Release Plan

This is an internal refactor, so nothing is needed.

Links