[stateless_validation] Missing ChunkExtra on load memtrie on startup

staffik commented 2 months ago

Error message after restarting a stateless validation node. After restart it attempts to load memtrie on startup (shard shuffling enabled):

2024-04-22T21:15:58.252570Z  INFO memtrie: Loading trie to memory for shard s0.v2...
2024-04-22T21:15:58.252573Z DEBUG memtrie: Loading base trie from flat state... shard_uid=s0.v2
thread 'main' panicked at chain/client/src/client_actor.rs:222:6:
called `Result::unwrap()` on an `Err` value: Chain(StorageError(StorageInconsistentState("No ChunkExtra for block FJR59St3DjDVR4xvdUa8Mhf5JSoBATAqr2gkYaTBdGHR in shard s0.v2")))
stack backtrace:
   0: rust_begin_unwind
   1: core::panicking::panic_fmt
   2: core::result::unwrap_failed
   3: near_client::client_actor::start_client
   4: nearcore::start_with_config_and_synchronization
   5: neard::cli::RunCmd::run::{{closure}}
   6: tokio::task::local::LocalSet::run_until::{{closure}}
   7: neard::cli::NeardCmd::parse_and_run
   8: neard::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

I think the solution would be to modify load_memtries_on_startup() so that it can take state_root as load_mem_trie_on_catchup() does: https://github.com/near/nearcore/blob/16e23210eeb81804c9ecfbf9c6455f3f7601497f/core/store/src/trie/shard_tries.rs#L430 Example usage: https://github.com/near/nearcore/pull/10820/files#diff-ef9c6aaa80a330e446c5365f42be9bff37ba4f898cf519dadd7e17545783c77cR2787

staffik commented 2 months ago

So there is a dependency on ChunkExtra which for some reason is not available there. But we only need it to get state_root which can be obtained without ChunkExtra.

robin-near commented 2 months ago

Hmm, so the startup loading logic is a bit different from the catchup loading logic because whereas during catchup we have just created flat storage from a downloaded trie, during startup the flat storage may be in any arbitrary state. The flat head is somewhere, and on top of the flat head there is any set of deltas that represent different forks we may still be choosing among in the future. When loading memtrie, we start with the flat head and then for each delta, we also construct a new memtrie root to represent the difference. So, we cannot just take a different state root for the flat head, as that may not be consistent with the state that the flat state represents, and we cannot just take some other state root that does not correspond to the flat head, because then we may be missing the state root for some fork that we end up building on.

So for example, suppose we have blocks A, B, C, D where B.parent == A, C.parent == A, D.parent == C. The flat head may be at A. We would need to load four state roots corresponding to the post state roots of A, B, C, and D, because technically we may continue building from any of these blocks. The memtrie would contain four roots, and if we apply a chunk on top of B for example, we would query the memtrie root corresponding to B.

ChunkExtra is the place where we obtain the state root, because the flat state encodes the state corresponding to the post state root of the flat head, which is stored in the ChunkExtra for the flat head. Similarly, for each flat state delta, the delta describes the state transition whose result corresponds to the post state root of the block that the flat state delta is intended for.

As for why ChunkExtra is missing, that's still to be investigated.

robin-near commented 2 months ago

Ah ok, I think the bug is here. It's a problem that I deferred during the implementation of memtries that I honestly just forgot about. https://github.com/near/nearcore/blob/16e23210eeb81804c9ecfbf9c6455f3f7601497f/chain/chain/src/chain.rs#L516

When we load the memtries, we needed to determine which shards memtries should actually load. At that time, I simply took the tip of the blockchain, because, well, we tracked all shards anyway so that would only change upon resharding. But now, that also changes from epoch to epoch due to single shard tracking.

So the bug was triggered as follows. The node is on a tip at height (presumably) 114912712, which is the first block of a new epoch, where it is a chunk producer for shard 2. When loading memtries, it needed to start from the flat head, which is 114912710, but that is in the previous epoch, where it was not tracking shard 2. So, when querying for the ChunkExtra for shard 2 at the 114912710 height, it didn't exist.

So then, I have some questions:

Isn't state sync supposed to catch up on shard 2, and therefore the ChunkExtra for shard 2 should exist in the previous epoch? Does this mean state sync failed during the previous epoch?
Loading the memtrie based on the tip is not correct. What we want is to load all shards for which we may need memtries for. If we're in an epoch boundary, technically we may still need the memtries from the previous epoch because it's not guaranteed that this new epoch is the one we're going to go into (we may have a fork that results in a different epoch - albeit having the same memtrie requirements in that epoch). If we arrived at the epoch while the node is up, we would have two memtries loaded, one from the previous epoch that was not unloaded yet, and one loaded by state sync during catchup earlier. So, when starting up, we need to restore the exact same state. What exactly should we do here to match that same state? This seems tricky... @staffik what do you think?
Perhaps if we solve the above problem, the state sync problem would also be resolved, because we should probably be trying state sync again before attempting to load the memtries for the supposed-to-be-caught-up shard.

staffik commented 2 months ago

Isn't state sync supposed to catch up on shard 2, and therefore the ChunkExtra for shard 2 should exist in the previous epoch? Does this mean state sync failed during the previous epoch?

The issue happened in the middle of the epoch, so memtrie has already been loaded (in the previous epoch, on catchup), state sync worked fine in the previous epoch. Loading memtrie on catchup does not require ChunkExtra, so it might have not been available in the previous epoch yet things worked.

staffik commented 2 months ago

Loading the memtrie based on the tip is not correct. What we want is to load all shards for which we may need memtries for. If we're in an epoch boundary, technically we may still need the memtries from the previous epoch because it's not guaranteed that this new epoch is the one we're going to go into (we may have a fork that results in a different epoch - albeit having the same memtrie requirements in that epoch). If we arrived at the epoch while the node is up, we would have two memtries loaded, one from the previous epoch that was not unloaded yet, and one loaded by state sync during catchup earlier. So, when starting up, we need to restore the exact same state. What exactly should we do here to match that same state? This seems tricky... @staffik what do you think?

We need flat state to construct memtrie and we want memtrie for each flat state root. So if we just maintain bijection: "flat state roots" - "memtries", we should be fine?

we should probably be trying state sync again before attempting to load the memtries for the supposed-to-be-caught-up shard

We would be good as long as flat state is correct. If we need state sync again because of memtrie - would it mean that some flat state is missing, and need to be state synced too?

staffik commented 2 months ago

@robin-near Do you think tracing would be useful in identifying why ChunkExtra was missing? I am currently not using it because had many merge conflicts with current master: https://github.com/near/nearcore/pull/10843

robin-near commented 2 months ago

@robin-near Do you think tracing would be useful in identifying why ChunkExtra was missing? I am currently not using it because had many merge conflicts with current master: #10843

Ah forgot to update on this; only https://github.com/robin-near/nearcore/commit/b85ed54ac03f1184c47df197631dea969c5f479c is needed now for tracing.

robin-near commented 2 months ago

Does catchup not write ChunkExtra? Maybe that's where my confusion is.

staffik commented 2 months ago

Ah, yes. After loading memtrie, we write ChunkExtras in the loop here: https://github.com/near/nearcore/blob/f95087bc508eda09ceecd7b8cb4a113b109cef7f/chain/chain/src/chain_update.rs#L863

telezhnaya commented 2 weeks ago

It's reproducible again https://github.com/near/stakewars-iv/issues/139

near / nearcore

[stateless_validation] Missing ChunkExtra on load memtrie on startup #11135