perf: streaming root - Githubissues

Description

Create a task that computes state root for new payload as we are executing the block. See https://github.com/paradigmxyz/reth/pull/10902 and https://github.com/paradigmxyz/reth/pull/11154 for experimental reference.

The most efficient way to achieve this is to separate IO lookups from main computation. This can be achieved by the following algorithm:

On each state update, retrieve the proofs of updated accounts and storage slots based on the state of the parent blocks.
On the batch of proofs retrieved, rehydrate the touched subtrie with new previously unseen proofs and update the root based on new leaf values. 2.1 Each subsequent update should be based on the updated subtrie.
When the state updates are finished, wait for the final subtrie update to be finished and return updated state root alongside

Note that the work in 1. and 2. can be done in parallel. We can update the trie with intermediate results once proofs have been retrieved.

Tasks

### Tasks
- [ ] https://github.com/paradigmxyz/reth/issues/10877
- [ ] https://github.com/paradigmxyz/reth/issues/11167

Requirements

Must compute a valid state root for the block
Must compute the state root faster than the regular sync computation
Must return valid TrieUpdates that match the ones computed via stack-based algorithm

Pre-requisites

State Update Streaming (https://github.com/paradigmxyz/reth/issues/10877)

In order to commence state root computation while the block is being executed, we need to stream state updates after each state transition. See linked issue for more details.

Productionizing `root_from_proofs`

Currently, we have an algorithm to compute state root from proofs in the witness.

https://github.com/paradigmxyz/reth/blob/6e64a14a6fdf23442c15d3c0a39501d013187f61/crates/trie/trie/src/witness.rs#L224-L281

This algorithm can be reused for implementing the streaming state proof.

Considerations

Extending `alloy_trie::HashBuilder` API

The current stack-based algorithm requires that for any iteration of the computation we must pre-sort the relevant branches and leaves we want to feed into HashBuilder. The only method for retrieving the result is HashBuilder::root. The absence of methods for retrieving intermediate results makes parallelizing work within a single trie more cumbersome.

The desired functionality for retrieving intermediate result can be achieved by (ab)using the proof API of the HashBuilder:

let state: HashedPostState = (); // updated state
let prefix: Nibbles = (); // the path to the node we are computing for

let matching_state = HashedPostState {
  accounts: state.accounts.into_iter().filter(|(hashed_address, _)| Nibbles::unpack(&hashed_address).starts_with(&prefix)).collect();
  storages: state.storages.into_iter().filter(|(hashed_address, _)| Nibbles::unpack(&hashed_address).starts_with(&prefix)).collect();
};

let mut hb = HashBuilder::default().with_proof_retainer(ProofRetainer::from_iter([prefix.clone()]));
// only add leaves and nodes from `matching_state`
let _root = hb.root(); // discard meaningless root
let proofs = hb.take_proofs();
let node = proofs.get(&prefix); // the node at target

However, it would be much better to have an explicit API for independently computing and retrieving the intermediate result.

Alternative To `alloy_trie::HashBuilder`

The main benefit of the stack-based algorithm is to alleviate the need to load entire updated state into memory. This makes it possible to compute the state root once after the first pipeline iteration when all state is considered updated from PoV of the trie. However, this benefit fades when the node is live syncing and all updated state is available in-memory anyway.

It is possible to implement the alternative algorithm for root computation during live sync, however, at the current stage extending is easier.

paradigmxyz / reth

perf: streaming root #11161

Description

Tasks

Requirements

Pre-requisites

State Update Streaming (https://github.com/paradigmxyz/reth/issues/10877)

Productionizing `root_from_proofs`

Considerations

Extending `alloy_trie::HashBuilder` API

Alternative To `alloy_trie::HashBuilder`

paradigmxyz / reth

perf: streaming root #11161

Description

Tasks

Requirements

Pre-requisites

State Update Streaming (https://github.com/paradigmxyz/reth/issues/10877)

Productionizing root_from_proofs

Considerations

Extending alloy_trie::HashBuilder API

Alternative To alloy_trie::HashBuilder

Productionizing `root_from_proofs`

Extending `alloy_trie::HashBuilder` API

Alternative To `alloy_trie::HashBuilder`