paradigmxyz / reth

Modular, contributor-friendly and blazing-fast implementation of the Ethereum protocol, in Rust
https://reth.rs/
Apache License 2.0
3.5k stars 896 forks source link

Blockchain Tree Service #7154

Open rkrasiuk opened 3 months ago

rkrasiuk commented 3 months ago

Issues

  1. Tree State is behind the rw lock https://github.com/paradigmxyz/reth/blob/9312424db05b066cee6fae40471dc679ed35cef8/crates/blockchain-tree/src/shareable.rs#L29-L32

    • Any meaningful tree operation requires a write lock
    • Any rpc query on the pending state might be waiting for the write lock to be lifted
  2. Connecting buffered blocks is a recursive operation in a non-obvious case https://github.com/paradigmxyz/reth/blob/9312424db05b066cee6fae40471dc679ed35cef8/crates/blockchain-tree/src/blockchain_tree.rs#L828

In the worst case, this would connect up to 3 epochs worth of blocks. This might be the root cause of https://github.com/paradigmxyz/reth/issues/6905

  1. Cumbersome state tracking Any pending or side chain in the blockchain tree has an in-memory state attached to it. Over the last months, we've had multiple cases of state corruption (https://github.com/paradigmxyz/reth/pull/5683, https://github.com/paradigmxyz/reth/pull/5921, https://github.com/paradigmxyz/reth/pull/6821) due to mishandling of in-memory state after canonical state update.

Solution

Blockchain Tree should be rewritten into a message-passing service. The engine should send control commands to the tree service. The tree service should orchestrate how to prioritize the received actions.

The tree service interface should be async.

struct BlockchainTreeHandle {
  to_tree: UnboundedSender<TreeMessage>,
} 

struct BlockchainTreeService {
  messages_rx: UnboundedReceiverStream<TreeMessage>,
  task_spawner: Arc<dyn TaskSpawner>, // for concurrent payload processing
}

// impl Future for BlockchainTreeService { ... } 

Downgrade the lock to wrap only the actual tree state. Double-check if the lock is necessary at all.

Tree Action Prioritization:

We might need to move the ancestor check to the tree. This information should be included in the response that the tree sends back to the engine.

Testing

All existing tests should be ported over to the new service.

We should have chain generation test utils for easier testing of the tree .

github-actions[bot] commented 2 months ago

This issue is stale because it has been open for 21 days with no activity.