consensus-shipyard / ipc

🌳 Spawn multi-level trees of customized, scalable, EVM-compatible networks with IPC. L2++ powered by FVM, Wasm, libp2p, IPFS/IPLD, and CometBFT.
https://ipc.space
Apache License 2.0

Wait to serve ABCI until committed parent finality is loaded and cache is pre-filled #863

Open raulk opened 5 months ago

raulk commented 5 months ago

Problem

Fendermint exposes its ABCI immediately after starting. However, some services are not fully started or ready by then (they're still initializing in the background). Concretely, the one that has proven problematic is the Parent Finality Syncer.

Several race conditions are possible:

  1. The node is starting from a WAL that contains a block proposal with a parent finality message. Because CometBFT quickly fires calls against Fendermint when starting from the WAL (no network latencies or p2p involved, nothing to be fetched over the network), the last_committed_finality will still be None when we hit check_height(), leading to a rejection of the proposal.

  2. Similarly, if we receive a block over p2p with some parent finality within the acceptable range (max_proposal_range), we will reject it even though it is probably correct.
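To make the failure mode concrete, here is a minimal sketch of a height check that rejects proposals while the syncer state is still empty. The `ParentViewCache` struct, the `CheckError` variants, and this `check_height` signature are hypothetical simplifications for illustration; the actual Fendermint code may be structured differently.

```rust
/// Hypothetical, simplified stand-in for the syncer's view of the parent chain.
#[derive(Debug, Default)]
pub struct ParentViewCache {
    /// Height of the last committed parent finality; `None` until it is loaded.
    pub last_committed_finality: Option<u64>,
    /// Highest parent height currently held in the cache, if any.
    pub latest_cached_height: Option<u64>,
}

/// Hypothetical rejection reasons for a proposed parent finality height.
#[derive(Debug, PartialEq, Eq)]
pub enum CheckError {
    /// The committed finality has not been loaded yet (the startup race).
    NotReady,
    /// The proposal does not advance past the last committed finality.
    NotAfterLastCommitted { proposed: u64, last_committed: u64 },
    /// The proposal is further ahead than `max_proposal_range` allows.
    BeyondProposalRange { proposed: u64, max_allowed: u64 },
    /// The proposal is in range, but the cache has not reached that height.
    NotInCache { proposed: u64 },
}

/// Validate a proposed parent finality height against the local view.
pub fn check_height(
    cache: &ParentViewCache,
    proposed: u64,
    max_proposal_range: u64,
) -> Result<(), CheckError> {
    // Race 1: a proposal replayed from the WAL arrives before this is set.
    let last_committed = cache
        .last_committed_finality
        .ok_or(CheckError::NotReady)?;

    if proposed <= last_committed {
        return Err(CheckError::NotAfterLastCommitted {
            proposed,
            last_committed,
        });
    }

    let max_allowed = last_committed.saturating_add(max_proposal_range);
    if proposed > max_allowed {
        return Err(CheckError::BeyondProposalRange {
            proposed,
            max_allowed,
        });
    }

    // Race 2: the proposal is within range, but the cache is not filled yet.
    match cache.latest_cached_height {
        Some(height) if proposed <= height => Ok(()),
        _ => Err(CheckError::NotInCache { proposed }),
    }
}
```

With a default (empty) cache, any proposal fails with `NotReady`, regardless of whether the proposed height is valid. That mirrors case 1 above; a loaded finality with an unfilled cache mirrors case 2.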

Proposed solution

Wait until the initial state of the Parent Finality Syncer is populated, including as many parent blocks as max_proposal_range, so that we're in a position to evaluate proposed finality upon start. Note that we've had no chance to broadcast our parent view to the network, so it's fair that we prepare ourselves to accept what's reasonable.

This may add some startup latency, but that should be alleviated once we introduce a durable cache.
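One way to express the proposed wait is a small readiness gate that the syncer updates and the ABCI startup path blocks on. The sketch below uses only the standard library (`Mutex` plus `Condvar`). The names `ReadinessGate`, `SyncerState`, and the methods are hypothetical, and a real implementation in Fendermint would more likely use its async runtime's primitives.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::time::Duration;

/// Hypothetical snapshot of what the syncer has loaded so far.
#[derive(Debug, Default)]
struct SyncerState {
    /// Set once the committed parent finality has been loaded.
    last_committed_finality: Option<u64>,
    /// Number of parent blocks pre-filled into the cache.
    cached_parent_blocks: u64,
}

/// Shared gate: the syncer reports progress, the ABCI startup path waits on it.
#[derive(Clone, Default)]
pub struct ReadinessGate {
    inner: Arc<(Mutex<SyncerState>, Condvar)>,
}

impl ReadinessGate {
    pub fn new() -> Self {
        Self::default()
    }

    /// Called by the syncer once the committed finality is known.
    pub fn set_last_committed_finality(&self, height: u64) {
        let (lock, cvar) = &*self.inner;
        lock.lock().unwrap().last_committed_finality = Some(height);
        cvar.notify_all();
    }

    /// Called by the syncer as parent blocks are added to the cache.
    pub fn add_cached_blocks(&self, count: u64) {
        let (lock, cvar) = &*self.inner;
        lock.lock().unwrap().cached_parent_blocks += count;
        cvar.notify_all();
    }

    /// Block until the finality is loaded and at least `max_proposal_range`
    /// parent blocks are cached. Returns `false` if `timeout` elapses first.
    pub fn wait_until_ready(&self, max_proposal_range: u64, timeout: Duration) -> bool {
        let (lock, cvar) = &*self.inner;
        let guard = lock.lock().unwrap();
        let (_guard, result) = cvar
            .wait_timeout_while(guard, timeout, |state| {
                // Keep waiting while either readiness condition is unmet.
                state.last_committed_finality.is_none()
                    || state.cached_parent_blocks < max_proposal_range
            })
            .unwrap();
        !result.timed_out()
    }
}
```

The startup path would call `wait_until_ready` before binding the ABCI server, and only start serving once it returns `true`. The timeout is an assumption added so that a stalled parent connection surfaces as a visible startup failure instead of an indefinite hang.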

Other issues to evaluate

I think there's a chance that requiring the cache to hold every finality we expect to see may break long-range, snapshot-less catch-ups. For example, if we are catching up 2000 subnet blocks, we will fail this condition quickly if our cache is not ready.

Long-term

As IPC gets more complex and packages more components/services, we should look into more mature service lifecycle and dependency management options.

linear[bot] commented 5 months ago

ENG-841 Wait to serve ABCI until committed parent finality is loaded and cache is pre-filled