bnb-chain / op-geth


feat: sequencer auto recover when meet an unexpected shutdown #166

Closed · krish-nr closed 2 days ago

krish-nr commented 2 months ago

Description

This PR fixes the issue where the sequencer node fails to recover after a crash (specifically when the sequencer is a PBSS node). Related node PR.

Rationale

When a sequencer node crashes, it may fail to persist the journal in time. As a result, when Geth restarts, the journal cannot be read and the most recent state data is lost. This makes the sequencer fail during the buildPayload process, leaving it unable to continue operating. The diagram below illustrates the sequencer block-production flow, with the red sections highlighting the logic that cannot proceed after the crash.

In the diagram, sequencerAction alternates between stages (1) and (2): stage (1) builds the payload and stage (2) persists the data. The two stages are not synchronously coupled. In stage (1), filling the payload's transactions and other tasks happen asynchronously, while the payload itself is returned synchronously and immediately; completion is signalled through a condition lock (cond). Until the stage (1) update completes, getPayload in stage (2) sits in cond.wait.
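For illustration, here is a minimal Go sketch of that handoff (hypothetical types, not the actual op-geth code): the payload shell is returned immediately, the transactions are filled asynchronously, and getPayload parks on a condition variable until update signals completion.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Payload is a stand-in for the real payload object.
type Payload struct {
	lock sync.Mutex
	cond *sync.Cond
	full *string // the fully built block, nil until the async fill completes
}

func newPayload() *Payload {
	p := &Payload{}
	p.cond = sync.NewCond(&p.lock)
	return p
}

// update is called by the asynchronous build loop once the txs are filled in.
func (p *Payload) update(block string) {
	p.lock.Lock()
	defer p.lock.Unlock()
	p.full = &block
	p.cond.Broadcast() // wakes up getPayload
}

// getPayload (stage 2) waits until update has delivered a full block.
func (p *Payload) getPayload() string {
	p.lock.Lock()
	defer p.lock.Unlock()
	for p.full == nil {
		p.cond.Wait() // blocks indefinitely if update is never called
	}
	return *p.full
}

func main() {
	p := newPayload() // stage (1): payload shell returned immediately
	go func() {
		time.Sleep(50 * time.Millisecond) // asynchronous tx filling
		p.update("block 34123457")
	}()
	fmt.Println(p.getPayload()) // stage (2): returns once update has run
}
```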

After a crash, the recent state data is lost. For example, if the sequencer crashed at block height 34123456 and the next block to build is 34123457, the prepareWork phase needs the state of block 34123456; since that state is gone, prepareWork fails and the update logic is never reached. As a result, the process stays blocked indefinitely in getPayload, unable to make progress.

[Diagram: sequencer block-production flow, with the blocked logic highlighted in red]
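Continuing the sketch above (still with hypothetical helpers), this is roughly where the hang comes from: if prepareWork cannot open the parent block's state, the build loop returns before calling update, so a pending getPayload never wakes up.

```go
// buildPayload models stage (1): prepareWork needs the parent block's state.
// If that state was lost in the crash, we bail out before update is ever
// called, and the consumer stays parked in cond.Wait inside getPayload.
func buildPayload(p *Payload, haveState func(height uint64) bool) error {
	parent := uint64(34123456) // height the sequencer crashed at
	if !haveState(parent) {
		return fmt.Errorf("missing state for block %d", parent)
	}
	go p.update("block 34123457") // normal path: fill txs, then publish
	return nil
}
```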

The fix adds two routines: one to perform the recovery and one to monitor its progress. When the generate phase fails with a specific class of errors, a fix routine is started. To avoid blocking the main process, a separate routine monitors the specific block being repaired; once its data is recovered, it triggers a retry of the update, so the sequencer regains its pre-crash state and continues making progress.
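As a rough sketch of that shape (recoverState, stateReady and errMissingState are illustrative stand-ins, not the real op-geth API): one goroutine repairs the state, a second one watches the repair and retries the update, and the main loop is never blocked.

```go
package main

import (
	"errors"
	"log"
	"time"
)

var errMissingState = errors.New("missing parent state")

// Hypothetical helpers standing in for local replay / peer fetch and for the
// readiness check of the repaired block.
func recoverState(height uint64)    { log.Printf("recovering state for block %d", height) }
func stateReady(height uint64) bool { return true }

func onGenerateError(err error, target uint64, retryUpdate func()) {
	if !errors.Is(err, errMissingState) {
		return // only handle the crash-induced state loss
	}

	done := make(chan struct{})

	// Routine 1: repair the state needed to build on top of `target`.
	go func() {
		defer close(done)
		recoverState(target)
	}()

	// Routine 2: monitor the repair without blocking the caller, then retry
	// the update so the pending getPayload can finally return.
	go func() {
		<-done
		if stateReady(target) {
			retryUpdate()
		}
	}()
}

func main() {
	onGenerateError(errMissingState, 34123456, func() { log.Println("retrying update") })
	time.Sleep(100 * time.Millisecond) // let the goroutines finish in this demo
}
```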

There are two recovery scenarios: recovering from local data or from peers (for the sequencer, the peers are its backup nodes). In most cases the data can be recovered locally. There is, however, a corner case where local recovery fails: if the sequencer has already gossiped the latest block to its peers but crashed during local persistence, it can end up one block behind the peers. In that extreme situation the sequencer must recover from a peer.
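An illustrative (not actual) decision between the two paths could look like this: if a backup peer's head is ahead of the local head, the crash happened after gossip but before local persistence, so the missing block must come from the peer; otherwise local data suffices.

```go
// chooseRecoverySource is a hypothetical sketch of the recovery-path choice.
func chooseRecoverySource(localHead, peerHead uint64) string {
	if peerHead > localHead {
		return "recover from backup peer" // corner case: one block behind
	}
	return "recover from local data" // common case
}
```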

This is the complete sequencer recovery flow, as illustrated below.

[Diagram: complete sequencer recovery flow]

Example

add an example CLI or API response...

Changes

Notable changes: