Summary

The Supervisor cannot backfill data into the logsdb fast enough to keep up with the chain, and its backfill path needs to be improved.

Sync Speed

Across multiple measurements, the rate at which logs are backfilled into the Supervisor is very low.

A spot check while attempting to recover the MVP Devnet showed about 25 blocks per minute (during an early portion of sync).

Spot checking ArgoCD logs while writing this ticket:

17:58 at block 67982 to 19:08 at block 68224 = 70 min for 242 blocks = ~3.5 blocks per minute

Estimating overall rate based on the sync start time to now:

Nov 8 ~1:30pm restarted Supervisor to sync
Nov 12 ~12:30 at 68224
~4 days for ~68k blocks = 11.8 blocks per minute

Taken together, these measurements suggest that sync slows down as it progresses, from about 25 blocks per minute early on to under 4 blocks per minute in the most recent window. In all cases we are syncing so slowly that any natural network progression will outpace the Supervisor's backfill.

Theories and Hypotheses (Potential Solutions)

While investigating the FetchReceipt logic, I found that the Supervisor appears to use the monorepo's shared RPC Client with a Basic Receipt Fetcher. I suspect that this basic fetcher is not designed for heavy fetching of the kind the Supervisor is doing. The Supervisor also has minimal heuristics around fetching, and simply tries to backfill block-by-block. My guess is that this naive behavior is triggering backoffs and throttling.

Fixes:

The Supervisor should be given more sophisticated backfilling techniques, including batch calls when it is missing large gaps. Batch calls alone would probably fix this issue, as tens of blocks' worth of receipts could be fetched per API call.
The Supervisor could also use multi-threaded worker pools to speed up requests. Here's an old PR which used exactly these techniques for a downloader which fetches receipt data: https://github.com/ethereum-optimism/optimism/commit/5a2ac1b4d3002d530082788b18eec15ecce243d5
Finally, the Supervisor could be given multiple Execution Endpoints to call, to further spread the request load.

Testing

We'll have to try purging a local devnet DB after a long period of activity to observe the Supervisor's RPC behavior during backfill. This issue may not reproduce in local environments, because local infrastructure makes network calls much cheaper.

Priority

Critical for a stable devnet. Not blocking local devnets or other testing.

ethereum-optimism / optimism

interop: Supervisor Sync Speed is Insufficient #12903
