Summary
The Supervisor cannot backfill data into the logsdb fast enough to keep up with the chain, and its backfill logic needs to be improved.
Sync Speed
Across multiple measurements, backfilling logs into the Supervisor is very slow:
A spot check while attempting to recover the MVP Devnet showed about 25 blocks per minute (during an early portion of sync).
Spot checking ArgoCD logs while writing this ticket:
17:58 at block 67982 to 19:08 at block 68224 = 70 min for 242 blocks = ~3.5 blocks per minute
Estimating the overall rate from the sync start time to now:
Nov 8 ~1:30pm restarted Supervisor to sync
Nov 12 ~12:30 at block 68224
~4 days for ~68k blocks = ~11.8 blocks per minute
These measurements show that the sync slows down as it progresses: about 25 blocks per minute early on, ~11.8 on average, and ~3.5 most recently. In all cases, we are syncing so slowly that normal network progression will outpace the Supervisor's backfill.
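The arithmetic behind these rates can be reproduced with a short sketch. The block numbers and durations come from the measurements in this ticket. The 2-second block time is an assumption added only to illustrate how quickly the chain tip pulls away, and should be replaced with the devnet's real block time.

```python
def blocks_per_minute(blocks: int, minutes: float) -> float:
    """Average backfill rate over an observation window."""
    return blocks / minutes


# Spot check from the ArgoCD logs: block 67982 to block 68224 in 70 minutes.
spot_rate = blocks_per_minute(68224 - 67982, 70)

# Overall estimate: roughly 4 days from restart to reach block 68224.
overall_rate = blocks_per_minute(68224, 4 * 24 * 60)

# ASSUMPTION (not stated in this ticket): one new block every 2 seconds.
ASSUMED_BLOCK_TIME_SECONDS = 2
chain_rate = 60 / ASSUMED_BLOCK_TIME_SECONDS

print(f"spot check: {spot_rate:.2f} blocks/min")
print(f"overall:    {overall_rate:.2f} blocks/min")
print(f"chain tip:  {chain_rate:.0f} blocks/min (assumed block time)")
print(f"backlog grows by {chain_rate - spot_rate:.2f} blocks/min at the spot-check rate")
```

Under that assumed block time, the backlog keeps growing at every measured rate, so the backfill would never reach the tip.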
Theories and Hypotheses (Potential Solutions)
While investigating the FetchReceipt logic, I found that the Supervisor appears to use the monorepo's shared RPC Client with a Basic Receipt Fetcher. I suspect that this basic fetcher is not designed for the heavy fetching the Supervisor is doing. The Supervisor also has minimal heuristics around fetching, and simply tries to backfill block-by-block. My guess is that this naive behavior is triggering backoffs and throttling.
Fixes:
The Supervisor should be given more sophisticated backfilling techniques, including batch calls when it is missing large gaps. Batch calls alone would probably fix this issue, as tens of blocks' worth of receipts could be fetched per API call.
The Supervisor could also be given multiple Execution Endpoints to call, to further spread the request load.
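As a rough illustration of both fixes, the sketch below splits a missing block range into chunks, builds one JSON-RPC 2.0 batch request per chunk, and rotates chunks across several execution endpoints. It is not the Supervisor's actual implementation. The helper names, the batch size of 50, and the endpoint URLs are all hypothetical. It also assumes the execution client supports `eth_getBlockReceipts` and accepts batches of that size.

```python
import itertools
import json
import urllib.request


def chunk_ranges(start: int, end: int, size: int):
    """Split the inclusive block range [start, end] into chunks of at most `size` blocks."""
    for lo in range(start, end + 1, size):
        yield lo, min(lo + size - 1, end)


def build_receipts_batch(first_block: int, last_block: int) -> list:
    """Build one JSON-RPC 2.0 batch: one eth_getBlockReceipts request per block."""
    return [
        {"jsonrpc": "2.0", "id": n, "method": "eth_getBlockReceipts", "params": [hex(n)]}
        for n in range(first_block, last_block + 1)
    ]


def post_batch(endpoint: str, batch: list, timeout: float = 30.0) -> list:
    """Send a batch in a single HTTP request and return the decoded responses."""
    request = urllib.request.Request(
        endpoint,
        data=json.dumps(batch).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())


def plan_backfill(start: int, end: int, batch_size: int, endpoints: list) -> list:
    """Assign each chunk to an endpoint in round-robin order."""
    rotation = itertools.cycle(endpoints)
    return [(next(rotation), lo, hi) for lo, hi in chunk_ranges(start, end, batch_size)]


# Hypothetical endpoints; the block range is the spot-check window from this ticket.
plan = plan_backfill(67982, 68224, 50, ["http://el-a:8545", "http://el-b:8545"])
for endpoint, lo, hi in plan:
    print(endpoint, lo, hi, len(build_receipts_batch(lo, hi)))
```

With these numbers, the spot-check window needs 5 HTTP requests instead of one per block. A real implementation would also need to handle per-request errors inside a batch response and retry failed chunks against a different endpoint.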
Testing
We'll have to try purging a local devnet DB after a long period of activity and then observe the Supervisor's RPC behavior as it backfills. This issue may not appear in local environments, where local infrastructure makes network calls cheaper.
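One way to make the RPC behavior observable during such a test is to tally call outcomes and check how many requests are being throttled. The recorder below is a hypothetical helper, not existing Supervisor code. It assumes throttling shows up as HTTP 429 responses, which depends on the RPC provider. The sample calls are synthetic and only show usage.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class RpcStats:
    """Tally of RPC outcomes keyed by (method, HTTP status), plus total time spent."""

    outcomes: Counter = field(default_factory=Counter)
    total_seconds: float = 0.0

    def record(self, method: str, status: int, seconds: float) -> None:
        self.outcomes[(method, status)] += 1
        self.total_seconds += seconds

    def total_calls(self) -> int:
        return sum(self.outcomes.values())

    def throttled_fraction(self) -> float:
        """Share of calls answered with HTTP 429 (assumed throttling signal)."""
        total = self.total_calls()
        if total == 0:
            return 0.0
        throttled = sum(
            count for (_, status), count in self.outcomes.items() if status == 429
        )
        return throttled / total


# Synthetic samples for illustration only, not measurements from the devnet.
stats = RpcStats()
for status in (200, 200, 200, 429):
    stats.record("eth_getBlockReceipts", status, 0.1)

print(f"{stats.total_calls()} calls, {stats.throttled_fraction():.0%} throttled")
```

Comparing the throttled fraction between a local devnet and the hosted devnet would help confirm or rule out the throttling hypothesis.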
Priority
Critical for a stable devnet. Not blocking local devnets or other testing.