daimo-eth / daimo

Real world Ethereum
https://daimo.com
GNU General Public License v3.0

Fix ViemClient to support concurrent transactions #1266

Open dcposch opened 1 month ago

dcposch commented 1 month ago

Summary

Under load, when many people try to sign up or transact simultaneously, there's contention for the next nonce.

This has resulted in production downtime.

Proposed fix

Two options, not mutually exclusive:

In either case (that is, even with multiple EOAs each submitting at most one tx per block and no stacked pending nonces), transactions will sometimes fail without reverting during gas price spikes and sequencer issues.

In that case, we must retry the same nonce.
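A minimal sketch of the two pieces this implies: handing out nonces from one in-process counter so concurrent callers never collide, and re-pricing a dropped transaction while keeping its nonce. The names `NonceAllocator` and `bumpFees` are hypothetical and not part of viem or the daimo codebase. The 10% bump mirrors go-ethereum's default txpool price bump; whether the sequencer enforces the same threshold is an assumption.

```typescript
// Hypothetical helper types and names; nothing here is a viem API.
interface Fees {
  maxFeePerGas: bigint;
  maxPriorityFeePerGas: bigint;
}

// Hands out nonces from a local counter. allocate() is synchronous, so two
// async callers in the same Node.js process can never receive the same nonce.
class NonceAllocator {
  private next: number;

  constructor(startNonce: number) {
    // startNonce would come from the chain once at startup (assumption).
    this.next = startNonce;
  }

  allocate(): number {
    const nonce = this.next;
    this.next += 1;
    return nonce;
  }
}

// Raises both fee fields by bumpPercent, rounding up, for a same-nonce retry.
// ASSUMPTION: 10% matches go-ethereum's default replacement threshold; the
// sequencer's actual rule is the open question in this thread.
function bumpFees(fees: Fees, bumpPercent: bigint = 10n): Fees {
  const bump = (x: bigint): bigint => (x * (100n + bumpPercent) + 99n) / 100n;
  return {
    maxFeePerGas: bump(fees.maxFeePerGas),
    maxPriorityFeePerGas: bump(fees.maxPriorityFeePerGas),
  };
}
```

When a transaction is dropped without reverting, the caller would re-sign with the nonce it was originally given and the output of `bumpFees`, rather than calling `allocate()` again. This only covers a single process; multiple API servers sharing one EOA would need a shared counter or one EOA per process.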

Open question: what is the exact condition under which the sequencer returns "replacement transaction underpriced"?

autoregressive commented 1 month ago

Nonce Issue

I’m making big assumptions here, as I’m not super familiar with your systems. Generally speaking, though, you’re trying to solve the knapsack problem here. It’s likely only worth introducing parallelism (multiple EOAs) if you can reduce complexity (e.g. by splitting out txs that are not state dependent). It may be worthwhile to explore how block builders are implemented for inspiration: https://github.com/flashbots/rbuilder

For context, my team runs searchers across various EVM chains.

Gas Issue

Your gas pricing issue is likely due to two reasons:

  1. The node is returning the base fee for block n when you actually want the base fee for block n+1. The next block's base fee is trivial to compute given the block header for block n.
  2. L1 fee miscalculation. Most node implementations do not get this correct under heavy usage, and it is non-trivial to implement. https://docs.optimism.io/stack/transactions/fees
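As a sketch of the first point, the base fee for block n+1 can be derived from the header of block n using the EIP-1559 update rule. The function name `nextBaseFee` and the parameter object are hypothetical. The defaults below are the Ethereum mainnet constants (elasticity 2, max change denominator 8); OP Stack chains such as Base use different values, so the right parameters for the target chain are an assumption to verify. This does not cover the L1 fee component.

```typescript
// Fields taken from the header of block n.
interface ParentHeader {
  baseFeePerGas: bigint;
  gasUsed: bigint;
  gasLimit: bigint;
}

interface Eip1559Params {
  elasticityMultiplier: bigint;
  baseFeeMaxChangeDenominator: bigint;
}

// Ethereum mainnet constants. ASSUMPTION: L2s configure these differently,
// so check the chain's own config before relying on the defaults.
const MAINNET_PARAMS: Eip1559Params = {
  elasticityMultiplier: 2n,
  baseFeeMaxChangeDenominator: 8n,
};

// Computes the base fee of block n+1 from block n per the EIP-1559 rule.
function nextBaseFee(
  parent: ParentHeader,
  params: Eip1559Params = MAINNET_PARAMS,
): bigint {
  const gasTarget = parent.gasLimit / params.elasticityMultiplier;
  if (parent.gasUsed === gasTarget) {
    return parent.baseFeePerGas;
  }
  if (parent.gasUsed > gasTarget) {
    // Block was fuller than target: the fee rises by at least 1 wei.
    const delta =
      (parent.baseFeePerGas * (parent.gasUsed - gasTarget)) /
      gasTarget /
      params.baseFeeMaxChangeDenominator;
    return parent.baseFeePerGas + (delta > 1n ? delta : 1n);
  }
  // Block was emptier than target: the fee falls.
  const delta =
    (parent.baseFeePerGas * (gasTarget - parent.gasUsed)) /
    gasTarget /
    params.baseFeeMaxChangeDenominator;
  return parent.baseFeePerGas - delta;
}
```

With the mainnet defaults, a completely full parent block raises the base fee by 12.5% and an empty one lowers it by 12.5%.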

My team has largely solved this with custom node implementations.

dcposch commented 1 month ago

Curious why this requires a custom node implementation. Is this a forked L2 EL that provides RPC methods for accurate gas estimation?

autoregressive commented 1 month ago

Nonce/state

You may also want to consider eth_sendRawTransactionConditional if you haven’t already. Not sure if it’s been rolled out on Base yet.

Gas issue

Most node implementations return values for the end of block n when you are targeting the top of block n+1.

You can tell whether there are estimation issues as follows: estimate gas or generate an access list, then set your gasLimit as a fixed ratio of the estimated gas usage (e.g. gasUsed/0.9). Gas used should then be 90% of the limit; if it differs, something is wrong, especially for something simple like transfers. This presumes you can consistently hit the next block.
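The check described above can be sketched as follows. All function names are hypothetical, ratios are expressed in basis points to stay in integer arithmetic, and the 1% tolerance is an arbitrary assumption rather than a value from this thread.

```typescript
// Sets the gas limit so that the estimate is targetBps of the limit
// (9000 bps = 90%, i.e. estimate / 0.9), rounded up.
function gasLimitWithHeadroom(
  estimatedGas: bigint,
  targetBps: bigint = 9000n,
): bigint {
  return (estimatedGas * 10000n + targetBps - 1n) / targetBps;
}

// Utilization of the limit actually observed in the receipt, in basis points.
function utilizationBps(gasUsed: bigint, gasLimit: bigint): bigint {
  return (gasUsed * 10000n) / gasLimit;
}

// True when observed utilization is within toleranceBps of the target.
// ASSUMPTION: a 100 bps (1%) tolerance; tune it for the workload.
function estimateLooksAccurate(
  gasUsed: bigint,
  gasLimit: bigint,
  targetBps: bigint = 9000n,
  toleranceBps: bigint = 100n,
): boolean {
  const observed = utilizationBps(gasUsed, gasLimit);
  const diff = observed > targetBps ? observed - targetBps : targetBps - observed;
  return diff <= toleranceBps;
}
```

For example, an estimate of 90,000 gas yields a limit of 100,000; a receipt showing 90,000 gas used passes the check, while one showing 60,000 suggests the estimate was made against stale or incorrect state.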

Why does next block matter? Because that’s the most relevant state you’re working with - your gas params should in theory be deterministic.

One experiment to run is attempting to land transactions in specific blocks. You’ll find that you can’t do it consistently with node providers even if you bid crazy gas; this is due to latency and/or bad L1 fee estimates. The latency issue exacerbates everything.

In summary, the data you’re getting is potentially incorrect, is already behind out of the gate, and may be even further behind due to bad infra.

First step: run your own nodes. Then use either a custom RPC endpoint (heavy on infra costs to scale, though) or the approach we use, which is to read state directly from the backend DB the node writes to and load it into an EVM implementation. The latter can scale if architected correctly, since it won’t bog down your node.