perf: `anvil_mine` is much slower than `hardhat_mine`

fp-crypto commented 1 year ago

I attempted to move my testing from hardhat node to anvil. Some of my testing requires mining a large number of blocks. anvil_mine execution time seems to scale linearly relative to block count, whereas hardhat_mine with hardhat node is nearly constant. See the following benchmarks:

Anvil

In [1]: import timeit

In [2]: [timeit.timeit(stmt=(f'chain.mine({i})'), globals=globals(), number=50) for i in [1, 10, 100, 1000]]
Out[2]:
[0.07101033299113624,
 0.39050891700026114,
 6.034164124997915,
 276.3122650830046]

Hardhat

In [1]: import timeit

In [2]: [timeit.timeit(stmt=(f'chain.mine({i})'), globals=globals(), number=50) for i in [1, 10, 100, 1000]]
Out[2]:
[0.10906625000643544,
 0.09370395800215192,
 0.0902477499912493,
 0.08129970800655428]
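
To make the scaling easier to read, the totals above can be turned into a per-block cost. This is only a rough sketch: the numbers are copied from the two `Out[2]` listings, `CALLS` is the `number=50` passed to `timeit`, and the helper name `per_block_ms` is made up for illustration.

```python
# Total seconds for 50 calls each, copied from the Anvil and Hardhat runs above.
block_counts = [1, 10, 100, 1000]
anvil_totals = [0.07101033299113624, 0.39050891700026114,
                6.034164124997915, 276.3122650830046]
hardhat_totals = [0.10906625000643544, 0.09370395800215192,
                  0.0902477499912493, 0.08129970800655428]
CALLS = 50  # timeit's number=50


def per_block_ms(totals):
    """Average milliseconds spent per mined block, for each block count."""
    return [t / CALLS / n * 1000 for t, n in zip(totals, block_counts)]


anvil_ms = per_block_ms(anvil_totals)
hardhat_ms = per_block_ms(hardhat_totals)

for n, a, h in zip(block_counts, anvil_ms, hardhat_ms):
    print(f"{n:>5} blocks: anvil {a:.4f} ms/block, hardhat {h:.4f} ms/block")
```

With these figures the Anvil per-block cost at 1000 blocks comes out several times higher than at 100 blocks, so growth may be somewhat worse than linear at the top end, though a single run is thin evidence.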

I'm using:

anvil 0.1.0 (0e33b3e 2023-07-26T00:26:08.161934000Z)
hardhat 9.6.7
ape 0.6.14 (for testing)

https://github.com/foundry-rs/foundry/blob/41bae8e6265e905f73c3f4eac14a5ba9275417a4/anvil/src/eth/api.rs#L1421-L1440

fvictorio commented 1 year ago

The reason Hardhat's time is constant is that when more than N blocks are mined (I don't remember the value of N) they are not actually mined. Instead, a fake range of blocks is created and then the last ones are actually mined. This means that the hash of the latest block is kind of fake, but that's an acceptable trade-off.

There are some non-obvious things to keep in mind to implement this approach though. Happy to share them here if someone is interested in implementing this in Anvil.

janjakubnanista commented 8 months ago

What would be the way to go about implementing this @fvictorio?

orph3usLyre commented 1 month ago

Hi @fvictorio,

Following up on this issue since there hasn't been any activity for a while. Would it be possible to discuss implementing this functionality, either here or via email? We're currently using anvil extensively for testing and this functionality would save us a lot of CI time. I'm more than willing to tackle this issue if pointed in the right direction.

Cheers :tada:

fvictorio commented 1 week ago

Ok, this is mainly off the top of my head, so take it with a grain of salt.

The main thing here is that you obviously don't want to mine one million blocks or whatever, so what you do is to create a representation of a range of blocks that doesn't actually exist. Doing this means that the latest block won't be 100% correct (for example, you can't have a correct block hash without actually mining all the previous blocks), but that's fine for most people.
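
A toy illustration of why the hash can't be right without mining everything: each block hash depends on its parent's hash, so skipping the range means guessing the parent. This is not how Hardhat, EDR, or Anvil compute hashes; `toy_block_hash` and `mine_chain` are hypothetical helpers, and sha256 over three fields stands in for the real header hash.

```python
import hashlib


def toy_block_hash(parent_hash: bytes, number: int, timestamp: int) -> bytes:
    # Stand-in for a real header hash (assumption: sha256 over a few fields,
    # not keccak-256 over a full encoded header).
    payload = parent_hash + number.to_bytes(8, "big") + timestamp.to_bytes(8, "big")
    return hashlib.sha256(payload).digest()


def mine_chain(parent_hash: bytes, first_number: int, first_ts: int,
               count: int, interval: int) -> bytes:
    """Hash `count` consecutive blocks and return the hash of the last one."""
    h = parent_hash
    for i in range(count):
        h = toy_block_hash(h, first_number + i, first_ts + i * interval)
    return h


genesis = bytes(32)

# Really mining blocks 1..1000 threads every parent hash through the chain.
real_tip = mine_chain(genesis, 1, 1_000, 1000, 12)

# Jumping straight to block 1000 means the parent hash is a placeholder.
fake_tip = toy_block_hash(bytes(32), 1000, 1_000 + 999 * 12)

print(real_tip != fake_tip)
```

The last block has the same number and timestamp in both cases; only the parent hash differs, and that alone is enough to change the result.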

This is, broadly speaking, how Hardhat/EDR does it:

1. You get a request to mine N blocks.
2. The first thing you do is mine blocks until the mempool has no pending transactions (that is, txs that can be included in the next block; there could still be non-minable transactions in the mempool), or until you have mined N blocks. This mines M blocks (where M can be 0 if the mempool doesn't have any txs that can be mined).
3. If N == M, you are done. Otherwise, you subtract M from N.
4. If N is now below a certain threshold (6 in EDR), you just mine those blocks sequentially.
5. If N is above the threshold, you can make a "reservation" for that range:
   - First you mine one more block. Since the mempool doesn't have pending transactions, this block will be empty. We do this to simplify some of the logic that comes after it.
   - Then you create a reservation of N-2 blocks. This is a structure representing all the blocks in that range that weren't actually mined.
   - Finally, you mine one more empty block. This makes the latest block a real mined block, and not the end of the reservation.
6. When doing all of that, you need to get the timestamps right. If the block timestamp interval is 100 and you mine 1000 blocks, the new latest block has to have a timestamp of the previous latest timestamp plus 100_000.
7. If a block is fetched by number, and that number is within the reservation, you remove the reservation, create a block at that number, and create two new reservations before and after that block (or mine the blocks, if the "gap" is small enough).
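
The steps above could be sketched roughly as follows. This is a toy model in Python, not EDR or Anvil code: `ToyChain`, `Reservation`, `block_capacity`, and the list-based mempool are all invented for illustration, and blocks are reduced to a number and a timestamp. It also assumes N equal to the threshold is mined sequentially, and it always keeps the split sub-ranges as reservations instead of mining small gaps.

```python
from dataclasses import dataclass, field

THRESHOLD = 6  # the value quoted for EDR above


@dataclass
class Reservation:
    first: int           # first unmined block number in the range
    last: int            # last unmined block number (inclusive)
    prev_timestamp: int  # timestamp of the block just before `first`
    interval: int        # seconds between consecutive blocks

    def timestamp_of(self, number: int) -> int:
        return self.prev_timestamp + (number - self.first + 1) * self.interval


@dataclass
class ToyChain:
    interval: int = 1
    block_capacity: int = 1                              # txs per block (assumption)
    mempool: list = field(default_factory=list)          # minable txs only
    blocks: dict = field(default_factory=lambda: {0: 0}) # number -> timestamp
    reservations: list = field(default_factory=list)
    height: int = 0

    def timestamp(self, number: int) -> int:
        if number in self.blocks:
            return self.blocks[number]
        for r in self.reservations:
            if r.first <= number <= r.last:
                return r.timestamp_of(number)
        raise KeyError(number)

    def _mine_one(self) -> None:
        ts = self.timestamp(self.height) + self.interval
        self.height += 1
        self.blocks[self.height] = ts
        del self.mempool[: self.block_capacity]

    def mine(self, n: int) -> None:
        # Mine real blocks while minable txs remain (these are the M blocks).
        while n > 0 and self.mempool:
            self._mine_one()
            n -= 1
        # Small remainder: mine sequentially (N == THRESHOLD is an assumption).
        if n <= THRESHOLD:
            for _ in range(n):
                self._mine_one()
            return
        # One empty block, a reservation of N-2 blocks, one more empty block.
        self._mine_one()
        r = Reservation(self.height + 1, self.height + n - 2,
                        self.blocks[self.height], self.interval)
        self.reservations.append(r)
        self.height = r.last
        self._mine_one()

    def get_block(self, number: int) -> tuple:
        for r in self.reservations:
            if r.first <= number <= r.last:
                # Materialize the block and split the reservation around it.
                self.reservations.remove(r)
                self.blocks[number] = r.timestamp_of(number)
                if r.first < number:
                    self.reservations.append(
                        Reservation(r.first, number - 1, r.prev_timestamp, r.interval))
                if number < r.last:
                    self.reservations.append(
                        Reservation(number + 1, r.last, self.blocks[number], r.interval))
                break
        return number, self.blocks[number]


chain = ToyChain(interval=100)
chain.mine(1000)
print(chain.height, chain.timestamp(chain.height), len(chain.blocks))
print(chain.get_block(500), len(chain.reservations))
```

In this model the cost of `mine(1000)` is a handful of real blocks plus one small record, and the latest timestamp still advances by N times the interval because the reservation carries enough information to compute any timestamp inside it.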

We have a lot of tests for this, starting here: https://github.com/NomicFoundation/edr/blob/e0c927d337671cbba061caab7ad35db627ecf183/hardhat-tests/test/internal/hardhat-network/provider/modules/hardhat.ts#L303

You can also start reading EDR code from here for more details on how this works.

I'm pretty sure that there are corners that can be cut here, but we are normally quite obsessive about being as correct as possible (e.g., correct timestamps) and this is the way we managed to do it.

Sorry for the messy explanation and hopefully that helps!

foundry-rs / foundry

perf: `anvil_mine` is much slower than `hardhat_mine` #5499
