bluealloy / revm

Rust implementation of the Ethereum Virtual Machine.
https://bluealloy.github.io/revm/
MIT License
1.65k stars 554 forks source link

feat: Introduce ByteCode format #121

Closed rakita closed 2 years ago

rakita commented 2 years ago

With a restriction that contacts can not start with 0xEF and with the possibility that in future we will have different ByteCode for EVM Object Format, it makes sense to introduce support for it into the revm. https://eips.ethereum.org/EIPS/eip-3541

The biggest benefit atm would be introducing ByteCode type that will contain jumptable and gas blocks and with that we would skip the needed analysis of the contract. This is a fairly big speed-u for some use cases noticed in the foundry as fuzzing tests call a lot of contracts with not much execution and analysis, in that case, is taking a lot of time.

The proposal is to introduce enum ByteCode that would somehow return: code, code_size and jumptable.

Basically replace this input with ByteCode: https://github.com/bluealloy/revm/blob/08528247759bcd113c0c5385ee081e20e11cb80b/crates/revm/src/interpreter/contract.rs#L60 and propagate change wherever needed. ByteCode should set these fields in the contract:

and do the same for database struct:

To fully circle the story with EVM Object Format, CREATE and CREATE2 will need to return 0xEF type format in bytes that would get transformed to ByteCode for evm usage here:
https://github.com/bluealloy/revm/blob/08528247759bcd113c0c5385ee081e20e11cb80b/crates/revm/src/evm_impl.rs#L395-L399 But this is a consensus rule and out of scope for this feat, we should just continue using legacy ByteCode here. If needed we could in Inspector create_end transform that into ByteCode.

gakonst commented 2 years ago

cc @onbjerg @brockelmore for when we'll want to further speedup fuzz tests

onbjerg commented 2 years ago

We would need a utility in either Forge or ethers-rs to convert from the old format to the new format given that the Solidity compiler does not support EOF yet

rakita commented 2 years ago

@onbjerg shoudn't utility be inside revm?

What i thought is to create/create2 use

ByteCode::Legacy {
  bytes: Bytes, 
} 

And you would call something like ByteCode:Analyse(bytes) inside create_end that would replace bytecode in subroutine with

Bytecode::Analysed {
  code:
  size:
  analysis: 
}

Or just directly use it inside the db.

onbjerg commented 2 years ago

Makes sense, however I'm not sure what the gains are for Forge here, since we run each test in a database we later discard. The main benefactor of an optimization here would be fuzz tests, but since we discard the database and cannot be sure that the same bytecode will be at the same address in the next run, we can't really change that behavior - is there any way we can have a performant-ish bytecode cache based on the bytecode itself instead of addresses, that we could then reuse across test runs?

rakita commented 2 years ago

@onbjerg hm so contracts for fuzz tests are created at evm runtime. Can you run with me an outline of how fuzzing is done?

onbjerg commented 2 years ago

Setup phase (this state IS persisted):

Then, for each fuzz run we decide on some parameters using different strategies (not important for this outline) and we:

In most cases we will probably end up deploying the same contracts, however, we don't enforce that that is the case, so some users might conditionally deploy extra contracts based on the fuzz inputs

Does that make sense?

rakita commented 2 years ago

This would help only to preanalyse contracts deployed inside Setup phase, as you could save analysed contracts in db. But if the ratio of contracts made by fuzz tests vs setup stage is significant, this will not mean much.

rakita commented 2 years ago

I created three types of bytecode:

Raw and Checked are usually similar (checked is slightly better), while analysed bytecode execution is x1.5-2 faster.

Other then introducing Bytecode i added some more packed versions of jumptable and OpCode Info that got us a slight boost but nothing major. Fields for them became private but there should be alternative functions that can be called.

Just to visually compare it. First flamegraph is only raw bytecode, while the second flame graph is analysed bytecode. First (Raw elapsed time: 459.801385ms): flamegraph Second ( Analysed elapsed time: 261.872689ms): flamegraph

gakonst commented 2 years ago

Once the Bytecode is analyzed, can we somehow cache it across instantiations of the EVM? e.g. in Foundry we instantiate a new EVM each time, so it'd be nice if we can store the analyzed bytecode and initialize the EVM with the already analyzed one

rakita commented 2 years ago

Once the Bytecode is analyzed, can we somehow cache it across instantiations of the EVM? e.g. in Foundry we instantiate a new EVM each time, so it'd be nice if we can store the analyzed bytecode and initialize the EVM with the already analyzed one

Database trait now returns Bytecode https://github.com/bluealloy/revm/pull/156/files#diff-2955e45b5fcaef33495479dce9f6bd07c88fbb96b6f633a4b31e9362a12adedbR22 So it becomes the cache that you can move between new EVM instances.

Here is an example of how one analysed contract is called multiple times: https://github.com/bluealloy/revm/pull/156/files#diff-c28df8837c5c77a683c9974376c1b53605fe01730f1b81b25c38771d58ecda6bR52-R59

Does this work for you?

onbjerg commented 2 years ago

I think that it should work, yeah. An update of REVM in Foundry should probably provide a noticeable speedup for fuzz tests that deploy their contracts in setUp, but making it work for fuzz tests that deploy contracts in the fuzz test itself will need some more work

Will give it a shot soon and report back :smile:

onbjerg commented 2 years ago

Seems to have shaved off a second of solmate's testing time on my machine with no special handling added! (just replaced the Bytes stuff with Bytecode::new_raw where needed)

image

Left some comments on the PR, but this is great work :smile: Thank you!

(Note: I couldn't believe it so I recompiled master locally to make sure it wasn't my local Rust compiler being smart, but same results - using the new bytecode format is faster)

rakita commented 2 years ago

Okay, i didnt expect it to be this big of a difference. Can you do two flamegraph between executions to better see where execution went down?

And thank you for review, will make those changes before pr merge.

rakita commented 2 years ago

i needed to use --no-inline to make flamegraph finish and needed to set all debug=0 to debug=true inside foundry/Cargo.toml so that debug symbols are present. it is painful to build the foundry :) but i manage to create some graphs.

This is running solmate on old foundry: flamegraph1 As we can see call_inner-> revm::interpreter::contract::Contract::new takes a lot of time to process

sidenote: Additional improvement can potentially be done inside insert_account_info as it takes a lot of time to do hash of bytecode. We can probably precompute that. But this is the task for the future.

Second flamegraph is with the new revm and it does not have any Contact::new section. I tried to inline it and change code calls for it to show up but i got nothing flamegraph

I did optimize to_analysis to be better and more compact but results on the real contract test are amazing.

@onbjerg instead of new_raw(bytes) you could use new_raw(bytes).to_checked() this will speed it up a little bit and it is a small change.

onbjerg commented 2 years ago

@rakita For sure! Thanks for the pointer. Are there any other changes you see having some impact we can do? I noticed you changed some stuff about the OpInfo struct too, so maybe there's something we can use there in coverage or the debugger?

In some places (the debugger and coverage) we also build our own PC -> IC maps and it's not really ideal. Do you think it is possible to reuse the analysis made by REVM for this purpose as well? For context, source maps in Solidity use instruction counters (e.g. the offset of the instruction in the bytecode minus any push bytes), but obviously during execution we get the program counter instead.

rakita commented 2 years ago

I just transformed internal representation of OpInfo so it is memory aligned to u32, it has same functionality as before it is just a little bit more packed, it is packed as: IS_JUMP (1bit) | IS_GAS_BLOCK_END (1bit) | IS_PUSH (1bit) | gas (29bits)

Bytecode::Analysed only contains JUMPDEST and gas block it does not contain where PUSH'es are. You can probably check the code in revm and copy parts of it. And don use Maps here: https://github.com/foundry-rs/foundry/blob/8016f26e160cb574c8c4f6e97d76e1386f956902/evm/src/coverage/visitor.rs#L473 use vector if you can

rakita commented 2 years ago

Just for comparison, on old version of revm 1.7 on same test, i am getting around ~570ms. That is boost of around ~10% for raw and ~20% for checked bytecode.

Foundry uses inspector and OpInfo, there is probably some more performance gains there and it still depends a lot on what contract is analysed. Didnt expect solmate test to be so fast.

It would be fun to see how we can integrate Analysed bytecode, it should be even faster than Checked. On create i am calling to_analysed, now it makes sense for me where the speedup is.

gakonst commented 2 years ago

gg