EVM Storage: Baseline test

aakoshh commented 2 years ago

Part of https://github.com/filecoin-project/ref-fvm/issues/859 Depends on https://github.com/filecoin-project/ref-fvm/issues/887

Create a new module under builtin-actors/actors/evm/tests that runs a couple of scenarios with the test contract and captures the statistics about storage access.

Ultimately gas for storage access is charged in the DefaultKernel, which are called through the Wasm runtime using the methods in sys::ipld, which in turn is called by the ActorBlockstore. However in this repo tests are using a MockRuntime with a MemoryBlockstore that doesn't track costs.

We have a choice of how we run the test:

Add a dev-dependency on ref-fvm and call the contract through Wasm to run all the gas cost machinery.
Add the option of storage access tracking to the memory store used by the MockRuntime. This won't track the gas cost explicitly, but they will still be highly correlated.

The second option seems more straight forward and easier to implement. We need to track the following: 1) the number of times get/put is called in the Blockstore and 2) the total number of bytes read/written. These both things we aim to improve by tweaking the HAMT store.

The general idea in each test is to invoke a method of the test contract multiple times, capture the cost of each invocation, and write them to a jsonline named after the test, e.g. <scenario-name>.jsonline with content like {i: 100, get_count: 12, get_bytes: 12345, put_count: 1, put_bytes: 2345} which we can then feed to gnuplot for visualisation. We might want to add some approximate gas cost to represent everything as a single number based on the price lists.

The test should exercise the following scenarios (free to add more):

Append to a single item to the dynamic array, invoke N times.
Append M items to the dynamic array, invoke N times.
Read M consecutive items from the array, invoke N times.
Add a single item to the mapping, invoke N times.
Increment the counter times, invoke N times.
Add 256 items to the mapping to saturate the root node then increment the counter N times.

These are tests based on hypotheses about potential improvements we can make to the HAMT.

anorth commented 2 years ago

If this is about testing code in the EVM actor, this issue probably belongs in the builtin-actors repo alongside that code. Shall I transfer it?

Please do not add a dependency from there on ref-fvm. The VM implementation should be abstracted from the actors. If you were instead trying to benchmark performance of code that's actually part of the VM, rather than an actor, then I'd suggest implementing an actor to specifically exercise whatever you're after within the ref-fvm repo itself. But I think that's not what this is about.

You could add stats to the mock runtime, and that might be sufficient here.
Consider instead adding them to the integration test runtime, which will be useful in a wider range of situations, including calls between actors. The Go version from which this was ported has some metrics, but for reasons on which I am ignorant, that wasn't ported to the Rust version. I recommend porting these stats from Go.
Alternatively, if you're just profiling a single data structure, you don't even need to use any runtime. Just exercise the datastructure directly from some unit tests, and use a fake Blockstore that you instrument yourself. This would be the simplest (least code) approach, but a bit more limited in applicability.

FYI @ZenGround0, who also has experience optimising HAMTs.

raulk commented 2 years ago

@anorth no, please don’t transfer. We are tracking EVM issues in this repo for simplicity in workflows of the FVM team.

aakoshh commented 2 years ago

Thanks for the pointers @anorth I will have a look. I agree that for this we shouldn't have to pull in a dependency on ref-fvm. And moving an EVM specific test like this there is equally unappealing.

I think it's worth running the EVM in this test to actually capture what it does because the semantics we want to optimise for are actually that of the Solidity compiler. We might not even want to use a HAMT, it's just been realised by the team that it's a poor fit, but maybe not beyond redemption.

Regarding the transfer to the builtin-actors: I actually did that as well, then brought it back after asking @maciejwitowski because the labels he recommended aren't available there. He said they would prefer the ticket here, easier to manage in one place. I don't mind either way. There are quite a number of EVM related issues besides these already in ref-fvm 🤷

raulk commented 2 years ago

@anorth The goal here is to measure overall gas costs of these data structures. Blockstore costs are only a part of that; execution cost is only measurable by compiling to Wasm bytecode and running it in a Wasm runtime, hence the need to use the integration test framework and not an abstraction. In fact, the measurements we need to get here are sensitive to the Rust compiler version, the Wasmtime version, compiler flags, etc.

What's wrong with adding a dev dependency on ref-fvm from builtin-actors? I would imagine that a critical part of the testing flow of builtin-actors (present and eventual) is to run high-fidelity benchmarks on CI (this is a weakness in our dev workflow right now), smoke test deploying actors a true Wasm environment (sometimes introducing a wrong dependency would cause imports that are incompatible with the FVM, e.g. wasm bindgen, happend a few weeks ago), etc.

The alternative is to place these integration tests in a third repo, but it just feels like the perfect recipe for things to fall out of sync, and to add cognitive overhead and code complexity.

anorth commented 2 years ago

We are tracking EVM issues in this repo for simplicity in workflows of the FVM team.

I disagree and commit. We can probably work with that, but please make an effort to actively publish discussion about the commons that is the built-in actors repo and its tooling to other actor developers. This is a perfect example.

What's wrong with adding a dev dependency on ref-fvm from builtin-actors?

I ack the discussion about benchmarking the builtin actors generally (and it sets an example for all others). I'm not immovable on it but we should have a serious discussion about options and tradeoffs. I'm going to move that general discussion to builtin-actors https://github.com/filecoin-project/builtin-actors/issues/678 for better visibility.

raulk commented 2 years ago

We can probably work with that, but please make an effort to actively publish discussion about the commons that is the built-in actors repo and its tooling to other actor developers.

Deal.

I'm going to move that general discussion to builtin-actors https://github.com/filecoin-project/builtin-actors/issues/678 for better visibility.

👍 👀 Thanks for jumping on this.

aakoshh commented 2 years ago

FWIW I think it's the gas that approximates the cost of these real underlying variables (data volume and number of blocks accessed), so for this test, it's more meaningful to measure them directly. How much gas to attribute to them would be based on measuring the operations against wall clock time to figure out appropriate coefficients. If we have the price list play a role in these tests, misconfiguration or changes to the coefficients might distort the results or just make them more difficult to reason about.

I would leave involving ref-fvm to where we want to relate gas to time on a standardised machine setup.

filecoin-project / ref-fvm

EVM Storage: Baseline test #888