Add benchmarking utility for smart contracts

ryanc-bs commented 2 months ago

Describe the feature

When designing smart contracts, typically it is important to optimise the gas usage of the more complex methods. Hardhat provides a useful test utility for functional testing, but it would be great to have a separate benchmark utility - similar to how the Go language provides a separate benchmark utility accessible via go test -bench=.

There already exists a plugin for hardhat to show the gas used during a test but this has several limitations:

Only state-altering transactions are reported, meaning it cannot be used to benchmark pure or view functions.
Non-benchmark test cases are included unless filtered out explicitly (e.g. via --grep benchmark).
It shows the min, max and average gas used but no indication of other useful statistics like the variance or median.
It is not straightforward to separate benchmarks from functional tests - ideally benchmarks should only be run in a benchmarking mode and not as part of a regular test run (i.e. npx hardhat test with no extra args should not run benchmark tests)

In addition, it would be useful if Hardhat provided benchmarking utilities, such as generating randomised inputs and adjusting the number of benchmarking iterations dynamically based on the elapsed time. Ideally it would also offer the option to dump the gas usage data to a CSV or other output file, as well as reporting to the console.
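To make the statistics request concrete, here is a minimal sketch of the kind of summary a benchmark mode could report for one method. The `summarizeGas` helper and the `GasSummary` shape are hypothetical names used for illustration only; they are not part of Hardhat or any existing plugin.

```typescript
// Hypothetical helper: summarise the gas used by one method across
// several benchmark iterations. Not part of Hardhat or any plugin.
interface GasSummary {
  count: number;
  min: number;
  max: number;
  mean: number;
  median: number;
  variance: number; // population variance (divides by count)
}

function summarizeGas(samples: number[]): GasSummary {
  if (samples.length === 0) {
    throw new Error("summarizeGas needs at least one sample");
  }
  // Sort a copy so the caller's array is left untouched.
  const sorted = [...samples].sort((a, b) => a - b);
  const count = sorted.length;
  const mean = sorted.reduce((sum, x) => sum + x, 0) / count;
  const mid = Math.floor(count / 2);
  // For an even number of samples, take the midpoint of the two middle values.
  const median =
    count % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  const variance =
    sorted.reduce((sum, x) => sum + (x - mean) ** 2, 0) / count;
  return { count, min: sorted[0], max: sorted[count - 1], mean, median, variance };
}
```

A single expensive outlier moves the mean much more than the median, which is why reporting both is useful when gas usage depends on the input.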

Search terms

benchmark gas tool testing

ryanc-bs commented 2 months ago

Additionally, I'm not sure whether hardhat currently supports this, but it would be useful to be able to pass an exclude pattern analogous to the --grep include pattern when running tests.

E.g. if I run npx hardhat test --exclude benchmark it should not run any tests with "benchmark" in the name.

As mentioned above a separate command/utility for running benchmarks would be ideal but in the short term this could help as a workaround.

cgewecke commented 2 months ago

Hi @ryanc-bs, 👋 I help maintain the hardhat-gas-reporter plugin which Hardhat uses to profile contract code. Just leaving some notes here about how you might accomplish some of the things you mention in lieu of a dedicated feature:

Measuring pure and view functions

The latest major version of the gas-reporter (v2) supports this. You can install it by running:

npm install hardhat-gas-reporter@latest --save-dev

...and turn on pure/view measurement by configuring the reporter as below:

// hardhat.config.ts
const config: HardhatUserConfig = {
  gasReporter: {
    reportPureAndViewMethods: true,
  }
}

Grepping tests

Mocha (hardhat's default test runner) has a grep option you can configure in the hardhat config. (This can be toggled on and off using environment variables):

Example

// Tag in a mocha test description...
it("is a gas usage simulation [ @benchmark ]", async function(){
 ...
})

// hardhat.config.ts
const config: HardhatUserConfig = {
  mocha: {
    grep: "@benchmark", // Find everything with this tag
    invert: true        // Run the grep's inverse set (i.e. 'exclude')
  }
}
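One way the environment-variable toggle could look is sketched below. The `BENCHMARK` variable name and the `benchmarkGrep` helper are assumptions made for illustration. Only the `grep` and `invert` Mocha options come from the config above.

```typescript
// Hypothetical helper: choose Mocha's grep settings from the environment.
// BENCHMARK=1 -> run only tests tagged @benchmark
// otherwise   -> run everything except tests tagged @benchmark
interface MochaGrepOptions {
  grep: string;
  invert: boolean;
}

function benchmarkGrep(
  env: Record<string, string | undefined>
): MochaGrepOptions {
  const benchmarkMode = env.BENCHMARK === "1";
  return { grep: "@benchmark", invert: !benchmarkMode };
}

// In hardhat.config.ts this might be wired up as:
//   mocha: benchmarkGrep(process.env)
```

With this in place, a plain `npx hardhat test` would skip the tagged benchmarks, and `BENCHMARK=1 npx hardhat test` would run only them.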

Post-processing Gas Data / Other formats

You can dump all the data collected by the gas-reporter as JSON and generate / render your own report by setting the options below:

const config: HardhatUserConfig = {
  gasReporter: {
    outputJSON: true,
    outputJSONFile: "benchmarks.json",
  }
}

This exposes everything you'd need to run more complex analyses. Some report generation examples you could use as a base for your own rendering script are:

the gas-reporter's renderers
OpenZeppelin's github actions PR comparison report
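As a rough sketch of a custom rendering step, the function below turns per-method gas samples into CSV. The input shape (`GasSamples`, a map from method label to an array of gas values) is a hypothetical intermediate format, not the reporter's actual JSON schema. You would need to check the generated file and extract the samples into this shape yourself.

```typescript
// Hypothetical intermediate shape: method label -> gas used per call.
// This is NOT the gas-reporter's JSON schema; map the real output into it.
type GasSamples = Record<string, number[]>;

function toCsv(samples: GasSamples): string {
  const rows = ["method,calls,min,max,mean"];
  for (const [method, gas] of Object.entries(samples)) {
    if (gas.length === 0) continue; // skip methods that were never called
    const min = Math.min(...gas);
    const max = Math.max(...gas);
    const mean = gas.reduce((sum, x) => sum + x, 0) / gas.length;
    // Quote the label: signatures such as transfer(address,uint256) contain commas.
    const label = `"${method.replace(/"/g, '""')}"`;
    rows.push([label, gas.length, min, max, Math.round(mean)].join(","));
  }
  return rows.join("\n");
}
```

The returned string can be written to a file with Node's `fs.writeFileSync` or printed to the console.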

Randomised Inputs

You might look into the fast-check package for this. It's a property-based testing harness with a good API for random value generation, and it's agnostic about the test framework (e.g. it can be integrated with mocha).

ryanc-bs commented 2 months ago

Thanks very much for the detailed response. I was not aware of the support for profiling pure functions, the inverse grep option or the ability to output gas data to JSON - I will try out these things!

I still think a separate benchmark mode with cases that are clearly differentiated from the functional tests would be nice, but I suppose that would be more of a feature request for Mocha rather than Hardhat.

cgewecke commented 2 months ago

> I still think a separate benchmark mode with cases that are clearly differentiated from the functional tests would be nice

Agree! Think that's a great idea. It might make sense as part of a dedicated fuzzing feature here as well.
