NomicFoundation / hardhat

Hardhat is a development environment to compile, deploy, test, and debug your Ethereum software.
https://hardhat.org

Add benchmarking utility for smart contracts #5688

Closed ryanc-bs closed 2 months ago

ryanc-bs commented 2 months ago

Describe the feature

When designing smart contracts, typically it is important to optimise the gas usage of the more complex methods. Hardhat provides a useful test utility for functional testing, but it would be great to have a separate benchmark utility - similar to how the Go language provides a separate benchmark utility accessible via go test -bench=.

There already exists a plugin for hardhat to show the gas used during a test but this has several limitations:

In addition, it would be helpful if Hardhat provided benchmarking utilities such as randomised inputs and a way to adjust the number of benchmark iterations dynamically based on the elapsed time. Ideally it would also offer the option to dump the gas usage data to a CSV or other output file as well as report it to the console.
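To make the "dynamic iterations" idea concrete, here is a minimal sketch of a time-based loop, loosely modelled on how Go benchmarks grow their iteration count. It is only an illustration: `runAdaptive`, `BenchResult`, and the tenfold growth factor are hypothetical and not part of Hardhat, Mocha, or any plugin.

```typescript
// Hypothetical helper: not part of Hardhat, Mocha, or hardhat-gas-reporter.
interface BenchResult {
  iterations: number; // size of the final batch
  elapsedMs: number;  // wall-clock time taken by the final batch
}

async function runAdaptive(
  body: (i: number) => Promise<void> | void,
  targetMs: number = 1000,
  maxIterations: number = 1000000,
  now: () => number = () => Date.now() // injectable clock, handy for testing
): Promise<BenchResult> {
  let n = 1;
  for (;;) {
    const start = now();
    for (let i = 0; i < n; i++) {
      await body(i);
    }
    const elapsedMs = now() - start;
    // Stop once a batch ran long enough, or the iteration cap is reached.
    if (elapsedMs >= targetMs || n >= maxIterations) {
      return { iterations: n, elapsedMs };
    }
    // Otherwise retry with a batch ten times larger (an arbitrary choice).
    n = Math.min(n * 10, maxIterations);
  }
}
```

In a real benchmark the `body` callback would presumably send a transaction and record the gas it used; that part is left out here because it depends on the contract under test.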

Search terms

benchmark gas tool testing

ryanc-bs commented 2 months ago

Additionally, I'm not sure whether Hardhat currently supports this, but it would be useful to be able to pass an exclude pattern analogous to the --grep include pattern when running tests.

E.g. if I run npx hardhat test --exclude benchmark it should not run any tests with "benchmark" in the name.

As mentioned above a separate command/utility for running benchmarks would be ideal but in the short term this could help as a workaround.

cgewecke commented 2 months ago

Hi @ryanc-bs, 👋 I help maintain the hardhat-gas-reporter plugin which Hardhat uses to profile contract code. Just leaving some notes here about how you might accomplish some of the things you mention in lieu of a dedicated feature:

Measuring pure and view functions

The latest major version of the gas-reporter (v2) supports this. You can install it by running:

npm install hardhat-gas-reporter@latest --save-dev

...and turn on pure/view measurement by configuring the reporter as below:

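// hardhat.config.ts
import { HardhatUserConfig } from "hardhat/config";
import "hardhat-gas-reporter"; // registers the gasReporter config field

const config: HardhatUserConfig = {
  gasReporter: {
    reportPureAndViewMethods: true, // also measure pure and view calls
  },
};

export default config;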

Grepping tests

Mocha (Hardhat's default test runner) has a grep option you can configure in the Hardhat config, and you can toggle it on and off using environment variables:

Example

// Tag in a mocha test description...
it("is a gas usage simulation [ @benchmark ]", async function(){
 ...
})

// hardhat.config.ts
const config: HardhatUserConfig = {
  mocha: {
    grep: "@benchmark", // Find everything with this tag
    invert: true        // Run the grep's inverse set (i.e. exclude matching tests)
  }
}
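The environment-variable toggle mentioned above could look roughly like the sketch below. The variable name `BENCHMARKS`, its values `only` and `skip`, and the helper `benchmarkFilter` are all made up for illustration; only the `grep` and `invert` options come from the config shown above.

```typescript
// Mirrors the two Mocha options used in the config above.
interface MochaFilter {
  grep?: string;
  invert?: boolean;
}

// Hypothetical helper: maps an environment variable value to Mocha options.
function benchmarkFilter(mode: string | undefined): MochaFilter {
  switch (mode) {
    case "only":
      return { grep: "@benchmark" }; // run only the tagged tests
    case "skip":
      return { grep: "@benchmark", invert: true }; // run everything else
    default:
      return {}; // no filtering
  }
}

// Assumed wiring in hardhat.config.ts:
//   const config: HardhatUserConfig = {
//     mocha: benchmarkFilter(process.env.BENCHMARKS),
//   };
```

With that wiring, running `BENCHMARKS=skip npx hardhat test` would leave out the tagged tests, and `BENCHMARKS=only npx hardhat test` would run just the benchmarks.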

Post-processing Gas Data / Other formats

You can dump all the data collected by the gas-reporter as JSON and generate / render your own report by setting the options below:

const config: HardhatUserConfig = {
  gasReporter: {
    outputJSON: true,
    outputJSONFile: "benchmarks.json",
  }
}
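As a rough idea of what a rendering script might do with that file, here is a sketch that turns per-method gas samples into CSV. The `GasRow` shape is a simplified, hypothetical stand-in and not the reporter's actual JSON schema, so you would need to map the real file contents onto it first. Reading `benchmarks.json` from disk is left out for the same reason.

```typescript
// Hypothetical, simplified record; NOT the reporter's real output schema.
interface GasRow {
  contract: string;
  method: string;
  gasData: number[]; // one gas measurement per call
}

// Quote a CSV field, since method signatures can contain commas.
function quote(field: string): string {
  return '"' + field.replace(/"/g, '""') + '"';
}

function toCsv(rows: GasRow[]): string {
  const lines = ["contract,method,calls,min,max,mean"];
  for (const row of rows) {
    if (row.gasData.length === 0) continue; // nothing measured for this method
    const sum = row.gasData.reduce((a, b) => a + b, 0);
    lines.push(
      [
        quote(row.contract),
        quote(row.method),
        row.gasData.length,
        Math.min(...row.gasData),
        Math.max(...row.gasData),
        Math.round(sum / row.gasData.length),
      ].join(",")
    );
  }
  return lines.join("\n");
}
```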

The JSON output exposes everything you'd need to run more complex analyses. Some report generation examples you could use as a base for your own rendering script are:

Randomised Inputs

You might look into the fast-check package for this. It's a property-based testing harness with a good API for random value generation, and it's agnostic about the test framework (e.g. it can be integrated with mocha).
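A minimal sketch of generating benchmark inputs with fast-check, assuming the package is installed as a dev dependency. The batch shape (1 to 50 amounts), the bounds, and the `benchmarkInputs` helper are illustrative choices, not something prescribed by fast-check or Hardhat.

```typescript
import fc from "fast-check";

// Arbitrary for a batch of 1 to 50 amounts; the bounds are illustrative.
const batchArb = fc.array(fc.integer({ min: 1, max: 1000000 }), {
  minLength: 1,
  maxLength: 50,
});

// Hypothetical helper: a fixed seed makes every run see the same inputs,
// which keeps gas numbers comparable between runs.
function benchmarkInputs(runs: number, seed: number): number[][] {
  return fc.sample(batchArb, { numRuns: runs, seed });
}

const inputs = benchmarkInputs(20, 42);
```

Each entry of `inputs` could then be passed to the contract call inside a test tagged as a benchmark.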

ryanc-bs commented 2 months ago

Thanks very much for the detailed response. I was not aware of the support for profiling pure functions, the inverse grep option or the ability to output gas data to JSON - I will try out these things!

I still think a separate benchmark mode with cases that are clearly differentiated from the functional tests would be nice, but I suppose that would be more of a feature request for Mocha rather than Hardhat.

cgewecke commented 2 months ago

"I still think a separate benchmark mode with cases that are clearly differentiated from the functional tests would be nice"

Agree! Think that's a great idea. It might make sense as part of a dedicated fuzzing feature here as well.