This PR adds a new GitHub Actions workflow that runs our benchfx benchmarks on the PR and compares performance against the PR base (in most cases, `main`).
The necessary setup and the benchmark runs themselves are handled by the benchfx `harness.py`.
The only minor complication is that I'm using the cache feature of GitHub Actions to speed up builds. Everything related to caching is documented extensively in the workflow file.
Currently, the workflow is set up so that any regression of more than 7% causes the job to fail. I've found the GitHub workers to be surprisingly stable (almost all of my test runs showed performance differences within ±1% when comparing the same Wasmtime commit against itself), but there was a single case where the difference was 5% due to noise.
The 7% threshold can easily be updated if we find it causes too many false failures. I don't have strong feelings about whether or not we make this job part of the branch protection rules (i.e., things that must succeed before merging). The CI job also makes it easy to see how much a PR improves performance: the "Run benchmarks" step of the job prints an overview at the end. Another benefit, beyond catching performance regressions, is that this job would catch a PR that accidentally breaks any of the benchmarks.
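To make the failure condition concrete, here is a minimal sketch of the kind of check described above. It is not the actual benchfx `harness.py` logic: the function names, the dictionary layout, and the sample numbers are all hypothetical, and it assumes the measurements are runtimes where lower is better.

```python
# Hypothetical sketch; none of these names come from benchfx's harness.py.
REGRESSION_THRESHOLD = 0.07  # fail on regressions of more than 7%


def relative_change(base_time, pr_time):
    """Relative runtime change of the PR versus its base (positive means slower)."""
    return (pr_time - base_time) / base_time


def find_regressions(base, pr, threshold=REGRESSION_THRESHOLD):
    """Return {benchmark name: relative change} for slowdowns above the threshold."""
    regressions = {}
    for name, base_time in base.items():
        change = relative_change(base_time, pr[name])
        if change > threshold:
            regressions[name] = change
    return regressions


# Made-up timings in seconds, for illustration only.
base_times = {"bench_a": 1.00, "bench_b": 2.00}
pr_times = {"bench_a": 1.05, "bench_b": 2.20}

regressions = find_regressions(base_times, pr_times)
for name, change in regressions.items():
    print(f"{name}: {change:+.1%} slower than base")

# A CI step would typically exit non-zero when `regressions` is non-empty,
# which is what makes the job fail.
```

With these sample numbers, a 5% slowdown (the worst noise I observed) stays under the threshold, while a 10% slowdown is reported as a regression.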
This resolves #28