
chore(ci): onboard bencher for tracking benchmarks #1174

Open · orhun opened 3 weeks ago

orhun commented 3 weeks ago

closes #1092

https://bencher.dev/console/projects/ratatui-org

I'm not sure how this works with PRs. Should we run it only on pushes to main? (cc @epompeii)

@ratatui-org/maintainers I can send an invite via email for the Bencher organization; I'm guessing no registration is needed.

orhun commented 3 weeks ago

Uuh, I'm 99% sure I'm using the correct token but something is wrong 🤔

epompeii commented 3 weeks ago

> I'm not sure how this works with PRs. Should we run it only on pushes to main? (cc @epompeii)

For the general concepts of working with feature branches:

For an example of how to benchmark PRs from forks in GitHub Actions:

> @ratatui-org/maintainers I can send an invite via email for the Bencher organization; I'm guessing no registration is needed.

Whoever you invite can sign up for a Bencher account with the invite. They don't need to already be a user.

epompeii commented 3 weeks ago

> Uuh, I'm 99% sure I'm using the correct token but something is wrong 🤔

Have you set the BENCHER_API_TOKEN as a repository secret for ratatui-org/ratatui?

> 🐰 Make sure you have created an API token and set it as a Repository secret named BENCHER_API_TOKEN before continuing on! Navigate to Your Repo -> Settings -> Secrets and variables -> Actions -> New repository secret. Name the secret BENCHER_API_TOKEN and set the secret value to your API token.

https://bencher.dev/docs/how-to/github-actions/
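
For reference, a minimal sketch of a step consuming that secret (the step name is illustrative; bencher run reads the token from the BENCHER_API_TOKEN environment variable, or from --token):

```yaml
- name: Track benchmarks with Bencher
  run: bencher run --adapter rust_criterion "cargo bench"
  env:
    # Repository secret created as described in the quoted docs above.
    BENCHER_API_TOKEN: ${{ secrets.BENCHER_API_TOKEN }}
```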

orhun commented 3 weeks ago

Yup, secret is there as BENCHER_API_TOKEN

orhun commented 3 weeks ago

I can't add another token or delete the existing one. I'm using the same one that I used in my tests in #1092

epompeii commented 3 weeks ago

I believe the issue is that you are trying to run this with a pull request from a fork:

> With the exception of GITHUB_TOKEN, secrets are not passed to the runner when a workflow is triggered from a forked repository.

https://docs.github.com/en/actions/security-guides/using-secrets-in-github-actions#using-secrets-in-a-workflow
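
To make the failure mode concrete: on a pull_request event from a fork, secrets.BENCHER_API_TOKEN expands to an empty string. A guard along these lines (a sketch, not necessarily what this PR should do) would skip the step instead of failing:

```yaml
- name: Track benchmarks with Bencher
  # Fork PRs do not receive repository secrets, so only run this step
  # when the PR branch lives in this repository.
  if: github.event.pull_request.head.repo.full_name == github.repository
  run: bencher run --adapter rust_criterion "cargo bench"
  env:
    BENCHER_API_TOKEN: ${{ secrets.BENCHER_API_TOKEN }}
```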

orhun commented 3 weeks ago

Ah, I see. Then I would like to wait for other maintainers' input to decide whether we want to run Bencher for pull requests as well (does it have any major benefits?) or just on pushes to the main branch.

Also, running cargo bench for PRs might slow down the CI.

epompeii commented 3 weeks ago

> does it have any major benefits?

Yes, one of the major benefits of Continuous Benchmarking is being able to catch performance regressions in CI. Just tracking main is better than nothing. However, I would recommend benchmarking PRs.

> Also, running cargo bench for PRs might slow down the CI.

This can definitely be the case if your benchmarks take a while. I just pulled ratatui and ran the benchmarks locally, and they are still going...

One solution is to run the benchmarks on PRs only when a bencher label has been added. This is how Diesel set things up.

So something like this for the Bencher PR job: https://github.com/diesel-rs/diesel/pull/3849/files#diff-e0ba2f20a84be928e389320224a40fc1d8a83da5fdbcbd535d074438715f3bb6R9
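
In sketch form, such a label gate is just a job-level condition (the label name and trigger types here are assumptions modeled on that setup, not copied from it):

```yaml
on:
  pull_request:
    types: [labeled, opened, synchronize, reopened]

jobs:
  benchmarks:
    # Only benchmark PRs that carry the "bencher" label.
    if: contains(github.event.pull_request.labels.*.name, 'bencher')
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: cargo bench
```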

EdJoPaTo commented 3 weeks ago

> Ah, I see. Then I would like to wait for other maintainers' input to decide whether we want to run Bencher for pull requests as well (does it have any major benefits?) or just on pushes to the main branch.

Changes should be detected before they are merged. Otherwise, all we can do afterwards is bisect to find the regression.

> Also, running cargo bench for PRs might slow down the CI.

Personally, I think PRs should be reviewed by multiple people anyway, so they take at least 30h to be merged. That's plenty of time for the benches to run.

orhun commented 1 week ago

Hey @epompeii - I just had some time to come back to this again.

I see two possible ways of moving forward:

Any ideas/opinions?

epompeii commented 1 week ago

@orhun based on the feedback from @EdJoPaTo, my suggestion would be to implement statistical continuous benchmarking. This differs from the approach that Diesel uses, relative continuous benchmarking.

In order to implement statistical continuous benchmarking in GitHub Actions, you need to:

- track the benchmarks on your base branch (main), and
- benchmark PRs, including PRs coming from forks.

This will require three workflow files: one for the first bullet and two for the second bullet.

> This sounds cool, but I couldn't find a way to save the criterion result as JSON. In the examples it uses --adapter json, but in our case that's criterion. Can you give me some tips about saving the benchmark result and loading it up into bencher afterwards? (maybe I should use --save-baseline, not sure)

Bencher can parse the output from Criterion using the rust_criterion adapter.

When tracking your base branch, you can just have Bencher do this directly from stdout:

```
bencher run --adapter rust_criterion "cargo bench"
```
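
Spelled out as a full workflow for pushes to main (a sketch: the file name and job name are assumptions; the ratatui-org project slug comes from the console link above):

```yaml
name: Benchmark main

on:
  push:
    branches: [main]

jobs:
  bench_main:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Installs the bencher CLI.
      - uses: bencherdev/bencher@main
      - name: Track main benchmarks with Bencher
        run: |
          bencher run \
            --project ratatui-org \
            --token '${{ secrets.BENCHER_API_TOKEN }}' \
            --branch main \
            --adapter rust_criterion \
            "cargo bench"
```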

For benchmarking PRs, you will need to save the Criterion results to a file in the first step in the Run and Cache Benchmarks workflow:

```
cargo bench > results.txt
```
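
A sketch of that first workflow as a whole, with the results uploaded as an artifact so the second workflow can pick them up (the workflow and artifact names are assumptions):

```yaml
name: Run and Cache Benchmarks

on: pull_request

jobs:
  benchmark:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run benchmarks
        run: cargo bench > results.txt
      - name: Cache benchmark results
        uses: actions/upload-artifact@v4
        with:
          name: benchmark_results
          path: results.txt
```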

Then, in the second step, you can upload those results to Bencher using the --file option in the Track Benchmarks with Bencher workflow:

```
bencher run \
  ...
  --file "$BENCHMARK_RESULTS"
```
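
And a sketch of that second, trusted workflow, which runs on workflow_run, downloads the cached results, and uploads them to Bencher (simplified: a real setup also has to recover the PR branch and number from the first run so results land on the right branch):

```yaml
name: Track Benchmarks with Bencher

on:
  workflow_run:
    workflows: [Run and Cache Benchmarks]
    types: [completed]

jobs:
  track:
    if: github.event.workflow_run.conclusion == 'success'
    runs-on: ubuntu-latest
    env:
      BENCHMARK_RESULTS: results.txt
    steps:
      - name: Download benchmark results
        uses: actions/download-artifact@v4
        with:
          name: benchmark_results
          run-id: ${{ github.event.workflow_run.id }}
          github-token: ${{ secrets.GITHUB_TOKEN }}
      # Installs the bencher CLI.
      - uses: bencherdev/bencher@main
      - name: Track PR benchmarks with Bencher
        run: |
          bencher run \
            --project ratatui-org \
            --token '${{ secrets.BENCHER_API_TOKEN }}' \
            --adapter rust_criterion \
            --file "$BENCHMARK_RESULTS"
```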

If you go with this approach, you don't have to worry about using GitHub environments. (I do not recommend the alternative approach that relies on GitHub environments.)

orhun commented 1 week ago

Thanks for the tip! That sounds good; the only part that daunts me is ending up with 3 workflows at the end of the day 💀 It sounds like it will make things a bit cumbersome.

Are there any other projects using statistical continuous benchmarking whose workflows we could adopt?

epompeii commented 1 week ago

> Are there any other projects using statistical continuous benchmarking whose workflows we could adopt?

There are several. I think it would be easiest to just look at the example repo.

orhun commented 1 week ago

Thanks for the tips @epompeii!

I wish organizing workflows in a subdirectory were possible, but I guess it is not yet.

In the meantime I adapted the example for Ratatui, which looks like it is gonna work!

orhun commented 1 week ago

Can I get a review on this? @ratatui-org/maintainers

I think the benchmark result will be uploaded after merge.

joshka commented 1 week ago

> Can I get a review on this? @ratatui-org/maintainers
>
> I think the benchmark result will be uploaded after merge.

I have been delaying reviewing this because I know it requires a deeper understanding than a superficial one. Will get to it soon.