gnolang / gno

Gno: An interpreted, stack-based Go virtual machine to build succinct and composable apps + Gno.land: a blockchain for timeless code and fair open-source.
https://gno.land/
Other
892 stars 371 forks source link

META: Benchmarks #2919

Open ajnavarro opened 1 week ago

ajnavarro commented 1 week ago

Description

Benchmark Results

Related issues:

Now that we have a Benchmark MVP working, I'd like to enumerate the requests from several parties regarding the next steps:

Tasks

Benchmark Tools

Currently, two GitHub Actions can run Go benchmarks:

If none of these tools meets all our requirements, we should consider developing a new action fitting our use cases.

This is a call for comments (cc @moul and @thehowl )

sw360cab commented 1 week ago

The options Execute Benchmarks on Pull Requests can be reconsidered disabling the option: failOnAlert

sw360cab commented 1 week ago

Consider #2915

thehowl commented 1 week ago

Run Benchmarks Only on Modified Packages: This approach requires further thought, as it might not be the most effective solution.

Yes, I'd rather we start off from having a small set of "core" benchmarks which are run on PRs.

Comparisons with master using benchstat would be nice.

thehowl commented 1 week ago

Run a Subset of Benchmarks: Use the -short flag to execute a smaller set of benchmarks. While this won't completely prevent regressions in the master branch, it will reduce their frequency.

IMO this could also be just a list in the workflow YAML file; using -short means that benchmarks are by default included, which is not ideal because I'll definitely forget it when I review a PR adding a benchmark. Instead, having a list of benchmarks ensures they have to be explicitly added.

The non-essential benchmarks we can disable with build tags or flags maybe?

Rest LGTM. :)

ajnavarro commented 1 week ago

IMO this could also be just a list in the workflow YAML file

That might be an extra step that is difficult to know from all devs. That is how it was implemented before, you needed to add your benchmark in a yaml file. To reduce the hassle to the minimum, I had to:

To avoid repeating the same mistakes, I propose using a specific tag similar to -short but used to indicate whether that benchmark will run regularly or not. It can be --ci

@thehowl WDYT?

moul commented 1 week ago

I believe we should establish two levels of benchmarks:

  1. The first level is enforced quietly and applies to all developers, including those writing contracts or fixing typos in a README. There should be no noise unless a PR triggers a global benchmark issue. In that case, the PR will become a topic of discussion regarding benchmarks. We could implement high-level benchmarks, such as running a blockchain node and processing N transactions. While this may not identify the exact cause of any slowdown, knowing that overall performance has decreased is sufficient for the core team to pause a PR until we investigate further. Developers should not have to check a box or worry about benchmarks; they should be able to continue their work without concern until a CI check or the core team blocks the merge for performance reasons.

  2. The second level is for PRs that introduce features we want to benchmark from the outset, primarily in tm2 and gnovm, or for those focused on adding benchmarks rather than features. These PRs should prioritize benchmark configuration management, including defaults for CI and local setups.

However, upon reviewing the PR history, it’s unclear if we can conclude that everyone ignored benchmarks; most PRs were simply unrelated to them (case 1). Ignoring benchmarks is definitely more problematic in case 2.


A quick win that I suggest includes:

ajnavarro commented 1 week ago

Let's summarize all the requests:

Requests Summary

Benchmark Execution on Pull Requests

Ping me if I missed something or some point needs more clarification/the output I got is wrong.

aeddi commented 3 days ago

Working on it right now. Let me get a PoC ready and then we’ll figure out a specific rule to meet this need. :)