ibis-project / ibis

the portable Python dataframe library
https://ibis-project.org
Apache License 2.0

ci(perf): track benchmarks in ci #9653

Open cpcloud opened 4 months ago

cpcloud commented 4 months ago

We used to track benchmarks with a GitHub action, but the resulting web page became incredibly slow to use, and hosting it on GH (and thus on the gh-pages branch) also made clones of the main Ibis repo slow.

We are currently running the benchmarks in CI and uploading the data to GCS as compressed Parquet files, but we're not doing anything with that data.

While benchmarking in CI has its downsides, I think there's value in using it to detect very large performance changes, e.g., those above 5x, so I think we should bring back some kind of performance regression testing.
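To make the "very large changes" idea concrete, here is a rough sketch of what such a check could look like. It is not what Ibis or Bencher actually does: the function name `find_large_changes`, the input layout (benchmark name mapped to a list of timings in seconds), and the use of medians are all assumptions for illustration. Only the 5x threshold comes from the comment above.

```python
from statistics import median

# Threshold taken from the "above 5x" figure mentioned above.
THRESHOLD = 5.0


def find_large_changes(baseline, current, threshold=THRESHOLD):
    """Flag benchmarks whose median time changed by more than `threshold`x.

    `baseline` and `current` are hypothetical inputs: dicts mapping a
    benchmark name to a list of timings in seconds. Changes are flagged
    in either direction (slowdowns and speedups).
    """
    flagged = {}
    for name, base_times in baseline.items():
        new_times = current.get(name)
        if not base_times or not new_times:
            continue  # benchmark missing on one side; nothing to compare
        base = median(base_times)
        if base <= 0:
            continue  # avoid dividing by zero on degenerate timings
        ratio = median(new_times) / base
        if ratio > threshold or ratio < 1 / threshold:
            flagged[name] = ratio
    return flagged


# Made-up timings, purely for illustration.
baseline = {"bench_a": [1.0, 1.1, 0.9], "bench_b": [2.0, 2.0, 2.0]}
current = {"bench_a": [6.0, 6.5, 7.0], "bench_b": [2.1, 2.0, 2.2]}
result = find_large_changes(baseline, current)
```

Comparing medians rather than single runs is one way to dampen the noise that comes with shared CI runners; a coarse threshold like 5x is meant to stay well above that noise.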

I've recently been playing with https://bencher.dev to see if it's a viable alternative, and have been in contact with the maintainer. It seems promising, at the very least as a way to visualize and store the data without us having to manage it. It also supports alerting based on a few different benchmarking methodologies, but I haven't yet played with that.

Anyway, I plan to keep this simmering until I get all the existing benchmark data uploaded, which is currently blocked on https://github.com/bencherdev/bencher/issues/460.

epompeii commented 4 months ago

@cpcloud https://github.com/bencherdev/bencher/issues/460 has been resolved, so you should now be unblocked for uploading your historical benchmark data.

Please let me know if you run into anything else!