simone-silvestri opened this issue 9 months ago
I would keep the benchmarks simple and avoid a near-global ocean setup. The setups have to be maintained, so it's best if they are simple and easy to update when the syntax changes. Also, just to get the pipeline in place, you probably only need one or two setups; we can then build them up incrementally once we have seen the pipeline be useful for at least a few days (if it launches nightly).
Hopefully the benchmarks will be efficient enough to run nightly.
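As a rough sketch of what one such "simple setup" could look like, assuming the benchmarks target an Oceananigans-style model (the resolution, time step, and number of steps below are placeholders, not a proposal):

```julia
using Oceananigans

# Hypothetical simple setup: a small nonhydrostatic model time-stepped a few times.
# Everything here (resolution, Δt, number of steps) is a placeholder to be tuned.
function simple_nonhydrostatic_benchmark(; size=(64, 64, 64), Δt=1e-3, steps=10)
    grid = RectilinearGrid(CPU(); size, extent=(1, 1, 1))
    model = NonhydrostaticModel(; grid, advection=WENO())
    time_step!(model, Δt)        # first step compiles the kernels
    for _ in 1:steps
        time_step!(model, Δt)    # the steps we actually want to time
    end
    return model
end
```

A hydrostatic counterpart (or the same function with `GPU()` instead of `CPU()`) would be a natural second setup, but even a single function like this is enough to wire up the pipeline.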
Now that the code is (slightly) optimized, we probably need a way to track performance across PRs and make sure we don't lose performance for reasons we have little control over (for example, changes in dependencies).
Over at SpeedyWeather.jl they are thinking of doing the same, and PkgBenchmark.jl was suggested as a way to simplify the implementation.
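If PkgBenchmark.jl is the route we take, my understanding is that it looks for a `benchmark/benchmarks.jl` script defining a top-level `SUITE` (a BenchmarkTools `BenchmarkGroup`). A minimal sketch, where `simple_nonhydrostatic_benchmark` is the hypothetical setup function from above:

```julia
# benchmark/benchmarks.jl — the script PkgBenchmark runs by default
using BenchmarkTools
using Oceananigans

include("simple_setups.jl")  # hypothetical file defining simple_nonhydrostatic_benchmark

const SUITE = BenchmarkGroup()

# One group per setup; more groups can be added incrementally later.
SUITE["nonhydrostatic"] = BenchmarkGroup()
SUITE["nonhydrostatic"]["64³, 10 steps"] =
    @benchmarkable simple_nonhydrostatic_benchmark(size=(64, 64, 64), steps=10) samples=5
```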
The question here is what would be a suitable candidate for a performance test. We could start with:
The tests do not have to be enforced; they could run nightly (or once a week) on the main branch, with the option of running them before merging performance-sensitive PRs.
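For the nightly run or a check on a sensitive PR, PkgBenchmark should be able to compare the current checkout against `main` and emit a markdown report (this is my recollection of its API; the package and file names below are illustrative):

```julia
using PkgBenchmark

# Compare the current checkout (e.g. the PR branch) against main.
# The package name is illustrative; judge returns a BenchmarkJudgement.
results = judge("Oceananigans", "main")

# Write a human-readable report to post on the PR or archive from the nightly job.
export_markdown("benchmark_report.md", results)
```

The nightly variant would just run this on a schedule (e.g. a cron-triggered CI job) and archive the report.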