Open fako1024 opened 1 year ago
Got two successful (sub) benchdiffs now that a dedicated machine / runner is available, e.g. here. However, the spread of the benchmarks is still problematic, to say the least (a standard deviation on the order of 3%). The actions run with niceness -19, but I guess there are still cache misses or similar effects under the hood that prevent a fully stable measurement. Will continue digging...
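To put a number on the spread mentioned above, here is a minimal sketch that computes the relative standard deviation of repeated runs of one benchmark. The function name `relative_stddev` and the sample values are hypothetical, made up for illustration and not taken from the actual runs.

```python
import statistics


def relative_stddev(samples_ns):
    """Return the sample standard deviation as a fraction of the mean."""
    if len(samples_ns) < 2:
        raise ValueError("need at least two samples")
    mean = statistics.fmean(samples_ns)
    if mean == 0:
        raise ValueError("mean of samples is zero")
    return statistics.stdev(samples_ns) / mean


# Hypothetical ns/op values from five runs of the same benchmark (not real data).
samples = [1000.0, 1030.0, 970.0, 1015.0, 985.0]
print(f"relative stddev: {relative_stddev(samples):.1%}")
```

A value around a few percent means that only differences clearly larger than that between two commits can be trusted.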
I think I can now quite nicely use CPU isolation and taskset to aggressively separate OS level CPU usage / IRQs and the actual test(s) / benchmark(s).
However, with the current GitHub action from the marketplace I can only assign the isolated CPUs to the whole GitHub runner (not to the actual test binary, as I'd like). A test to check whether that suffices is currently running (but I doubt it). The most consistent results can probably be achieved by writing a small benchmark script for the repo that does the checkout + benchmarks + benchstat (which is probably a good idea anyway due to some deprecation in the marketplace action) and uses the isolated CPUs for the benchmark run, and only for that.
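A rough sketch of the benchmark part of such a script, assuming the isolated CPUs are 2 and 3 (hypothetical IDs; the real ones depend on the runner setup). The helper names `pinned_bench_cmd` and `run_bench` are made up for illustration. Only the `go test` invocation is wrapped in `taskset -c`, so the checkout and the benchstat step stay on the regular CPUs.

```python
import shlex
import subprocess


def pinned_bench_cmd(cpus, count=10, packages="./..."):
    """Build the argv that runs only the Go benchmarks, pinned to the given CPUs."""
    if not cpus:
        raise ValueError("at least one CPU is required")
    cpu_list = ",".join(str(c) for c in cpus)
    return [
        "taskset", "-c", cpu_list,    # restrict the process to these CPUs
        "go", "test",
        "-run", "^$",                 # match no regular tests, benchmarks only
        "-bench", ".",
        "-count", str(count),         # repeat so benchstat has several samples
        packages,
    ]


def run_bench(cpus, out_path, cwd=".", count=10):
    """Run the pinned benchmarks in cwd and write the raw output to out_path."""
    with open(out_path, "w") as out:
        subprocess.run(pinned_bench_cmd(cpus, count), cwd=cwd, stdout=out, check=True)


# Show the command that would be run (CPU IDs 2 and 3 are an assumption).
print(shlex.join(pinned_bench_cmd([2, 3])))
```

The idea would be to call `run_bench` once in a checkout of the base branch and once in a checkout of the PR branch, writing to two different files, and then compare them with `benchstat old.txt new.txt` (file names chosen here for illustration).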
In order to track side-effects of changes on performance it would be enormously helpful to have automated benchmarks / comparison via benchstat as part of the CI pipeline (maybe not on each commit, but e.g. on filing a PR). Of course this is inherently difficult (because of reproducibility, or rather the lack thereof), but I've seen other projects do it.
DoD