actuarialopensource / benchmarks

Some performance tests for actuarial applications
MIT License

Use minimum instead of mean for benchmarks #29

Closed alecloudenback closed 11 months ago

alecloudenback commented 1 year ago

Especially for noisy CI runs, the times are likely very noisy from background tasks, etc. E.g. this talk discusses what to use when benchmarking and concludes minimum is generally preferred: https://www.youtube.com/watch?v=vrfYLlR8X8k&t=951s
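The intuition behind preferring the minimum can be sketched with a toy simulation: background interference only ever adds time to a measurement, so the mean is biased upward while the minimum converges toward the true cost. This is a hypothetical illustration (the `true_time` value and exponential noise model are assumptions, not measurements from this repo):

```python
import random

random.seed(0)

true_time = 1.0  # hypothetical "real" cost of the workload, in seconds

# Noise from background tasks is nonnegative: each trial observes
# true_time plus some extra delay, never less.
samples = [true_time + random.expovariate(5) for _ in range(100)]

mean_est = sum(samples) / len(samples)  # biased upward by the noise
min_est = min(samples)                  # stays close to true_time

print(f"mean estimate: {mean_est:.4f}s, min estimate: {min_est:.4f}s")
```

Under this one-sided noise model the minimum is the better estimator of the noiseless runtime, which is the quantity a benchmark usually cares about.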

serenity4 commented 1 year ago

I agree that minimum time would be preferable, but unfortunately I did not find an easy way to do this for the Python benchmarks (I haven't looked at R). Most timing tools I found returned average values without exposing the individual trials. Any ideas? cc @MatthewCaseres

MatthewCaseres commented 1 year ago

For Python, people seem to have a strategy that works - https://stackoverflow.com/questions/33588041/timeit-module-get-fastest-and-slowest-loop For R, the discussion here backs up the idea that we should focus on the minimum rather than the mean, so it should be possible - https://adv-r.hadley.nz/perf-measure.html#benchmark-results
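The strategy from that Stack Overflow thread is `timeit.repeat`, which returns the total time of each trial rather than an aggregate, so taking the minimum is a one-liner. A minimal sketch (the `workload` function is a placeholder for a real benchmark; `repeat=20` is just an example count):

```python
import timeit

def workload() -> float:
    # Placeholder workload; a real benchmark would call the actuarial model here.
    return sum(i * 0.99 ** i for i in range(1000))

# timeit.repeat returns a list with one total time per trial,
# so we can take the minimum instead of an average.
trials = timeit.repeat(workload, repeat=20, number=1)
best = min(trials)
print(f"min over {len(trials)} trials: {best:.6f}s")
```

With `number=1` each entry is the time of a single call; raising `number` amortizes per-call overhead for very fast workloads, and the minimum should then be divided by `number` to get a per-call figure.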

The question is how many trials to perform; this is something we can probably tune by hand a bit so the CI doesn't take forever.

I know in the Python code I hard-coded the number 20 in a few places - https://github.com/actuarialopensource/benchmarks/blob/main/Python/basicterm.py#L14

serenity4 commented 11 months ago

Fixed in #45.