yorickpeterse opened 2 years ago
Worth mentioning: often these micro benchmarks end up measuring completely unrelated code, such as the time it takes to write to STDOUT (something the Wren benchmarks suffer from). When adopting existing benchmarks we should make sure we're actually measuring what matters, instead of blindly copying the benchmarks.
To measure the impact of changes on Inko's performance, we need a benchmark suite. This suite would live in a separate repository.
The benchmark suite should consist of two types of benchmarks: micro benchmarks and macro benchmarks. A micro benchmark would be something like DeltaBlue (https://github.com/wren-lang/wren/blob/main/test/benchmark/delta_blue.py), while a macro benchmark would be something like a simple HTTP server. Ideally we'd choose a set of micro benchmarks that are also used by other languages, so we can see how Inko compares to them.
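As an illustration of the "measure what matters" point, here's a Python sketch of a micro benchmark harness (the `fib` workload and the harness itself are made up for this example, not part of any existing suite). It times only the workload and defers all printing until after the measurement, so I/O cost doesn't pollute the numbers:

```python
import time

def fib(n):
    # CPU-bound workload: the thing we actually want to measure.
    return n if n < 2 else fib(n - 1) + fib(n - 2)

def bench(name, func, iterations=5):
    # Time only the workload itself. Results are printed once,
    # afterwards, so writing to STDOUT stays out of the measurement.
    timings = []
    for _ in range(iterations):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    print(f"{name}: best of {iterations} runs = {min(timings):.4f}s")

bench("fib(25)", lambda: fib(25))
```

Reporting the best of several runs (rather than a single run) reduces noise from warm-up and scheduling jitter.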
Automatic benchmarking
A CI job would run the suite periodically (e.g. once a week). The results should be presented somewhere that's easily accessible. Running this on GitLab's shared runners is likely to give inconsistent results, so we probably need a dedicated runner for this. As I don't have any spare computer I can run at home 24/7, this would likely involve renting a server. A quick look at Hetzner suggests this would cost around €50/month. To recoup the costs we should probably reuse the runner for other jobs (e.g. FreeBSD tests using QEMU), otherwise it's a bit of a waste of money.
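For concreteness, a scheduled GitLab CI job pinned to a dedicated runner could look something like this (a sketch only; the runner tag, script name, and artifact path are assumptions, not an existing setup):

```yaml
# .gitlab-ci.yml (sketch)
benchmarks:
  tags:
    - benchmark-runner            # hypothetical tag for the dedicated runner
  rules:
    - if: '$CI_PIPELINE_SOURCE == "schedule"'
  script:
    - ./run-benchmarks.sh         # hypothetical entry point for the suite
  artifacts:
    paths:
      - results.json              # results to publish somewhere accessible
```

The `rules` clause restricts the job to scheduled pipelines (e.g. a weekly schedule configured in the project settings), while the `tags` entry ensures it only runs on the dedicated machine, keeping results consistent.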