ocaml-bench / sandmark

A benchmark suite for the OCaml compiler
The Unlicense

Add @stedolan `markbench` benchmark for prefetching #457

Closed: fabbing closed this 1 year ago

fabbing commented 1 year ago

This PR, a joint effort with @MisterDA, adds a slightly modified version of @stedolan's `markbench` micro-benchmark. This micro-benchmark was first used in https://github.com/ocaml/ocaml/pull/10195 and then in https://github.com/ocaml/ocaml/pull/11827 to validate the prefetching speedup while the GC is tracing blocks.
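For readers unfamiliar with the benchmark, the idea is to make the major GC trace a large live heap and to time that tracing, since marking is the phase that prefetching speeds up. A minimal sketch of this style of measurement (not the actual `markbench` code; the type, sizes, and output format here are illustrative):

```ocaml
(* Illustrative sketch only: build a large linked structure so the major
   GC has many blocks to trace, then time a full major collection, which
   is the phase that prefetching during marking accelerates. *)
type node = { value : int; mutable next : node option }

let build n =
  let rec go acc i =
    if i = 0 then acc else go (Some { value = i; next = acc }) (i - 1)
  in
  go None n

let () =
  let root = build 5_000_000 in
  let t0 = Sys.time () in
  Gc.full_major ();
  let t1 = Sys.time () in
  Printf.printf "GC time: %.3f s\n" (t1 -. t0);
  (* Keep [root] live so the collector really traces it. *)
  ignore (Sys.opaque_identity root)
```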

It could be useful as a sort of regression test running in Sandmark.

It would be preferable to use the GC time in seconds as calculated by the benchmark itself and reported on stdout, since that avoids accounting for setup time. How could this be achieved in Sandmark?

punchagan commented 1 year ago

Thanks for the contribution, @fabbing and @MisterDA. As discussed in person, the benchmarks are currently run using different wrappers, such as `orun`, `perfstat`, or `pausetimes`.

But the use case of measuring specific parts of a program as part of a benchmark makes sense to me. We do have such benchmarks running in other repositories using current-bench, which allows repositories to define their own custom benchmarks. I wonder whether that would be a better place for this benchmark, or whether we should add support for this kind of benchmarking to Sandmark's micro-benchmarks.
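For illustration, current-bench picks up results that a benchmark prints to stdout as JSON. A hedged sketch of what reporting the benchmark's own GC time could look like (the schema is from memory of the current-bench documentation, so the exact field names should be checked against its README):

```ocaml
(* Hedged sketch: emit one metric as JSON on stdout for current-bench to
   pick up. The schema here is assumed, not verified; [gc_time_s] stands
   in for the benchmark's own measurement. *)
let () =
  let gc_time_s = 1.234 in
  Printf.printf
    "{\"name\": \"markbench\", \"results\": [{\"name\": \"prefetching\", \"metrics\": [{\"name\": \"gc_time\", \"value\": %.3f, \"units\": \"s\"}]}]}\n"
    gc_time_s
```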

@shakthimaan or @kayceesrk might have thoughts on this.

kayceesrk commented 1 year ago

> But the use case of measuring specific parts of a program as part of a benchmark makes sense to me.

Sandmark is not built for measuring and reporting on specific parts of the program. All the reported metrics are for the entire program. Breaking this invariant complicates how we measure, report and analyse the benchmarks in Sandmark. I am not keen on breaking this.

current-bench is probably the better place. The other option is to make the prefetching-specific bits run much longer so that their effects dominate the behaviour of the program measured as a whole. This may be as simple as running the core parts of the algorithm repeatedly so that the prefetching effects are magnified.
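As a hedged illustration of that suggestion (the parameter handling, sizes, and data structure are all invented for the sketch, not taken from `markbench`), the amplification can be a simple loop around the GC-bound core:

```ocaml
(* Hedged sketch: repeat the GC-bound core so that tracing time dominates
   the whole-program numbers Sandmark reports. *)
let () =
  let iters = try int_of_string Sys.argv.(1) with _ -> 10 in
  (* A large live array of boxed pairs gives the marker plenty to trace. *)
  let data = Array.init 1_000_000 (fun i -> (i, i)) in
  for _ = 1 to iters do
    Gc.full_major ()  (* each full major re-traces all live blocks *)
  done;
  ignore (Sys.opaque_identity data)
```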

fabbing commented 1 year ago

> current-bench is probably the better place. The other option is to make the prefetching-specific bits run much longer so that their effects dominate the behaviour of the program measured as a whole. This may be as simple as running the core parts of the algorithm repeatedly so that the prefetching effects are magnified.

The core part is indeed already run several times (and the count can easily be tweaked with a launch parameter). The GC/tracing time of the benchmark largely dominates the setup time, so the benchmark can be kept as it is in Sandmark.

kayceesrk commented 1 year ago

Sounds great.

punchagan commented 1 year ago

Thanks for the contribution, @fabbing and @MisterDA! I've merged it with a minor change: the tag is now `macro_bench` instead of `micro_bench`, to make sure the benchmark runs in our nightly runs.