NB: This does not change the definition of an Embench score.
At present the benchmarking system uses an option cpu_mhz for two (related) purposes:
to define the frequency of the benchmarked CPU
to multiply the baseline number of iterations performed by all benchmarks, so that each benchmark runs long enough (seconds) for its measured execution time to be large compared to the quantum of time.
The baseline number of iterations has been set to provide a weighting between the benchmarks when computing the Embench score. The reported time for any benchmark is actual time / cpu_mhz. A 100 MHz processor runs 10x as many iterations as a 10 MHz processor, and this difference is accounted for by dividing the actual time accordingly.
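To make the arithmetic concrete, here is a minimal sketch of the scaling described above. The function names and the numbers are illustrative assumptions, not taken from the Embench scripts.

```python
def scaled_iterations(baseline_iterations: int, cpu_mhz: float) -> int:
    """Iterations actually executed: the baseline count multiplied by cpu_mhz."""
    return round(baseline_iterations * cpu_mhz)


def reported_time(actual_time: float, cpu_mhz: float) -> float:
    """Reported time: the measured time divided by cpu_mhz."""
    return actual_time / cpu_mhz


# Illustrative values only, not measurements.
baseline = 100  # hypothetical baseline iteration count for one benchmark
for mhz in (10, 100):
    print(f"cpu_mhz={mhz}: runs {scaled_iterations(baseline, mhz)} iterations")
```

Dividing by cpu_mhz cancels the extra iterations, so the reported time corresponds to the time taken for the baseline iteration count.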
This works well for similar processors using similar compiler optimisations. When a processor or a compiler causes a benchmark to run very much faster (10x or even 100x) than the processor frequency would suggest, the actual run time produced by the cpu_mhz scaling becomes too short to be measured reliably. (Running on an M1 Mac, three benchmarks run between 100x and 700x faster than the frequency suggests.) With the current system, the only way to overcome this is to set cpu_mhz much higher (10x or 100x) than reality. This results in two problems:
the run time for the entire suite is increased greatly
the cpu_mhz parameter no longer reflects the actual frequency of the processor
This proposal is to add a mechanism that allows individual benchmarks to have their iteration count increased, with the computation of their nominal run time adjusted accordingly.
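One possible shape for such a mechanism, as a rough sketch only: each benchmark carries an optional multiplier that scales its iteration count, and the same multiplier is divided out when computing its nominal run time. The `extra_scale` field and both function names are hypothetical and do not exist in Embench today.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkConfig:
    name: str
    baseline_iterations: int
    extra_scale: int = 1  # hypothetical per-benchmark multiplier; 1 means unchanged


def iterations_to_run(cfg: BenchmarkConfig, cpu_mhz: float) -> int:
    """Baseline iterations scaled by cpu_mhz and by the per-benchmark multiplier."""
    return round(cfg.baseline_iterations * cpu_mhz * cfg.extra_scale)


def nominal_time(cfg: BenchmarkConfig, actual_time: float, cpu_mhz: float) -> float:
    """Divide out both scale factors so the result stays comparable across benchmarks."""
    return actual_time / (cpu_mhz * cfg.extra_scale)


# Illustrative values only: a benchmark that finishes too quickly is run 100x longer.
normal = BenchmarkConfig("normal-benchmark", baseline_iterations=10)
boosted = BenchmarkConfig("fast-benchmark", baseline_iterations=10, extra_scale=100)
print(iterations_to_run(normal, 100), iterations_to_run(boosted, 100))
```

With this approach cpu_mhz keeps its real value, only the benchmarks that need it run for longer, and a multiplier of 1 reproduces the current behaviour.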