simphotonics / benchmark_runner

A library for writing inline micro-benchmarks, reporting score statistics, and running sync/async benchmarks.
https://pub.dev/packages/benchmark_runner
BSD 3-Clause "New" or "Revised" License

Benchmark Runner

Introduction

Benchmarking is used to estimate and compare the execution speed of numerical algorithms and programs. The package benchmark_runner is based on benchmark_harness and includes helper functions for writing inline micro-benchmarks, with the option of printing a score histogram and reporting the score mean ± standard deviation and the score median ± interquartile range.

The benchmark runner can execute several benchmark files and reports whether uncaught exceptions or errors were encountered.

Usage

Include benchmark_runner as a dev_dependency in your pubspec.yaml file.

Write inline benchmarks using the functions:

The console output is shown above. The following colours and coding are used:

2. Running Several Benchmark Files

To run several benchmark files (named using the pattern *_benchmark.dart), invoke the benchmark_runner and specify a directory. If no directory is specified, it defaults to benchmark:

$ dart run benchmark_runner

Console Output

A typical console output is shown above. In this example, the benchmark_runner detected two benchmark files, ran the micro-benchmarks and produced a report.

Tips and Tricks

Score Sampling

To calculate benchmark score statistics, a sample of scores is required. The question is how to generate the score sample while minimizing systematic errors (such as overheads) and keeping the benchmark run times within acceptable limits.
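As an illustration of the statistics reported for a score sample (mean ± standard deviation and median ± interquartile range), the sketch below computes them with plain Dart. It is not the package's implementation: the helper names mean, stdDev and quantile are hypothetical, the scores are invented, and the quantile uses linear interpolation, which is only one of several common conventions.

```dart
import 'dart:math' as math;

/// Arithmetic mean of [sample]. Assumes a non-empty sample.
double mean(List<double> sample) =>
    sample.reduce((a, b) => a + b) / sample.length;

/// Corrected sample standard deviation (divides by n - 1).
/// Assumes the sample holds at least two scores.
double stdDev(List<double> sample) {
  final m = mean(sample);
  final sumSq = sample.fold<double>(0, (s, x) => s + (x - m) * (x - m));
  return math.sqrt(sumSq / (sample.length - 1));
}

/// Quantile of an ascending-sorted sample, using linear interpolation
/// between neighbouring ranks.
double quantile(List<double> sorted, double p) {
  final pos = p * (sorted.length - 1);
  final lower = pos.floor();
  final upper = pos.ceil();
  return sorted[lower] + (sorted[upper] - sorted[lower]) * (pos - lower);
}

void main() {
  // Hypothetical score sample in microseconds, invented for illustration.
  final scores = <double>[10, 12, 11, 13, 40, 12, 11, 12];
  final sorted = [...scores]..sort();

  final median = quantile(sorted, 0.5);
  final iqr = quantile(sorted, 0.75) - quantile(sorted, 0.25);

  print('mean:   ${mean(scores).toStringAsFixed(2)} '
      '± ${stdDev(scores).toStringAsFixed(2)} us');
  print('median: ${median.toStringAsFixed(2)} '
      '± ${iqr.toStringAsFixed(2)} us');
}
```

The single outlier (40) pulls the mean well above the median, which is one reason to report both pairs of statistics.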

To estimate the benchmark score, the functions warmup or warmupAsync are run for 200 milliseconds.
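The idea of estimating a score by running a function repeatedly for a fixed period can be sketched with Stopwatch from dart:core. This is not how warmup or warmupAsync are implemented; estimateTicks is a hypothetical helper written for this example, and only the 200 millisecond default is taken from the text.

```dart
/// Hypothetical helper (not part of benchmark_runner): runs [f] repeatedly
/// for at least [duration] and returns the average number of elapsed
/// clock ticks per run.
double estimateTicks(
  void Function() f, {
  Duration duration = const Duration(milliseconds: 200),
}) {
  // Convert the requested duration to stopwatch ticks.
  final minTicks = duration.inMicroseconds * Stopwatch().frequency ~/ 1000000;
  final watch = Stopwatch()..start();
  var runs = 0;
  do {
    f();
    runs++;
  } while (watch.elapsedTicks < minTicks);
  return watch.elapsedTicks / runs;
}

void main() {
  final list = List<int>.generate(1000, (i) => i);
  var sink = 0; // Accumulates results so the work has an observable effect.

  final ticks = estimateTicks(() {
    sink += list.reduce((a, b) => a + b);
  });

  print('Estimated score: ${ticks.toStringAsFixed(1)} clock ticks per run');
  print('Checksum: $sink');
}
```

The estimate includes loop and closure-call overhead, which is the kind of systematic error mentioned above.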

1. Default Sampling Method

The graph below shows the sample size (orange curve) as calculated by the function BenchmarkHelper.sampleSize. The green curve shows the lower limit of the total micro-benchmark duration and represents the value: clockTicks * sampleSize * innerIterations.

Sample Size
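The lower limit represented by the green curve can be worked through numerically. The sketch below assumes that sampleSize and innerIterations correspond to the outer and inner fields of the record returned by a sample size function; minDurationTicks is a hypothetical helper and the input values are invented.

```dart
/// Lower limit of the total benchmark duration in clock ticks:
/// clockTicks * sampleSize * innerIterations.
/// Assumption: `outer` is the sample size, `inner` the inner iterations.
int minDurationTicks(int clockTicks, ({int outer, int inner}) size) =>
    clockTicks * size.outer * size.inner;

void main() {
  // Hypothetical values chosen only for illustration.
  const clockTicks = 5000; // Estimated ticks for a single benchmark run.
  const size = (outer: 100, inner: 20);

  final ticks = minDurationTicks(clockTicks, size);

  // Stopwatch.frequency is the number of ticks per second.
  final ms = ticks * 1000 / Stopwatch().frequency;
  print('Lower limit: $ticks ticks (${ms.toStringAsFixed(1)} ms)');
}
```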

For short run times below 100000 clock ticks, each sample score is generated using the function measure or the equivalent asynchronous function measureAsync. The parameter ticks used when calling measure and measureAsync is chosen such that the benchmark score is averaged over (see the cyan curve in the graph above):

2. Custom Sampling Method

To amend the score sampling process the static function BenchmarkHelper.sampleSize can be replaced with a custom function:

// Always collect 100 sample scores with one inner iteration each.
BenchmarkHelper.sampleSize = (int clockTicks) {
  return (outer: 100, inner: 1);
};

To restore the default score sampling settings use:

BenchmarkHelper.sampleSize = BenchmarkHelper.sampleSizeDefault;
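A custom function does not have to return constants. The standalone sketch below derives both values from the estimated clockTicks and a fixed tick budget. The name budgetSampleSize, the SampleSize typedef, the budget, and the limits 10 and 200 are all hypothetical; only the record shape (outer, inner) is taken from the example above.

```dart
import 'dart:math' as math;

/// Record shape as used in the example above; the typedef name is hypothetical.
typedef SampleSize = ({int outer, int inner});

/// Hypothetical rule: spend roughly [budgetTicks] clock ticks in total,
/// keep the sample size between 10 and 200, and use what remains of the
/// budget for inner iterations (at least one).
SampleSize budgetSampleSize(int clockTicks, {int budgetTicks = 1000000}) {
  final ticks = clockTicks < 1 ? 1 : clockTicks; // Guard against zero.
  final outer = (budgetTicks ~/ ticks).clamp(10, 200).toInt();
  final inner = math.max(1, budgetTicks ~/ (ticks * outer));
  return (outer: outer, inner: inner);
}

void main() {
  for (final clockTicks in [10, 50000, 10000000]) {
    final size = budgetSampleSize(clockTicks);
    print('clockTicks: $clockTicks -> '
        'outer: ${size.outer}, inner: ${size.inner}');
  }
}
```

If BenchmarkHelper.sampleSize accepts any function that maps an int to this record type, as the example above suggests, such a function could be assigned in the same way as the inline closure.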

The graph shown above may be re-generated using the custom sampleSize function by copying and amending the file gnuplot/sample_size.dart and using the command:

dart sample_size.dart

The command above launches a process that runs a gnuplot script. For this reason, the program gnuplot must be installed (with the qt terminal enabled).

Contributions

Help and enhancement requests are welcome. Please file requests via the issue tracker.

The To-Do list currently includes:

Features and bugs

Please file feature requests and bugs at the issue tracker.