This PR adds a new command-line option, --stopping-criterion <criterion>, with two predefined criteria, stdrel and entropy, along with an API for customizing the stopping criterion. nvbench/examples/custom_criterion.cu illustrates how custom criteria can be added on a per-run basis. This opens up possibilities for improving performance CI. For instance, one can now develop a criterion that collects a large sample, stores the sample size, and then loads this number on each re-run of performance CI, leading to better stability.
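The performance CI idea above (collect a large sample once, store its size, reuse that size on later runs) can be sketched in plain C++. This is only an illustration under stated assumptions: it does not use the nvbench criterion API added in this PR, and every name in it (save_sample_count, load_sample_count, fixed_count_sketch, usage_example) is hypothetical. See nvbench/examples/custom_criterion.cu for how a real custom criterion is registered.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <optional>
#include <ostream>
#include <sstream>

// Hypothetical helper: persist the sample size chosen by a calibration run.
void save_sample_count(std::ostream &out, std::size_t count)
{
  out << count << '\n';
}

// Hypothetical helper: read the stored sample size back.
// Returns std::nullopt when nothing parsable is stored.
std::optional<std::size_t> load_sample_count(std::istream &in)
{
  std::size_t count = 0;
  if (in >> count)
  {
    return count;
  }
  return std::nullopt;
}

// Hypothetical criterion: stop after exactly `target` measurements,
// so every CI re-run collects the same number of samples.
class fixed_count_sketch
{
public:
  explicit fixed_count_sketch(std::size_t target)
      : m_target(target)
  {}

  // The measured value is ignored; only the count matters here.
  void add_measurement(double /* sample */) { ++m_seen; }

  bool is_finished() const { return m_seen >= m_target; }

private:
  std::size_t m_target;
  std::size_t m_seen{0};
};

// Usage: store a size, load it again, and sample until it is reached.
// A string stream stands in for whatever storage the CI would use.
void usage_example()
{
  std::stringstream storage;
  save_sample_count(storage, 3);

  const std::optional<std::size_t> stored = load_sample_count(storage);
  assert(stored.has_value());

  fixed_count_sketch criterion(*stored);
  while (!criterion.is_finished())
  {
    criterion.add_measurement(1.0); // placeholder measurement
  }
}
```

Taking streams instead of file paths keeps the sketch independent of where the CI stores the number; the storage location is left open here.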
Apart from the new API, this PR introduces the entropy criterion. To enable it, pass --stopping-criterion entropy. The criterion computes the cumulative entropy of the sample and stores it in an entropy window. It then computes a linear regression over the cumulative entropy window. If the angle of the regression line is small enough and the coefficient of determination (R^2) is large enough, the criterion concludes that new samples will not introduce any new information and that the sample is representative. The entropy criterion addresses the concerns from https://github.com/NVIDIA/nvbench/issues/150 and https://github.com/NVIDIA/nvbench/issues/147, and it significantly reduces the variation in sample size, which is important for performance CI. Below is a plot of the sample size distribution for the stdrel and entropy criteria, collected on nvbench/examples/throughput.cu, that illustrates this point:
Below is an example where stdrel observed a small variance and decided to stop, while entropy observed that the entropy was still growing and kept sampling, discovering new modes:
Other times, entropy notices that new measurements do not add anything new to the sample and stops earlier:
Each criterion has its own set of parameters. Parameters like --max-noise and --min-time affect only the stdrel criterion, whereas --max-angle and --min-r2 are parameters of the entropy criterion.
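To make the roles of the angle and R^2 thresholds concrete, here is a minimal standalone sketch of the entropy idea described above: track the cumulative entropy, fit a line to the most recent entropy values, and stop when the line is nearly flat and fits well. It is not the implementation in this PR. The class name, the sliding window size, the binning of samples by rounding, the treatment of a perfectly flat window as R^2 = 1, and measuring the angle as atan(slope) in radians are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <deque>
#include <map>

// Hypothetical sketch of an entropy-based stopping rule; not nvbench code.
class entropy_sketch
{
public:
  entropy_sketch(double max_angle, double min_r2, std::size_t window_size, double bin_width)
      : m_max_angle(max_angle)
      , m_min_r2(min_r2)
      , m_window_size(window_size)
      , m_bin_width(bin_width)
  {
    assert(window_size >= 2); // a line fit needs at least two points
    assert(bin_width > 0.0);
  }

  void add_measurement(double sample)
  {
    // Assumption: samples are binned by rounding so that repeats can occur.
    ++m_counts[std::llround(sample / m_bin_width)];
    ++m_total;

    // Shannon entropy (in bits) of all samples seen so far.
    double entropy = 0.0;
    for (const auto &bin : m_counts)
    {
      const double p = static_cast<double>(bin.second) / static_cast<double>(m_total);
      entropy -= p * std::log2(p);
    }

    // Keep only the most recent cumulative entropy values.
    m_window.push_back(entropy);
    if (m_window.size() > m_window_size)
    {
      m_window.pop_front();
    }
  }

  bool is_finished() const
  {
    const std::size_t n = m_window.size();
    if (n < m_window_size)
    {
      return false; // window not filled yet
    }

    // Least-squares fit of entropy against the index 0..n-1.
    const double mean_x = static_cast<double>(n - 1) / 2.0;
    double mean_y       = 0.0;
    for (double y : m_window)
    {
      mean_y += y;
    }
    mean_y /= static_cast<double>(n);

    double sxx = 0.0, sxy = 0.0, syy = 0.0;
    for (std::size_t i = 0; i < n; ++i)
    {
      const double dx = static_cast<double>(i) - mean_x;
      const double dy = m_window[i] - mean_y;
      sxx += dx * dx;
      sxy += dx * dy;
      syy += dy * dy;
    }

    const double slope = sxy / sxx;
    // Assumption: a perfectly flat window counts as a perfect fit.
    const double r2    = syy > 0.0 ? (sxy * sxy) / (sxx * syy) : 1.0;
    const double angle = std::atan(slope); // radians

    return std::abs(angle) < m_max_angle && r2 > m_min_r2;
  }

private:
  double m_max_angle;
  double m_min_r2;
  std::size_t m_window_size;
  double m_bin_width;
  std::map<long long, std::size_t> m_counts;
  std::size_t m_total{0};
  std::deque<double> m_window;
};
```

With this sketch, a stream of identical measurements keeps the entropy at zero, so the fitted line is flat and the rule stops as soon as the window fills. A stream in which every measurement lands in a new bin keeps the entropy growing, so the angle stays above the threshold and sampling continues.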
For now, stdrel stays as the default criterion. The decision on switching the default criterion will be made after some field experience.
Closes https://github.com/NVIDIA/nvbench/issues/150 and https://github.com/NVIDIA/nvbench/issues/147.