Closed: jwoudenberg closed this issue 6 years ago
FWIW, I'm reducing these numbers to two in the next version: runs per second and goodness of fit. Runs per second is pretty self-descriptive, but goodness of fit is not. In the new version, we vary the sample size in order to generate a trend line, and goodness of fit is a measure of the error in that trend. It's expressed as a percentage, and higher is better. So this advice will end up close to:
Also, the new approach solves these in the following ways:
In addition, I'm adding lots of charts. Just looking at the data shows problems more often than you'd suspect; humans are very good at noticing "hey, that's weird..." and then not trusting the results. So for example, I can show the points. That shows outliers easily, as well as jags due to system spikes. If I show the trend line, it'll be obvious whether it's a good or bad fit (it's kinda susceptible to outliers).
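To make the two numbers a bit more concrete, here is a rough sketch of how runs per second and a goodness-of-fit percentage could be derived from a trend line. It is written in Python for brevity, and it is not elm-benchmark's actual implementation: the function names, the millisecond units, the sample data, and the choice of an ordinary least-squares line with R² as the fit measure are all assumptions made for illustration.

```python
# Illustrative sketch only: names, units, and the use of R^2 are assumptions,
# not elm-benchmark's real code.

def fit_line(sample_sizes, times_ms):
    """Ordinary least-squares line: times_ms ~ slope * sample_size + intercept."""
    n = len(sample_sizes)
    mean_x = sum(sample_sizes) / n
    mean_y = sum(times_ms) / n
    sxx = sum((x - mean_x) ** 2 for x in sample_sizes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sample_sizes, times_ms))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def runs_per_second(sample_sizes, times_ms):
    """The slope is milliseconds per run, so invert it and scale to seconds."""
    slope, _ = fit_line(sample_sizes, times_ms)
    return 1000.0 / slope


def goodness_of_fit_percent(sample_sizes, times_ms):
    """R^2 as a percentage: 100 means every point sits on the trend line."""
    slope, intercept = fit_line(sample_sizes, times_ms)
    mean_y = sum(times_ms) / len(times_ms)
    ss_res = sum((y - (slope * x + intercept)) ** 2
                 for x, y in zip(sample_sizes, times_ms))
    ss_tot = sum((y - mean_y) ** 2 for y in times_ms)
    return 100.0 * (1.0 - ss_res / ss_tot)


# Hypothetical measurements: total time (ms) for each sample size.
sizes = [100, 200, 300, 400, 500]
clean = [10.0, 20.0, 30.0, 40.0, 50.0]   # perfectly linear
spiky = [10.0, 20.0, 30.0, 40.0, 90.0]   # last point hit by a system spike

print(runs_per_second(sizes, clean))
print(goodness_of_fit_percent(sizes, clean))
print(goodness_of_fit_percent(sizes, spiky))
```

In this sketch a single spiked sample is enough to pull the fit percentage down noticeably, which matches the point above about the trend line being susceptible to outliers.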
moved to elm-explorations/benchmark#4
In the same vein as the Elm compiler, it would be really nice if elm-benchmark gave us warnings, errors, and tips to help us write better benchmarks. From working with Brian a bit, I know he has tons of context on this, part of which could be automatically distributed in the benchmark report.
Below is an outline of some of Brian's tips that I remember, to give an idea of the type of helpful messages that could be displayed.