jgehrcke opened 1 year ago
We currently return a few (very underdocumented!) different comparisons from the `/compare` APIs:
For example, the `contender_z_score` metric is a numerical, continuous outcome, which is then thresholded by the input API parameter `threshold_z` to produce the binary `contender_z_regression` metric. That system uses the whole past distribution to inform the metrics.
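To make the thresholding step concrete, here is a minimal sketch (not the actual Conbench implementation; the function name and sign convention are assumptions for illustration):

```python
def z_regression(contender_z_score: float, threshold_z: float) -> bool:
    # Illustrative only: turn the continuous z-score into a binary flag.
    # A strongly negative z-score means the contender sits far below the
    # historical distribution; the sign convention depends on whether
    # lower or higher values are better for the benchmark in question.
    return contender_z_score < -threshold_z
```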
Separately, the `change` metric only looks at the percent difference between baseline and contender, and is thresholded by the `threshold` parameter to create the `regression` metric.
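Roughly, that looks like the following sketch (again illustrative only; the helper name and the "lower is better" assumption are mine, not the API's):

```python
def percent_change_regression(baseline: float, contender: float,
                              threshold: float) -> bool:
    # Illustrative only: percent difference between baseline and contender,
    # thresholded into a binary regression flag. Assumes "lower is better"
    # (e.g. runtime), so a positive change is a slowdown.
    change = (contender - baseline) / baseline * 100.0
    return change > threshold
```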
(This comment is not declaring that this is the best API; it's just meant as a note on prior art. 🙂)
Based on https://github.com/conbench/conbench/issues/530.
One simple method we want to introduce here: comparing two multisample data points (baseline and contender) using the uncertainty derived from multisampling, plus potentially an additional static tolerance. (new, basic method)
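One possible shape of such a comparison, as a hedged sketch (names, the use of the standard error of the mean as the uncertainty, and the "lower is better" direction are all assumptions, not a spec):

```python
import statistics


def multisample_regression(baseline_samples: list[float],
                           contender_samples: list[float],
                           static_tolerance: float = 0.0) -> bool:
    # Illustrative sketch: treat each data point as a small sample (needs
    # at least two samples per side for stdev), use the spread of the
    # samples as the uncertainty, and flag a regression only when the
    # contender mean is worse than the baseline mean by more than the
    # combined uncertainty plus an optional static tolerance.
    b_mean = statistics.mean(baseline_samples)
    c_mean = statistics.mean(contender_samples)
    # Standard error of the mean as a simple per-side uncertainty estimate.
    b_err = statistics.stdev(baseline_samples) / len(baseline_samples) ** 0.5
    c_err = statistics.stdev(contender_samples) / len(contender_samples) ** 0.5
    combined_uncertainty = (b_err ** 2 + c_err ** 2) ** 0.5
    # Assumes "lower is better" (e.g. runtime).
    return (c_mean - b_mean) > combined_uncertainty + static_tolerance
```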
This should also report the result of the existing method that looks at more than one datapoint (https://github.com/conbench/conbench/issues/583).