Open matanor opened 8 months ago
Today we have a mechanism for disabling confidence interval calculation: setting n_resamples to None. That mechanism backs a command line parameter in FM-Eval.
There is also a mechanism for specifying a list of score names for which confidence intervals are computed. This is currently implemented for instance metrics.
The suggestion is to control enabling and disabling of the confidence interval computation only through the list of score names, with an empty list indicating no computation. The n_resamples flag would no longer support a value of None.
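To make the suggestion concrete, here is a minimal sketch of what gating the computation on the list of score names could look like. It is not the library's actual code: the names compute_confidence_intervals, bootstrap_ci and ci_score_names are hypothetical, and a simple percentile bootstrap of the mean stands in for whatever resampling the metrics really use.

```python
import random
import statistics
from typing import Dict, List, Tuple


def bootstrap_ci(
    values: List[float],
    n_resamples: int = 1000,
    confidence_level: float = 0.95,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap CI for the mean (illustrative stand-in only)."""
    rng = random.Random(seed)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    )
    alpha = (1.0 - confidence_level) / 2.0
    low = means[int(alpha * (n_resamples - 1))]
    high = means[int((1.0 - alpha) * (n_resamples - 1))]
    return low, high


def compute_confidence_intervals(
    instance_scores: List[Dict[str, float]],
    ci_score_names: List[str],  # hypothetical name for the list of scores
    n_resamples: int = 1000,    # always an int here; None is not accepted
) -> Dict[str, Tuple[float, float]]:
    """Compute a CI per requested score name; an empty list disables CI."""
    result: Dict[str, Tuple[float, float]] = {}
    for name in ci_score_names:  # an empty list means the loop never runs
        values = [scores[name] for scores in instance_scores]
        result[name] = bootstrap_ci(values, n_resamples=n_resamples)
    return result
```

With this shape, a command line flag that disables confidence intervals would only need to pass an empty list, and n_resamples would keep a single meaning (the number of resamples).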
For which metrics is CI disabled, and why?
I can see why latency can become an issue, but this is the case only for global metrics. For instance metrics, the CI computation should be very fast.
Today confidence intervals are computed by default for the main_score. This PR adds the capability of computing confidence intervals for additional scores.
We would like to change the confidence interval default, such that it is not computed by default, but is only computed when explicitly requested in the metric.
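As a rough illustration of the opt-in default, a metric could declare the scores it wants intervals for, with nothing computed unless it does. The MetricConfig class and its fields below are hypothetical and only sketch the idea; they are not the existing metric classes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MetricConfig:
    """Hypothetical metric settings; field names are assumptions."""

    main_score: str
    # Default is an empty list: no confidence intervals unless requested.
    ci_score_names: List[str] = field(default_factory=list)
    n_resamples: int = 1000

    def ci_enabled(self) -> bool:
        # CI is computed only when at least one score name is listed.
        return len(self.ci_score_names) > 0


# A metric that says nothing gets no CI, not even for its main_score.
default_metric = MetricConfig(main_score="accuracy")

# A metric that wants CI states the scores explicitly.
opt_in_metric = MetricConfig(main_score="f1", ci_score_names=["f1", "precision"])
```

Under this sketch, a metric that wants the current behaviour would list its main_score explicitly.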