Closed michael-nml closed 6 months ago
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 78.67%. Comparing base (13ace29) to head (730ae35).
This PR fixes an error when calculating business value, confusion matrix and specificity for binary classification problems where a chunk contains only a single class.
Previously this would fail with:
This happens because the `sklearn.metrics.confusion_matrix` function NannyML uses internally bases its output on the number of classes present in the input. If only a single class is present, only 1 value is returned where we normally expect 4 for a binary classification problem. This PR resolves this by explicitly providing the expected classes in the `labels` argument. These expected classes are currently hard-coded as `[0, 1]`, but we may want to change this to derive values from the input if/when we support string-based classes for binary classification.

Additionally, this PR resolves an issue with F1 sampling error calculation when there are no positive cases present in the input. This previously resulted in a `ZeroDivisionError`. Now it results in a `NaN` sampling error.
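For reference, a minimal sketch of the `confusion_matrix` behaviour described above. It is a standalone illustration rather than the NannyML code itself, and the sample arrays are made up to mimic a chunk that contains only the negative class:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical chunk in which only the negative class (0) occurs.
y_true = np.array([0, 0, 0, 0])
y_pred = np.array([0, 0, 0, 0])

# Without `labels`, the matrix shape follows the classes seen in the data,
# so a single-class chunk yields a 1x1 matrix instead of 2x2.
inferred = confusion_matrix(y_true, y_pred)

# With `labels=[0, 1]`, the result is always 2x2 and can be unpacked safely.
fixed = confusion_matrix(y_true, y_pred, labels=[0, 1])
tn, fp, fn, tp = fixed.ravel()

print(inferred.shape, fixed.shape)
print(tn, fp, fn, tp)
```

With the inferred labels, unpacking `ravel()` into four names would fail because only one value is available. Passing `labels` keeps the layout fixed regardless of which classes appear in the chunk.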