NannyML / nannyml

nannyml: post-deployment data science in python
https://www.nannyml.com/
Apache License 2.0

Reference Drift Metrics #426

Open emrynHofmannElephant opened 2 days ago

emrynHofmannElephant commented 2 days ago

When calculating univariate drift, you "fit" the drift calculator on the reference data. How are the drift metrics for the chunks within the reference data then calculated? Are they compared to the overall distribution of the reference data?

jakubnml commented 2 days ago

Yes, that's how it is done currently, and we are aware it is not the optimal way. Good job on spotting that though 👏

So the correct way is: when calculating a drift metric for a chunk that is a subset of the reference data, the observations belonging to that chunk should be "removed" from the reference data used for the comparison - just like in cross-validation. Otherwise some of the drift metrics are lower than they really should be, because one dataset (the reference chunk) is a subset of the other (the whole reference). As a result, in an extreme situation, one may have perfectly iid data, yet the drift metrics on the reference chunks will be lower than on the monitored (analysis) data - with iid data they shouldn't be.
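The bias is easy to demonstrate with a toy numpy sketch (this is not nannyml's implementation, just the principle): on perfectly iid data, comparing each chunk against the full reference that contains it yields systematically lower drift than comparing it against the held-out remainder.

```python
import numpy as np

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap
    between the two empirical CDFs."""
    a, b = np.sort(a), np.sort(b)
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / len(a)
    cdf_b = np.searchsorted(b, grid, side="right") / len(b)
    return np.abs(cdf_a - cdf_b).max()

rng = np.random.default_rng(0)
reference = rng.normal(size=(10, 1000))  # 10 reference chunks, perfectly iid

in_sample, held_out = [], []
for i in range(10):
    chunk = reference[i]
    full = reference.ravel()                        # chunk is a subset of this
    rest = np.delete(reference, i, axis=0).ravel()  # chunk excluded, CV-style
    in_sample.append(ks_statistic(chunk, full))
    held_out.append(ks_statistic(chunk, rest))

# comparing against the full reference systematically understates drift
print(np.mean(in_sample) < np.mean(held_out))  # → True
```

In fact, because the full reference's empirical CDF is a 10%/90% mixture of the chunk's and the remainder's, the in-sample statistic here is exactly 0.9× the held-out one for every chunk.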

We plan to fix this, either by enforcing the new, correct way, or by making it the default while keeping the old way as an option - it can sometimes be beneficial because of its lower computational cost. I can't say exactly when, because our current focus is on research related to performance estimation methods.

Before we fix it, if you really want, you can hack it on your own by fitting the calculator multiple times, each time on the subset of the reference data that does not contain the reference chunk of interest.
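That leave-one-chunk-out loop might look like the sketch below. `MeanShiftCalculator` is a toy stand-in that only mirrors the fit/calculate shape of a drift calculator - it is not nannyml's API; in practice you would fit a real nannyml calculator on each reduced reference set instead.

```python
import numpy as np

class MeanShiftCalculator:
    """Toy stand-in for a drift calculator (hypothetical, not nannyml's API).
    fit() learns reference statistics; calculate() scores a chunk against them."""

    def fit(self, reference):
        self.mu = reference.mean()
        self.sigma = reference.std()
        return self

    def calculate(self, chunk):
        # standardized shift of the chunk mean vs the fitted reference
        return abs(chunk.mean() - self.mu) / self.sigma

rng = np.random.default_rng(1)
chunks = [rng.normal(size=500) for _ in range(10)]  # reference split into chunks

# leave-one-chunk-out: fit on all reference chunks except the one being scored
loo_scores = []
for i, chunk in enumerate(chunks):
    rest = np.concatenate([c for j, c in enumerate(chunks) if j != i])
    loo_scores.append(MeanShiftCalculator().fit(rest).calculate(chunk))
```

Note the cost: with k reference chunks this runs k fits instead of one, which is the "lower computational cost" trade-off of the current behaviour mentioned above.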