Open cmougan opened 1 year ago
The ClassifierDrift detector is trained on a portion of the combined reference set x_ref and test set x_test. If the train_size argument is a float between 0 and 1, then a random sample of size int(train_size * (len(x_ref) + len(x_test))) from the combined data [x_ref, x_test] is used for training. The held out fraction 1 - train_size is then used for testing for drift. If we instead specify n_folds as an int we apply cross-validation to ensure we leverage all the data for both training and out-of-sample testing. The n_folds argument has priority over train_size. This is clarified in the docs under the detector's usage section: https://docs.seldon.io/projects/alibi-detect/en/stable/cd/methods/classifierdrift.html#Usage
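To make the two modes concrete, here is a rough sketch of the index bookkeeping only. It is not alibi-detect's actual code: split_indices and fold_indices are hypothetical helpers, the sizes are made up, and the folds are plain unstratified partitions chosen for simplicity.

```python
import random


def split_indices(n_ref, n_test, train_size=0.75, seed=0):
    """Hypothetical helper mimicking the train_size split described above."""
    n = n_ref + n_test
    # Label 0 marks reference instances, label 1 marks test instances.
    labels = [0] * n_ref + [1] * n_test
    # Size of the random training sample drawn from the combined data.
    n_train = int(train_size * n)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    # The first n_train shuffled indices train the classifier,
    # the remainder is held out for the drift test.
    return labels, idx[:n_train], idx[n_train:]


def fold_indices(n_ref, n_test, n_folds=5, seed=0):
    """Hypothetical helper: each instance is held out in exactly one fold."""
    idx = list(range(n_ref + n_test))
    random.Random(seed).shuffle(idx)
    # Simplification: unstratified folds built by striding over the shuffle.
    return [idx[i::n_folds] for i in range(n_folds)]


labels, train_idx, held_out_idx = split_indices(n_ref=100, n_test=60)
print(len(train_idx), len(held_out_idx))  # 120 40

folds = fold_indices(n_ref=100, n_test=60, n_folds=5)
print(sum(len(f) for f in folds))  # 160
```

With train_size only part of the data contributes out-of-sample predictions, whereas with folds every instance of the combined set is scored out of sample exactly once.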
Thanks for the clarification and link!
I was thinking that perhaps we can improve the documentation, either by extending it or by adding a link. What do you think?
On which data is the ClassifierDrift detector trained? The documentation does not state this very clearly.