yogendra-yatnalkar closed this issue 2 years ago
Hi @yogendra-yatnalkar , thanks for the kind words. A few thoughts on scaling drift detectors to millions of images:
Please also note that it is typically not recommended to use univariate detectors such as the K-S detector on high-dimensional data such as images, although I am glad it seems to be working well in this case. :)
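For concreteness, the subsampling approach discussed above (fitting the detector on a random subset of the reference data rather than all of it) can be sketched in plain Python. Everything here is a toy stand-in — the paths are fabricated and `subsample_reference` is a hypothetical helper, not an alibi-detect function:

```python
import random

def subsample_reference(image_paths, n_ref=5_000, seed=0):
    """Pick a reproducible random subset of reference image paths.

    Only the sampled images ever need to be decoded into memory,
    so the full reference set can stay on disk (millions of files).
    """
    rng = random.Random(seed)
    n_ref = min(n_ref, len(image_paths))
    return rng.sample(image_paths, n_ref)

# Hypothetical usage: 100k candidate paths, but only 5k get loaded.
paths = [f"img_{i:06d}.png" for i in range(100_000)]
ref_paths = subsample_reference(paths, n_ref=5_000)
```

The sampled paths can then be loaded batch by batch to build the (much smaller) reference array that the detector is actually fit on.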
@arnaudvl Thank you for the detailed explanation. I understand the point: if we need to detect drift on a large dataset, we can work with a sample of it.
Before closing this thread, I would like to ask a beginner's question: which detector would you prefer for image-related tasks? Thanking you in advance.
Definitely not a beginner question since it can be quite tricky! It depends a bit on what you are trying to detect drift on. As explained here, we can understand drift as a change in P(x,y) (with x the input images and y the ground truth) between the reference and test data. This change can happen because P(x) changed (covariate shift), P(y) changed (target drift) or P(y|x) changed (concept drift). Note that multiple types of drift can happen at once! So it's usually a good idea to take that into account when setting up a monitoring system.
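To make that taxonomy concrete, here is a toy simulation in plain Python (fabricated data, nothing to do with the alibi-detect API): the same labelling rule y = 1[x > threshold] under a change in P(x), a change in P(y|x), and a direct change in P(y):

```python
import random

rng = random.Random(0)

def make_data(n, x_mean=0.0, threshold=0.0, drop_pos=0.0):
    """Toy data: x ~ Normal(x_mean, 1), y = 1 if x > threshold else 0.

    - x_mean    changes P(x)    -> covariate shift
    - threshold changes P(y|x)  -> concept drift
    - drop_pos  relabels that fraction of positives to 0 -> changes P(y)
    """
    xs = [rng.gauss(x_mean, 1.0) for _ in range(n)]
    ys = [1 if x > threshold else 0 for x in xs]
    if drop_pos:
        ys = [0 if (y == 1 and rng.random() < drop_pos) else y for y in ys]
    return xs, ys

def mean(v):
    return sum(v) / len(v)

ref_x, ref_y = make_data(5000)                      # reference P(x, y)
cov_x, cov_y = make_data(5000, x_mean=1.0)          # covariate shift
con_x, con_y = make_data(5000, threshold=0.5)       # concept drift
tgt_x, tgt_y = make_data(5000, drop_pos=0.3)        # target drift via relabelling
```

Note that `cov_y` also shifts even though only P(x) was changed — the toy version of the point above that multiple drift types can happen at once.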
We often cannot directly detect changes in P(y) since we typically don't have immediate access to the test data ground truth. We can, however, proxy this via e.g. detecting drift on the model prediction distribution. So if the model predicts classes (univariate), use the Chi^2 detector; if it predicts a probability distribution, you can use the K-S detector for low dimensions (e.g. binary) or the MMD or LSDD detectors. The latter two also have online equivalents (MMD online and LSDD online). Check the benefits of online detectors here. Another useful detector on model predictions is the uncertainty drift detector, which can serve as a proxy for model performance deterioration. Note that these methods do not rely on the data modality of the input, just the model predictions.
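The statistical idea behind running a Chi^2-style detector on predicted class labels can be sketched from scratch. This is an illustration only — a two-sample chi-squared statistic with a permutation p-value in plain Python, on fabricated prediction streams, not the alibi-detect detector itself:

```python
import random
from collections import Counter

def chi2_stat(ref_labels, test_labels):
    """Two-sample chi-squared statistic on class-label counts."""
    total = Counter(ref_labels) + Counter(test_labels)
    n_all = len(ref_labels) + len(test_labels)
    stat = 0.0
    for sample in (ref_labels, test_labels):
        counts, n = Counter(sample), len(sample)
        for c in total:
            expected = n * total[c] / n_all
            stat += (counts[c] - expected) ** 2 / expected
    return stat

def perm_pvalue(ref_labels, test_labels, n_perm=200, seed=0):
    """p-value via permutation: how often shuffled splits beat the observed stat."""
    rng = random.Random(seed)
    observed = chi2_stat(ref_labels, test_labels)
    pooled = list(ref_labels) + list(test_labels)
    n_ref, hits = len(ref_labels), 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if chi2_stat(pooled[:n_ref], pooled[n_ref:]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Toy predicted-class streams: same class mix vs. a shifted class mix.
rng = random.Random(1)
ref     = [rng.choices([0, 1, 2], weights=[5, 3, 2])[0] for _ in range(500)]
same    = [rng.choices([0, 1, 2], weights=[5, 3, 2])[0] for _ in range(500)]
shifted = [rng.choices([0, 1, 2], weights=[2, 3, 5])[0] for _ in range(500)]
p_same = perm_pvalue(ref, same)        # typically large: no drift flagged
p_shifted = perm_pvalue(ref, shifted)  # small: drift flagged
```

Flag drift when the p-value falls below a chosen threshold (e.g. 0.05), exactly as the detectors above do.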
To detect possible changes in P(x) where x consists of images, we typically want to apply a dimensionality reduction step first. More on that here. For images this can be for instance the encodings from a pretrained autoencoder. This notebook contains a worked example on medical image data doing just that. Then almost any detector can be used (e.g. MMD or LSDD) on the encodings. Alternatively, if you don't want this encoding step, you can directly train a domain classifier which learns to distinguish the reference from the test set or use a (deep) learned kernel. An example for both on images can be found here.
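As a toy illustration of the MMD statistic applied to already-encoded data, here is a from-scratch RBF-kernel MMD with a permutation p-value on fabricated 2-d "encodings". alibi-detect's MMD detector implements this properly (and efficiently); treat this purely as a sketch of the underlying idea:

```python
import math
import random

def rbf(u, v, gamma=1.0):
    """RBF kernel between two encoding vectors."""
    d2 = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * d2)

def mmd2(xs, ys, gamma=1.0):
    """Biased estimate of the squared MMD between two sets of encodings."""
    kxx = sum(rbf(a, b, gamma) for a in xs for b in xs) / len(xs) ** 2
    kyy = sum(rbf(a, b, gamma) for a in ys for b in ys) / len(ys) ** 2
    kxy = sum(rbf(a, b, gamma) for a in xs for b in ys) / (len(xs) * len(ys))
    return kxx + kyy - 2 * kxy

def mmd_pvalue(xs, ys, n_perm=100, seed=0):
    """Permutation p-value for the MMD^2 two-sample test."""
    rng = random.Random(seed)
    observed = mmd2(xs, ys)
    pooled = xs + ys
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if mmd2(pooled[:len(xs)], pooled[len(xs):]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Toy 2-d "encodings": reference vs. a mean-shifted test set.
rng = random.Random(2)
ref  = [(rng.gauss(0.0, 1), rng.gauss(0.0, 1)) for _ in range(40)]
test = [(rng.gauss(1.5, 1), rng.gauss(1.5, 1)) for _ in range(40)]
p = mmd_pvalue(ref, test)  # small: the encoding distribution has drifted
```

In practice the encodings would come from the dimensionality-reduction step described above (e.g. a pretrained autoencoder), and the kernel bandwidth would be set from the data rather than fixed at `gamma=1.0`.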
Lastly, if you do have access to the labels, then you can directly apply a supervised drift detector such as the Cramer-von Mises detector or Fisher's Exact Test. These methods again don't rely on the data modality of the input, just on the labels.
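For the supervised case, the arithmetic behind Fisher's Exact Test on binary labels (e.g. correct/incorrect indicators for a reference window vs. a test window) can be written out with `math.comb`. Again, this is a sketch of the statistic on made-up counts, not the alibi-detect detector:

```python
from math import comb

def fisher_exact(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows: (reference window, test window); cols: (label 1, label 0) counts.
    Returns the probability, under fixed margins, of all tables at most
    as likely as the observed one (the two-sided p-value).
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def table_prob(x):  # hypergeometric P(top-left cell == x)
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    observed = table_prob(a)
    lo = max(0, col1 - row2)   # feasible range for the top-left cell
    hi = min(col1, row1)
    return sum(p for x in range(lo, hi + 1)
               if (p := table_prob(x)) <= observed + 1e-12)

# Toy label windows: accuracy drops from 45/50 correct to 30/50 correct.
p_drift = fisher_exact(45, 5, 30, 20)  # small: flag drift
p_none  = fisher_exact(45, 5, 44, 6)   # large: no drift
```

The same windowed-counts setup applies to any binary label stream; for the continuous supervised case (e.g. per-instance losses), the Cramer-von Mises detector plays the analogous role.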
Hi team, first of all, thanks a lot for the amazing work.
I was working on drift detection for an image dataset and used K-S drift detection on a small dataset for prototyping. It worked great, but I was wondering: if in the future I have a dataset containing millions of images, what can be done?
When fitting the drift artifacts on large datasets, it would not be possible to load that many images into memory. Is there any way to deal with this?
Please note: I see it could be done in two ways, but did not find support for the points below. Please let me know if it's possible?