Closed paulnovello closed 1 year ago
Current status: ✅

Overall Coverage

| Lines | Covered | Coverage | Threshold | Status |
|---|---|---|---|---|
| 2117 | 1958 | 92% | 70% | 🟢 |

New Files

| File | Coverage | Status |
|---|---|---|
| oodeel/methods/gram.py | 98% | 🟢 |
| TOTAL | 98% | 🟢 |

Modified Files

| File | Coverage | Status |
|---|---|---|
| oodeel/__init__.py | 100% | 🟢 |
| oodeel/eval/plots/features.py | 94% | 🟢 |
| oodeel/eval/plots/plotly.py | 92% | 🟢 |
| oodeel/extractor/feature_extractor.py | 94% | 🟢 |
| oodeel/extractor/keras_feature_extractor.py | 96% | 🟢 |
| oodeel/extractor/torch_feature_extractor.py | 98% | 🟢 |
| oodeel/methods/__init__.py | 100% | 🟢 |
| oodeel/methods/base.py | 83% | 🟢 |
| oodeel/methods/dknn.py | 100% | 🟢 |
| oodeel/methods/mahalanobis.py | 98% | 🟢 |
| oodeel/methods/vim.py | 86% | 🟢 |
| oodeel/utils/operator.py | 100% | 🟢 |
| oodeel/utils/tf_operator.py | 94% | 🟢 |
| oodeel/utils/tf_training_tools.py | 75% | 🟢 |
| oodeel/utils/torch_operator.py | 94% | 🟢 |
| oodeel/utils/torch_training_tools.py | 92% | 🟢 |
| TOTAL | 94% | 🟢 |

Updated for commit: 8363e52 by action🐍
LGTM
Implementation of the Gram baseline from "Detecting Out-of-Distribution Examples with Gram Matrices" (link).
While implementing this class, we added a new feature to the FeatureExtractor class: the ability to process feature maps on the fly, batch-wise, and to return the processed features directly as the output of `.predict_tensor()`. This is crucial for this method, since the computations performed on feature maps are quite intensive and would lead to OOM if not applied batch-wise. This feature will also be useful for other baselines, e.g. DKNN or Mahalanobis (or the planned NMD), so that they can process internal feature maps. In that case, the `postproc_fn` could be some pooling + flatten.

Important Disclaimer: Taking the min/max deviation statistics, as in the paper, raises some problems.
The method often yields a score of zero for some tasks. This is expected, since the min/max over the samples of a random variable becomes more and more extreme as the sample size grows. As a result, computing the min/max over the whole training set is likely to produce values so extreme that none of the in-distribution correlations of the validation set goes beyond these thresholds. Worse, a significant part of the OOD data does not exceed the thresholds either. This can be alleviated by computing the min/max over a limited number of samples. However, this is counter-intuitive and, in our opinion, not desirable: adding more information should only improve a method.
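To illustrate the argument above, here is a minimal sketch on synthetic data (not the oodeel code; the function name `fraction_outside_extrema` and the Gaussian toy data are assumptions made for this example). It measures how many held-out samples fall outside the min/max range of a training sample as that training sample grows.

```python
import numpy as np


def fraction_outside_extrema(n_train, n_val=10_000, seed=0):
    """Fraction of held-out samples lying outside the [min, max] of a training sample."""
    rng = np.random.default_rng(seed)
    # Draw the held-out set first so that it is identical for every n_train.
    val = rng.standard_normal(n_val)
    train = rng.standard_normal(n_train)
    lo, hi = train.min(), train.max()
    return float(np.mean((val < lo) | (val > hi)))


# The larger the training sample, the wider [min, max] gets,
# and the fewer held-out points exceed it.
for n in (100, 10_000, 1_000_000):
    print(n, fraction_outside_extrema(n))
```

With a large enough training sample, almost no held-out point exceeds the extrema, which is the situation in which the original deviation collapses to zero.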
Hence, we decided to replace the min/max by the q and 1-q quantiles, with q a new parameter of the method. Specifically, instead of the deviation as defined in eq. 3 of the paper, we use the definition
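A quantile-based deviation of this kind can be sketched as follows, assuming it keeps the piecewise form of eq. 3 of the paper with the minimum and maximum swapped for the $q$ and $1-q$ quantiles. The symbols $\delta_q$, $Q_q$, $Q_{1-q}$ (training-set quantiles of a given correlation statistic) and $v$ (the value observed at test time) are notation assumed here, with $q < 0.5$ so that $Q_q \le Q_{1-q}$:

```latex
\delta_q(v) =
\begin{cases}
0 & \text{if } Q_q \le v \le Q_{1-q}, \\[4pt]
\dfrac{Q_q - v}{\lvert Q_q \rvert} & \text{if } v < Q_q, \\[8pt]
\dfrac{v - Q_{1-q}}{\lvert Q_{1-q} \rvert} & \text{if } v > Q_{1-q}.
\end{cases}
```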
With this new deviation, the more points we add, the more accurate the quantile estimates become. In addition, the method can be made more or less discriminative by tuning the value of q.
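As an illustration of the quantile-based deviation, here is a minimal NumPy sketch. It is not the implementation in `oodeel/methods/gram.py`: the function names `fit_quantile_bounds` and `quantile_deviation`, the `eps` guard against division by zero, and the sum over features are all assumptions made for this example.

```python
import numpy as np


def fit_quantile_bounds(train_values, q=0.01):
    """Per-feature lower/upper bounds: the q and 1-q quantiles over the rows (samples)."""
    lower = np.quantile(train_values, q, axis=0)
    upper = np.quantile(train_values, 1.0 - q, axis=0)
    return lower, upper


def quantile_deviation(values, lower, upper, eps=1e-12):
    """Relative overshoot beyond [lower, upper], summed over features for each sample."""
    values = np.atleast_2d(values)
    # Each term is zero when the value lies inside the interval.
    below = np.maximum(lower - values, 0.0) / (np.abs(lower) + eps)
    above = np.maximum(values - upper, 0.0) / (np.abs(upper) + eps)
    return (below + above).sum(axis=1)


# Toy data standing in for per-sample statistics (hypothetical, Gaussian).
rng = np.random.default_rng(0)
train = rng.standard_normal((5000, 8))
lower, upper = fit_quantile_bounds(train, q=0.01)

in_dist = rng.standard_normal((1000, 8))
shifted = rng.standard_normal((1000, 8)) + 3.0  # crude stand-in for OOD data

print("in-distribution mean score:", quantile_deviation(in_dist, lower, upper).mean())
print("shifted mean score:", quantile_deviation(shifted, lower, upper).mean())
```

In this sketch, a larger q narrows the interval, so more samples receive a non-zero score, while a smaller q moves the bounds back toward the min/max behaviour.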
Finally, we found that this approach improved the performance of the baseline in our experiments.
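Coming back to the batch-wise processing of feature maps described at the top of this PR, the idea can be sketched independently of any framework. The helper names `pool_and_flatten`, `predict_in_batches` and `fake_layer` below are hypothetical and do not reflect the actual FeatureExtractor signature; NumPy arrays stand in for framework tensors.

```python
import numpy as np


def pool_and_flatten(feature_maps):
    """Global-average-pool (N, H, W, C) feature maps into flat (N, C) vectors."""
    return feature_maps.mean(axis=(1, 2))


def predict_in_batches(extract_fn, inputs, batch_size, postproc_fn=None):
    """Extract features batch by batch, reducing each batch before storing it."""
    outputs = []
    for start in range(0, len(inputs), batch_size):
        features = extract_fn(inputs[start:start + batch_size])
        if postproc_fn is not None:
            # Only the reduced features are kept, so the full feature maps
            # of the whole dataset never sit in memory at the same time.
            features = postproc_fn(features)
        outputs.append(features)
    return np.concatenate(outputs, axis=0)


def fake_layer(x):
    """Stand-in for an internal network layer: (N, H, W) -> (N, H, W, 16)."""
    return np.repeat(x[..., None], 16, axis=-1)


images = np.random.default_rng(0).random((10, 32, 32))
features = predict_in_batches(fake_layer, images, batch_size=4, postproc_fn=pool_and_flatten)
print(features.shape)  # (10, 16)
```

Global average pooling already yields one flat vector per sample here; other pooling choices would need an explicit reshape afterwards.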