Jingkang50 / OpenOOD

Benchmarking Generalized Out-of-Distribution Detection
MIT License

confusion about the F-OOD evaluation metrics #230

Closed Esther-PAN closed 6 months ago

Esther-PAN commented 7 months ago

Thank you for your work. I'm a little confused about the F-OOD evaluation metrics. The text says that during testing the model is exposed to ID samples that include covariate-shifted samples. However, the metrics do not include ID accuracy; they only cover FPR95, AUROC, and AUPR against the OOD test set. So which metrics reflect the model's ability to generalize to covariate-shifted samples? Please point out if my understanding of F-OOD is off in any way. Thanks!

zjysteven commented 7 months ago

Since covariate-shifted ID (csID) samples are treated as ID, all three OOD metrics reflect OOD generalization to some extent: if the model does generalize to csID samples, it should be able to separate ID + csID samples, which share ID semantics, from OOD samples, which do not. We also report ID accuracy as one metric in the full result table here, which reflects OOD generalization most directly, as you mentioned.
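
To make the point concrete, here is a minimal sketch of how pooling csID samples with ID samples changes the detection metrics. It is an illustration only, not OpenOOD's evaluation code: the function names, the toy confidence scores, and the convention that a higher score means "more ID-like" (with ID as the positive class) are all assumptions made for this example.

```python
import math

def auroc(id_scores, ood_scores):
    """Probability that a random ID sample outscores a random OOD sample (ties count 0.5)."""
    wins = 0.0
    for s_id in id_scores:
        for s_ood in ood_scores:
            if s_id > s_ood:
                wins += 1.0
            elif s_id == s_ood:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))

def fpr_at_95_tpr(id_scores, ood_scores):
    """Fraction of OOD samples accepted as ID at a threshold that keeps at least 95% of ID samples."""
    sorted_id = sorted(id_scores)
    k = math.floor(0.05 * len(sorted_id))  # at most 5% of ID samples fall below the threshold
    threshold = sorted_id[k]
    return sum(s >= threshold for s in ood_scores) / len(ood_scores)

# Hypothetical confidence scores (higher = more ID-like); not real results.
id_scores = [0.90, 0.80, 0.95, 0.85]    # clean ID test samples
csid_scores = [0.70, 0.40, 0.75, 0.60]  # covariate-shifted ID samples
ood_scores = [0.30, 0.50, 0.20, 0.10]   # semantically OOD samples

# Standard setting: ID vs. OOD.
std_auroc = auroc(id_scores, ood_scores)
std_fpr = fpr_at_95_tpr(id_scores, ood_scores)

# Full-spectrum setting: csID is pooled with ID before comparing against OOD.
fs_id_scores = id_scores + csid_scores
fs_auroc = auroc(fs_id_scores, ood_scores)
fs_fpr = fpr_at_95_tpr(fs_id_scores, ood_scores)

print(f"standard      AUROC={std_auroc:.4f}  FPR@95={std_fpr:.2f}")
print(f"full-spectrum AUROC={fs_auroc:.4f}  FPR@95={fs_fpr:.2f}")
```

In this toy case the model is less confident on csID samples, so one csID sample scores below an OOD sample and both metrics get worse once csID is pooled in. That drop is the sense in which the OOD metrics reflect generalization to covariate shift; a model that kept high confidence on csID samples would show little or no gap between the two settings.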