p-lambda / wilds

A machine learning benchmark of in-the-wild distribution shifts, with data loaders, evaluators, and default models.
https://wilds.stanford.edu
MIT License

Calculation of OOD within the paper #125

Closed dalbertweiss closed 2 years ago

dalbertweiss commented 2 years ago

Hello everyone,

First of all, I would like to thank you for making the paper and code for "WILDS: A Benchmark of in-the-Wild Distribution Shifts" publicly available. What caught my interest when reading the paper was the estimation of the in-distribution (IID) and out-of-distribution (OOD) performance, which was evaluated using empirical risk minimization (Table 1, page 20). My question is how the IID and OOD values were calculated. Did you use softmax with temperature scaling, as in the paper "Enhancing the reliability of out of distribution image detection in neural networks"? If not, could you point me to a reference for how you tackled this problem?

Thank you in advance for your kind reply.

kohpangwei commented 2 years ago

No, we didn't use temperature scaling, and we're also not doing OOD detection. We trained the models listed in Section 5.3 of the paper (also see Appendix D for details), and then evaluated them on the ID and OOD test sets. Does that make sense?
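
To illustrate the distinction drawn above, here is a minimal sketch of evaluating one trained model separately on an ID test set and an OOD test set, with no OOD detection step. It is not the WILDS code: `accuracy`, `evaluate_splits`, `threshold_model`, the split names, and the toy data are all hypothetical and stand in for a real model and real test splits.

```python
from typing import Callable, Dict, List, Tuple

# Each example is a (feature, label) pair; the data here is a toy assumption.
Example = Tuple[float, int]


def accuracy(model: Callable[[float], int], examples: List[Example]) -> float:
    """Fraction of examples whose predicted label matches the true label."""
    correct = sum(1 for x, y in examples if model(x) == y)
    return correct / len(examples)


def evaluate_splits(
    model: Callable[[float], int], splits: Dict[str, List[Example]]
) -> Dict[str, float]:
    """Evaluate the same fixed model on every named test split."""
    return {name: accuracy(model, examples) for name, examples in splits.items()}


def threshold_model(x: float) -> int:
    """Hypothetical stand-in for a trained classifier."""
    return int(x > 0.5)


# Hypothetical splits: "id_test" follows the training distribution,
# "ood_test" comes from a shifted distribution.
splits = {
    "id_test": [(0.1, 0), (0.9, 1), (0.2, 0), (0.8, 1)],
    "ood_test": [(0.6, 0), (0.9, 1), (0.4, 1), (0.1, 0)],
}

results = evaluate_splits(threshold_model, splits)
for name, acc in results.items():
    print(f"{name}: accuracy = {acc:.2f}")
```

In this sketch the ID and OOD numbers are simply the same metric computed on two different test sets, and the gap between them reflects how much performance drops under the shift.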