Closed: debby1103 closed this issue 1 year ago
Which split of Waterbirds is used: the train set or the validation set?
Hi! The Waterbirds dataset (https://github.com/kohpangwei/group_DRO) is constructed by cropping birds out of photos in the Caltech-UCSD Birds-200-2011 (CUB) dataset (Wah et al., 2011) and transferring them onto backgrounds from the Places dataset (Zhou et al., 2017). For evaluation, the in-distribution (ID) val set of Waterbirds is used. The background spurious OOD test set (which contains spuriously correlated background images) can be downloaded here: https://drive.google.com/file/d/1CBe9f8yHIlQnXYmNQj45DsqB5vCT6qQ6/view?usp=share_link
Thanks for your help! I got the Waterbirds dataset (waterbird_complete95_forest2water2) and the OOD dataset. I found 1.2k validation samples (split=1) and used these images as ID samples; the OOD samples are all 10k images from Spurious OOD. I adopted the 200 bird categories as ID classes. However, the FPR95 score for MCM is much larger than expected (33.67 vs. 5.87). Is there anything wrong with my experiment setup? Thanks again!
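For anyone checking the same setup, here is a minimal sketch of selecting the validation images by split. It assumes the Waterbirds folder ships a metadata CSV with `img_filename` and `split` columns and that split=1 marks validation, as described above; the column names and the helper `select_split` are assumptions for illustration, not code from the repository.

```python
import csv
import io


def select_split(metadata_file, split_id=1):
    """Return the image filenames whose `split` column equals split_id."""
    reader = csv.DictReader(metadata_file)
    return [row["img_filename"] for row in reader if int(row["split"]) == split_id]


# Tiny inline sample standing in for the real metadata file (made-up rows).
sample = io.StringIO(
    "img_id,img_filename,y,split,place,place_filename\n"
    "1,a.jpg,0,0,0,x.jpg\n"
    "2,b.jpg,1,1,1,y.jpg\n"
    "3,c.jpg,0,2,0,z.jpg\n"
)
val_files = select_split(sample)
print(val_files)
```

With the real dataset, the same function could be called on the opened metadata file (for example `open(path, newline="")`, where `path` points at the CSV inside the dataset folder), and the length of the result compared against the roughly 1.2k validation samples mentioned above.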
Hi! I just tested CUB (ID) vs. Placesbg (spurious OOD) and here are the results I get:
FPR95 & AUROC & AUPR
5.90 & 98.38 & 96.04
If I decrease the temperature (T) from 1 to 0.01, the results are significantly worse:
FPR95 & AUROC & AUPR
41.89 & 87.80 & 78.99
What is the T you used?
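To make the role of T concrete, here is a rough sketch of how a temperature-scaled maximum-softmax score and FPR95 can be computed with NumPy. It assumes the score is the maximum softmax probability over cosine similarities divided by T, and that FPR95 is the fraction of OOD samples scoring at or above the threshold that keeps 95% of ID samples. The function names and the toy features are hypothetical and this is not the repository's implementation.

```python
import numpy as np


def mcm_scores(image_feats, text_feats, T=1.0):
    """Max softmax probability over temperature-scaled cosine similarities."""
    img = image_feats / np.linalg.norm(image_feats, axis=1, keepdims=True)
    txt = text_feats / np.linalg.norm(text_feats, axis=1, keepdims=True)
    logits = (img @ txt.T) / T
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=1, keepdims=True)
    return probs.max(axis=1)


def fpr_at_95_tpr(id_scores, ood_scores):
    """Fraction of OOD samples at or above the score that 95% of ID samples reach."""
    threshold = np.percentile(id_scores, 5)
    return float(np.mean(np.asarray(ood_scores) >= threshold))


# Toy features (made up) showing how strongly T changes the score scale.
feats = np.array([[1.0, 0.0], [0.0, 1.0]])
print(mcm_scores(feats, feats, T=1.0))
print(mcm_scores(feats, feats, T=0.01))
```

Because T rescales every similarity before the softmax, the same features can give very different score distributions, so it is worth checking T first when FPR95 is far from the expected value.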
Oops, I see now. I successfully replicated the result of CUB (ID) vs. Placesbg (spurious OOD): 5.71 FPR95 with T=1 and 41.89 FPR95 with T=0.01. Thanks for your timely help!
Hello, I'm also having difficulty reproducing the spurious OOD results. First, I'd like to ask about the CUB (ID) vs. Placesbg (spurious OOD) setting mentioned above. Does CUB refer to the validation set of waterbird_complete95_forest2water2, or to the original CUB_200_2011 dataset? When I use CUB_200_2011 as the ID set, I obtain 7.78/98.26, but when I use waterbird_complete95_forest2water2, the overall results drop significantly. I'm not sure if there's a mistake in my setup. Thank you!
Are there any configuration details or pre-split datasets available? Thanks!