sxzrt / CIFAR-10-W

CIFAR-10-Warehouse: Towards Broad and More Realistic Testbeds in Model Generalization Analysis
MIT License

Congrats and additional questions on the datasets you used in the study! #1

Open black0017 opened 2 weeks ago

black0017 commented 2 weeks ago

Hello, and congrats on the amazing contribution to the community! I am really glad I discovered your work today.

I have two simple questions about Table 1 in https://arxiv.org/pdf/2310.04414 :

  1. What is CIFAR-10.1-C (Hendrycks & Dietterich, 2019), with 190K images? Is it just 5K random samples from each of the 19 datasets?
  2. CIFAR-10-W consists of ~608K samples. Have you proposed a subset benchmark that I can evaluate my classifier on? I would be curious to see how you would go about sampling from 180 different domains. Something around 190K images would be nice, to match the size of Hendrycks & Dietterich, 2019.

Your feedback is highly appreciated!

By the way, the Google Drive link in the repo's README is hidden due to a small markdown error! You can very easily fix that :)

Thank you so much @sxzrt and have a nice day!

Nikolas.

sxzrt commented 2 weeks ago

Hi Nikolas,

Thank you very much for your kind words and your interest in CIFAR-10-W. I’ll do my best to provide you with the answers.

For question 1, we used the code at https://github.com/hendrycks/robustness/blob/8190fe329f5f072a06e3a2aea02bb5dda69aed9f/ImageNet-C/create_c/make_cifar_c.py#L422-L442 as a reference to generate the CIFAR-10.1-C dataset. CIFAR-10.1-C consists of 190,000 images: it includes 19 corruptions, and each corruption has 10,000 images generated from the CIFAR-10.1 dataset.
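
As a quick sanity check on that count, here is a minimal sketch of the arithmetic. The 19 corruptions and 10,000 images per corruption come from the reply above. The breakdown of 10,000 into 2,000 base images times five severity levels is an assumption (it is not stated in this thread) based on the usual size of CIFAR-10.1 and the severity levels used in the referenced script.

```python
# Figures stated in the reply above.
NUM_CORRUPTIONS = 19
IMAGES_PER_CORRUPTION = 10_000

total_images = NUM_CORRUPTIONS * IMAGES_PER_CORRUPTION

# ASSUMPTION (not stated in the thread): CIFAR-10.1 has 2,000 images and
# each corruption is applied at 5 severity levels.
ASSUMED_BASE_IMAGES = 2_000
ASSUMED_SEVERITY_LEVELS = 5
images_per_corruption_assumed = ASSUMED_BASE_IMAGES * ASSUMED_SEVERITY_LEVELS

print(total_images)
print(images_per_corruption_assumed == IMAGES_PER_CORRUPTION)
```

So the total is 19 corruptions times 10,000 images each, rather than 5K samples from each of 19 sets.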

For question 2, we do not have a fixed subset for evaluation, because we believe researchers can create subsets based on their specific needs. However, we did try randomly sampling 100 images from each class across the 180 domains, as listed in the paper's Table 8 (if fewer than 100 images are available for a class, we use all of its data). If you want to use a subset, there are two options: you can directly follow the procedure we used for Table 8, or sample data based on the ordinal imbalance ratio.
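
The first option can be sketched roughly as follows. This is only an illustration and not the authors' actual script: it reads the description as "up to 100 images per class within each domain", and the function name `sample_subset`, the `(domain, label, item)` record layout, and the fixed seed are all hypothetical choices.

```python
import random
from collections import defaultdict


def sample_subset(records, per_class=100, seed=0):
    """Pick up to `per_class` items for every (domain, label) pair.

    `records` is an iterable of (domain, label, item) tuples; this layout
    is a hypothetical one chosen for the sketch, not the dataset's format.
    """
    rng = random.Random(seed)  # fixed seed so the subset is reproducible

    # Group items by domain and class label.
    groups = defaultdict(list)
    for domain, label, item in records:
        groups[(domain, label)].append(item)

    subset = []
    for (domain, label), items in sorted(groups.items()):
        if len(items) <= per_class:
            chosen = list(items)  # fewer than the quota: keep everything
        else:
            chosen = rng.sample(items, per_class)  # sample without replacement
        subset.extend((domain, label, item) for item in chosen)
    return subset


# Toy usage: one domain, class 0 has 150 images, class 1 has only 30.
toy = [("domain_a", 0, f"img0_{i}.png") for i in range(150)]
toy += [("domain_a", 1, f"img1_{i}.png") for i in range(30)]
picked = sample_subset(toy)
print(len(picked))  # 100 from class 0 plus all 30 from class 1
```

Under this reading, the subset has at most 180 domains x 10 classes x 100 images = 180,000 images, which is close to the ~190K size asked about.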

Finally, thank you for pointing out the issue. If you have any questions, please feel free to reach out to us at any time. :)