huggingface / datasets

🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
https://huggingface.co/docs/datasets
Apache License 2.0

Add a robustness benchmark dataset for vision #5371

Open sayakpaul opened 1 year ago

sayakpaul commented 1 year ago

Name

ImageNet-C

Paper

Benchmarking Neural Network Robustness to Common Corruptions and Perturbations

Data

https://github.com/hendrycks/robustness

Motivation

Vision models are known to be brittle when they encounter slightly corrupted or perturbed data. How well a model holds up on such inputs is a core part of its robustness.

Researchers use different benchmark datasets to evaluate the robustness aspects of vision models. ImageNet-C is one of them.

Having this dataset in 🤗 Datasets would allow researchers to evaluate and study the robustness aspects of vision models. Since the metric associated with these evaluations is top-1 accuracy, researchers should be able to easily take advantage of the evaluation benchmarks on the Hub and perform comprehensive reporting.
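To make the reporting idea concrete, here is a minimal sketch of computing top-1 accuracy per corruption type and severity level. It assumes predictions have already been collected as `(corruption, severity, true_label, predicted_label)` tuples; the helper name `top1_accuracy_by_group` and the toy records are hypothetical and only for illustration, not part of 🤗 Datasets or the ImageNet-C release.

```python
from collections import defaultdict


def top1_accuracy_by_group(records):
    """Compute top-1 accuracy for each (corruption, severity) pair.

    `records` is an iterable of
    (corruption, severity, true_label, predicted_label) tuples.
    This layout is an assumption made for this sketch.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for corruption, severity, label, pred in records:
        key = (corruption, severity)
        total[key] += 1
        # A prediction counts as correct only if the top-1 class matches.
        correct[key] += int(pred == label)
    return {key: correct[key] / total[key] for key in total}


# Toy, made-up predictions purely for illustration.
records = [
    ("gaussian_noise", 1, 3, 3),
    ("gaussian_noise", 1, 5, 5),
    ("gaussian_noise", 5, 3, 7),
    ("gaussian_noise", 5, 5, 5),
    ("fog", 3, 2, 2),
]

accuracies = top1_accuracy_by_group(records)
for (corruption, severity), acc in sorted(accuracies.items()):
    print(f"{corruption} (severity {severity}): top-1 = {acc:.2f}")
```

Grouping by corruption and severity keeps the per-condition numbers visible, so a drop at high severity is not hidden inside a single averaged score.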

ImageNet-C is a large dataset. Once it has been added, it can serve as a reference, and we can then reach out to the authors of other robustness benchmark datasets in vision, such as ObjectNet, WILDS, Metashift, etc. These datasets cater to different aspects of robustness. For example, ObjectNet is related to assessing how well a model performs under sub-population shifts.

Related thread: https://huggingface.slack.com/archives/C036H4A5U8Z/p1669994598060499

sayakpaul commented 1 year ago

Cc'ing @nazneenrajani @lvwerra @osanseviero