It's a known fact that vision models are brittle when they meet with slightly corrupted and perturbed data. This is also correlated to the robustness aspects of vision models.
Researchers use different benchmark datasets to evaluate the robustness aspects of vision models. ImageNet-C is one of them.
Having this dataset in 🤗 Datasets would allow researchers to evaluate and study the robustness aspects of vision models. Since the metric associated with these evaluations is top-1 accuracy, researchers should be able to easily take advantage of the evaluation benchmarks on the Hub and perform comprehensive reporting.
ImageNet-C is a large dataset. Once it's in, it can act as a reference and we can also reach out to the authors of the other robustness benchmark datasets in vision, such as ObjectNet, WILDS, Metashift, etc. These datasets cater to different aspects. For example, ObjectNet is related to assessing how well a model performs under sub-population shifts.
Name
ImageNet-C
Paper
Benchmarking Neural Network Robustness to Common Corruptions and Perturbations
Data
https://github.com/hendrycks/robustness
Motivation
It's a known fact that vision models are brittle when they meet with slightly corrupted and perturbed data. This is also correlated to the robustness aspects of vision models.
Researchers use different benchmark datasets to evaluate the robustness aspects of vision models. ImageNet-C is one of them.
Having this dataset in 🤗 Datasets would allow researchers to evaluate and study the robustness aspects of vision models. Since the metric associated with these evaluations is top-1 accuracy, researchers should be able to easily take advantage of the evaluation benchmarks on the Hub and perform comprehensive reporting.
ImageNet-C is a large dataset. Once it's in, it can act as a reference and we can also reach out to the authors of the other robustness benchmark datasets in vision, such as ObjectNet, WILDS, Metashift, etc. These datasets cater to different aspects. For example, ObjectNet is related to assessing how well a model performs under sub-population shifts.
Related thread: https://huggingface.slack.com/archives/C036H4A5U8Z/p1669994598060499