elijahcole / single-positive-multi-label

Multi-Label Learning from Single Positive Labels - CVPR 2021
https://arxiv.org/abs/2106.09708
MIT License

Some images miss labels in NUS-WIDE dataset #7

Closed coldmanck closed 2 years ago

coldmanck commented 2 years ago

Hi @elijahcole,

Thanks for your great work. I found that some images in the NUS-WIDE dataset do not come with any label at all, let alone multiple labels. Specifically, 33410 of the 150000 images (22.27%) in the training split and 13412 of the 60260 images (22.26%) in the testing split have no labels. In your experiments, you seem to have treated these images as belonging to none of the classes (their label vectors are all zeros). Am I missing something here?

elijahcole commented 2 years ago

Hi there! I believe that is correct.

The NUS-WIDE dataset does seem to contain a number of images to which none of the 81 concept labels apply. In the original NUS-WIDE, it seems that there are 36340 train images and 23961 test images for which all labels are negative. For this paper we re-crawled the NUS-WIDE dataset following the procedure in Durand et al. (CVPR 2019). Because some images have been deleted by their owners since the release of NUS-WIDE, it makes sense that the numbers you found are a bit lower than those for the original NUS-WIDE.
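
For anyone who wants to check the counts discussed in this thread, here is a minimal sketch of how such a tally could be computed. It assumes the labels are available as a binary matrix of shape (num_images, num_classes) where 1 marks a positive label. The function name `count_unlabeled` and the toy matrix are made up for illustration and are not taken from the repository.

```python
import numpy as np


def count_unlabeled(labels):
    """Return (count, fraction) of rows that contain no positive label.

    `labels` is assumed to be a binary matrix of shape
    (num_images, num_classes); this layout is an assumption,
    not something confirmed by the repository.
    """
    labels = np.asarray(labels)
    # A row is "unlabeled" if none of its entries is positive.
    no_positive = ~(labels > 0).any(axis=1)
    count = int(no_positive.sum())
    return count, count / labels.shape[0]


# Toy example: 4 images, 3 classes; rows 1 and 3 have no positive label.
toy_labels = np.array([
    [0, 1, 0],
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
])
count, fraction = count_unlabeled(toy_labels)
print(count, fraction)  # 2 0.5
```

Applied to the real train and test label matrices, the same function would give the per-split counts and fractions quoted above, provided the matrices follow the assumed layout.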