Tencent / tencent-ml-images

Largest multi-label image database; ResNet-101 model; 80.73% top-1 acc on ImageNet

Tag Augmentation of Images #38

Open tigerzjh opened 5 years ago

tigerzjh commented 5 years ago

Thank you for the great work!

1. The paper says: "Note that each image from ImageNet-11K is annotated by a single tag." Is each image annotated only by its leaf tag?

2. The paper says: "Besides, as some categories from Open Images are similar to or synonyms of above 10,032 categories, we merge these redundant categories into unique categories. If all tags of one image are removed, then this image is also abandoned. Consequently, 6,902,811 training images and 38,739 validation images are remained, covering 1,134 unique categories."

Do the 6,902,811 training images and 38,739 validation images fall into only 1,134 unique categories, meaning that an Open Images image is removed if any of its tags is similar to or a synonym of one of the 10,032 categories above? Or do they fall into 1,134 plus 10,032 unique categories?

Thanks for your reply.

wubaoyuan commented 5 years ago

@xinagqi56

My answer: 1) Not really. Many tags are non-leaf. 2) If any tag of an image from Open Images is similar to or a synonym of one of the above 10,032 categories, the corresponding images from Open Images and ImageNet are merged into the same category rather than removed.

tigerzjh commented 5 years ago

Thanks for your answer.

1. You said many tags are non-leaf. Does that mean one image may belong to multiple classes (both an ancestor and its subclass)?

2. You said that if any tag of an Open Images image is similar to or a synonym of one of the above 10,032 categories, the corresponding images from Open Images and ImageNet are merged into the same category. Does that mean one image may be merged into multiple ImageNet classes, since an Open Images image may have more than one tag?

3. The paper says: "Consequently, 6,902,811 training images and 38,739 validation images are remained, covering 1,134 unique categories." My understanding is that every one of the 6,902,811 training images and 38,739 validation images has only one tag, and that tag is among the 1,134 unique categories. Is that correct?

wubaoyuan commented 5 years ago

@xinagqi56

1) Yes. As shown in our README, the ImageNet database as a whole contains some images that are repeated across different classes. A possible reason is that one image is assigned to both a parent class and its child class.

2) Keep in mind that these are multi-label annotations. If two tags from Open Images and ImageNet are synonyms, we adopt one unique tag ID to represent both tags. You just need to replace the original tags with the unique tag ID.

3) "Covering 1,134 unique categories" means that the candidate tag set consists of 1,134 non-synonymous tags; it does not mean that each image carries a single annotation. We will try to state this more clearly in the updated manuscript.
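
For readers following along, the tag unification described in 2) and 3) can be sketched roughly as below. This is only an illustration under stated assumptions, not the project's actual pipeline: the table SYNONYM_TO_UNIFIED, the tag strings, the image IDs, and the helper names unify_tags and build_annotations are all hypothetical. The sketch shows that synonymous tags collapse to one ID, that an image can still keep several tags (multi-label), and that an image left with no tags is dropped.

```python
from typing import Dict, List, Optional

# Hypothetical synonym table: raw tag -> unified tag ID (illustrative values only).
SYNONYM_TO_UNIFIED: Dict[str, int] = {
    "imagenet:dog": 0,
    "openimages:Dog": 0,       # synonym of the ImageNet tag, so it shares ID 0
    "imagenet:puppy": 1,
    "openimages:Animal": 2,
}


def unify_tags(raw_tags: List[str], table: Dict[str, int]) -> Optional[List[int]]:
    """Replace raw tags with unified IDs; return None if no tag survives."""
    # A set removes duplicates created when two synonyms map to the same ID.
    unified = sorted({table[tag] for tag in raw_tags if tag in table})
    return unified or None


def build_annotations(
    images: Dict[str, List[str]], table: Dict[str, int]
) -> Dict[str, List[int]]:
    """Build multi-label annotations, dropping images whose tags were all removed."""
    annotations: Dict[str, List[int]] = {}
    for image_id, raw_tags in images.items():
        tags = unify_tags(raw_tags, table)
        if tags is not None:
            annotations[image_id] = tags
    return annotations


# Hypothetical input: image ID -> raw tags from either source dataset.
images = {
    "img_a": ["openimages:Dog", "openimages:Animal"],  # keeps two tags
    "img_b": ["imagenet:dog"],                         # same unified ID as "openimages:Dog"
    "img_c": ["openimages:Unlisted"],                  # no surviving tag, so dropped
}

annotations = build_annotations(images, SYNONYM_TO_UNIFIED)
print(annotations)
```

In this toy input, img_a and img_b end up sharing a unified tag even though they come from different sources, and img_a remains multi-label.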

tigerzjh commented 5 years ago

I have read the whole paper. The transfer experiments show that pre-training on Tencent ML-Images alone performs worse than pre-training on ImageNet alone, while Tencent ML-Images plus ImageNet performs better than ImageNet alone. What is your view on why JFT-300M, which is also a multi-label dataset, works well? I want to train an embedding model: is pre-training on our own data and then on ImageNet the better choice?

wubaoyuan commented 5 years ago

@xinagqi56 Transfer performance depends heavily on the distribution difference between the source and target data. If your target dataset is very small, I suggest trying the second checkpoint we provide on GitHub, i.e., the one pre-trained on ML-Images and then fine-tuned on ImageNet.