Hello!
Apologies if this has already been addressed elsewhere, but I noticed that the ImageNet label list in the "Prompt Engineering for ImageNet" notebook (https://github.com/openai/CLIP/blob/main/notebooks/Prompt_Engineering_for_ImageNet.ipynb) contains two duplicated entries.
Specifically, these are "missile" and "sunglasses", each of which appears twice in the provided label list. I've fixed the issue on my end by referencing Anish Athalye's imagenet-simple-labels: one of the "missile" labels should be "projectile" and one of the "sunglasses" labels should be "sunglass".
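For anyone who wants to check a label list for this kind of problem, here is a minimal sketch of how duplicates could be located. The function name `find_duplicate_labels` and the short `sample_labels` list are made up for illustration; to run the check for real, pass in the full 1000-entry class list from the notebook instead.

```python
from collections import defaultdict


def find_duplicate_labels(labels):
    """Return {label: [indices]} for every label that occurs more than once."""
    positions = defaultdict(list)
    for index, label in enumerate(labels):
        positions[label].append(index)
    return {label: idxs for label, idxs in positions.items() if len(idxs) > 1}


# Tiny stand-in list (hypothetical), NOT the real ImageNet class list.
sample_labels = ["missile", "tench", "missile", "sunglasses", "sunglasses"]

duplicates = find_duplicate_labels(sample_labels)
print(duplicates)  # {'missile': [0, 2], 'sunglasses': [3, 4]}
```

The returned indices show which class positions share a name, which makes it easier to compare them against a reference list such as imagenet-simple-labels.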
What I believe is the "correct" label list is here:
I was able to match the ViT-B/32 top-1 accuracy on ImageNet reported in the paper (63.2) using the "wrong" label set provided by this repo. However, fixing the duplicated labels yields a better top-1 of 63.47 (an improvement of 0.27 percentage points). I imagine the numbers for the other models may change too.
Best, George