sarahpratt / CuPL


non-unique class names #2

Closed mikeogezi closed 1 year ago

mikeogezi commented 1 year ago

Thank you for the work, Sarah.

I've noticed that the prompts JSON file uses the class names as keys. The class names are not unique. Specifically, there are two instances of "missile" (rocket and projectile) and two instances of "sunglasses" (sunglass and shades). The current setup gives both classes the exact same prompts and text embeddings.

When we take the argmax at the end, the two duplicate classes have identical scores, so we always pick the earlier one (based on the order of classes). The second one is never predicted, so it gets 0% accuracy.
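To illustrate the tie, here is a minimal sketch with NumPy. It is not the repository's evaluation code: the class list, the 3-dimensional embeddings, and the image vector are all made-up toy values, chosen only so that two classes share one text embedding.

```python
import numpy as np

# Toy class list (hypothetical): indices 1 and 3 share the name "missile".
class_names = ["goldfish", "missile", "sunglasses", "missile"]

# Toy text embeddings keyed by class NAME, mirroring a prompts file keyed by name.
# Real embeddings would come from the CLIP text encoder.
name_to_embedding = {
    "goldfish": np.array([1.0, 0.0, 0.0]),
    "missile": np.array([0.0, 1.0, 0.0]),
    "sunglasses": np.array([0.0, 0.0, 1.0]),
}

# One row per class; duplicate names get identical rows.
text_embeddings = np.stack([name_to_embedding[n] for n in class_names])

# Toy image embedding that is closest to "missile".
image_embedding = np.array([0.1, 0.9, 0.2])

logits = text_embeddings @ image_embedding

# np.argmax returns the first index among tied maxima,
# so index 3 can never win against index 1.
pred = int(np.argmax(logits))
print(pred, logits[1] == logits[3])
```

Because `np.argmax` returns the first occurrence of the maximum, any image of the later duplicate class is assigned to the earlier index.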

We could fix this by using synset ids as keys and adding some context to the prompts to disambiguate the duplicate class names.
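A rough sketch of that idea follows. The helper name `disambiguate`, the tuple layout, and the prompt template are hypothetical and not part of the repository. The hints reuse the wording from this issue, and the synset ids are given for illustration and should be checked against the ImageNet class list before use.

```python
from collections import Counter

# (synset id, class name, hint) records; layout and hints are illustrative assumptions.
classes = [
    ("n01443537", "goldfish", ""),
    ("n03773504", "missile", "rocket"),
    ("n04008634", "missile", "projectile"),
    ("n04355933", "sunglasses", "sunglass"),
    ("n04356056", "sunglasses", "shades"),
]


def disambiguate(records):
    """Map each synset id to a label, adding the hint only for duplicated names."""
    counts = Counter(name for _, name, _ in records)
    labels = {}
    for wnid, name, hint in records:
        labels[wnid] = f"{name} ({hint})" if counts[name] > 1 else name
    return labels


labels = disambiguate(classes)

# Key the prompts by synset id so duplicate names no longer collide.
# The template is a placeholder; the real prompts would be generated per label.
prompts = {wnid: [f"A photo of a {label}."] for wnid, label in labels.items()}

for wnid, label in labels.items():
    print(wnid, label)
```

With unique keys and distinct labels, each of the duplicate classes gets its own prompts and its own text embedding, so the argmax tie no longer arises.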

sarahpratt commented 1 year ago

Hello! Yes, you are correct. For this work, I just took the natural language labels used for ImageNet in the original CLIP work (they can be found here: https://github.com/openai/CLIP/blob/main/notebooks/Prompt_Engineering_for_ImageNet.ipynb).

I agree that distinguishing between the classes with the same names would help improve accuracy, and more generally, using different natural language labels altogether may also help! WordNet provides synonyms for each synset id, so those might work for disambiguating.

I chose not to adjust these labels for the sake of comparison to the baseline, but this is a good idea to improve accuracy in the future!