Closed by linyq2117 1 year ago
CLIP requires a logit scale (100) before the softmax. Have you tried adding a logit scale like this: prob = (prob * 100).softmax(0)
The eval code will be released after acceptance; the major revision has already been submitted.
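To illustrate why the scale matters, here is a minimal sketch, not the authors' eval code. It uses NumPy in place of the PyTorch tensors in the snippet above, and the similarity values and the helper name scaled_softmax are made up for illustration. Cosine similarities lie in a narrow range, so an unscaled softmax comes out nearly uniform, while multiplying by 100 first makes it peaked.

```python
import numpy as np

def scaled_softmax(sims, logit_scale=100.0):
    """Softmax over class similarities after multiplying by a logit scale."""
    logits = logit_scale * np.asarray(sims, dtype=np.float64)
    logits -= logits.max()  # subtract the max for numerical stability
    exp = np.exp(logits)
    return exp / exp.sum()

# Hypothetical cosine similarities between one image and three class prompts.
sims = np.array([0.30, 0.25, 0.20])

unscaled = scaled_softmax(sims, logit_scale=1.0)   # nearly uniform over classes
scaled = scaled_softmax(sims, logit_scale=100.0)   # dominated by the top class

print("without scale:", unscaled)
print("with scale 100:", scaled)
```

With these toy values the unscaled probabilities stay close to one third each, whereas the scaled ones put almost all of the mass on the first class.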
Thanks for your reply! The gap was caused by my own oversight: I had not normalized the features.
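For anyone hitting the same gap, a minimal sketch of the missing normalization step follows. It is an assumption about the setup, not the released eval code: it uses NumPy with random stand-in features (512 dimensions, 5 classes), and the helper name l2_normalize is hypothetical. In PyTorch the same step is x / x.norm(dim=-1, keepdim=True).

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale each feature vector to unit L2 norm along the given axis."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)  # eps guards against division by zero

rng = np.random.default_rng(0)
# Random stand-ins for the image cls-token feature and the class text features.
image_feat = rng.normal(size=(1, 512)) * 10.0
text_feats = rng.normal(size=(5, 512)) * 10.0

# Normalize both sides so that the dot product is a cosine similarity.
image_feat = l2_normalize(image_feat)
text_feats = l2_normalize(text_feats)

sims = (image_feat @ text_feats.T)[0]  # one cosine similarity per class, shape (5,)
print(sims)
```

Once both sides have unit norm, the similarities are bounded in [-1, 1], which is the range the logit scale of 100 is meant to be applied to.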
Thanks for your excellent work.
I failed to reproduce the multi-label recognition results in Table 7. For example, when I use CLIP ViT-B/16 with the softmax function, I only get 35% mAP on NUS-Wide (42.85% in the paper). I use the cls token of the original CLIP without feature surgery. Could you share the details and evaluation code for multi-label recognition?