moein-shariatnia / OpenAI-CLIP

Simple implementation of OpenAI CLIP model in PyTorch.
MIT License

Why is the n*5? #1

Closed. qiaogh97 closed this issue 3 years ago

qiaogh97 commented 3 years ago

Hi @moein-shariatnia! Thank you for your contribution! I really enjoyed your article. There is one thing I don't understand in this line of code: https://github.com/moein-shariatnia/OpenAI-CLIP/blob/8fda94c1f85f956bdadb2e796938356fd79ae336/inference.py#L46 If you want to choose the top 5, why not use torch.topk(dot_similarity.squeeze(0), 5)?

moein-shariatnia commented 3 years ago

Hey @qiao1025566574, I'm glad you liked it! That's a good question! I remember adding a comment to explain the n * 5, but I don't know why it's not there; my fault! The reason is that the dataset has 5 captions for each image, so when computing the similarities, every 5 consecutive entries in dot_similarity point to the same image. To show n different images in the plot, I take the top n * 5 entries and then keep only every fifth one with [::5] in the next line. I admit this is not the optimal way to do it. A better way would be to put only the unique images in the CSV file for visualization purposes. I hope this explains that part.
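To make the explanation above concrete, here is a minimal sketch in plain Python. It is not the code from inference.py: the file names and scores are made up, CAPTIONS_PER_IMAGE and topk_indices are hypothetical names, and topk_indices is only a stand-in for what torch.topk returns as indices. It assumes, as described above, that all 5 rows of an image share the same similarity score.

```python
# Hypothetical toy data: 4 images, each listed once per caption (5 captions each).
CAPTIONS_PER_IMAGE = 5
unique_images = ["dog.jpg", "cat.jpg", "car.jpg", "tree.jpg"]
unique_scores = [0.2, 0.9, 0.5, 0.7]  # made-up similarities to the text query

# Expand to one row per (image, caption) pair, mimicking the captions CSV.
image_filenames = [name for name in unique_images for _ in range(CAPTIONS_PER_IMAGE)]
dot_similarity = [score for score in unique_scores for _ in range(CAPTIONS_PER_IMAGE)]


def topk_indices(scores, k):
    """Indices of the k largest scores, best first (stand-in for torch.topk)."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


n = 3

# Top n directly: the best image fills every slot, because it appears 5 times.
naive = [image_filenames[i] for i in topk_indices(dot_similarity, n)]

# Top n * 5, then every fifth entry: one hit per image, n distinct images.
top = topk_indices(dot_similarity, n * CAPTIONS_PER_IMAGE)
strided = [image_filenames[i] for i in top[::CAPTIONS_PER_IMAGE]]

# Alternative mentioned in the reply: keep one row per image, then take the top n.
first_score = {}
for name, score in zip(image_filenames, dot_similarity):
    first_score.setdefault(name, score)
dedup_names = list(first_score)
dedup_scores = list(first_score.values())
deduped = [dedup_names[i] for i in topk_indices(dedup_scores, n)]

print(naive)    # ['cat.jpg', 'cat.jpg', 'cat.jpg']
print(strided)  # ['cat.jpg', 'tree.jpg', 'car.jpg']
print(deduped)  # ['cat.jpg', 'tree.jpg', 'car.jpg']
```

Note that the [::5] stride only yields distinct images when each image contributes exactly 5 equally scored rows; deduplicating first does not rely on that.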

qiaogh97 commented 3 years ago

Great! I get it now! I'm pleasantly surprised by your quick reply, thanks a lot. I will close this issue.