salesforce / BLIP

PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
BSD 3-Clause "New" or "Revised" License

ITM threshold #46

Open chaochen99 opened 2 years ago

chaochen99 commented 2 years ago

Hi, I enjoyed reading your BLIP paper. It's very exciting.

How did you set the ITM threshold when filtering?

Thanks! Looking forward to your reply.

LiJunnan1992 commented 2 years ago

We simply set it as 0.5, thanks.

chaochen99 commented 2 years ago

In train_retrieval.py, the ITM score is not passed through softmax. Did you consider applying softmax there? And when filtering, is the 0.5 threshold applied to the score without softmax?

Thanks!

LiJunnan1992 commented 2 years ago

When filtering, the 0.5 threshold is applied to the score after softmax.
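
A minimal sketch of such a filter, assuming the ITM head outputs two logits per image-text pair (no-match, match) and that the 0.5 threshold is compared against the softmax probability of the match class. The function names `itm_match_prob` and `keep_pair` are hypothetical and are not taken from the BLIP code:

```python
import math

def itm_match_prob(neg_logit, pos_logit):
    """Softmax probability of the 'match' class for a two-way ITM output."""
    m = max(neg_logit, pos_logit)      # subtract the max for numerical stability
    e_neg = math.exp(neg_logit - m)
    e_pos = math.exp(pos_logit - m)
    return e_pos / (e_neg + e_pos)

def keep_pair(neg_logit, pos_logit, threshold=0.5):
    """Keep an image-text pair only if its match probability exceeds the threshold."""
    return itm_match_prob(neg_logit, pos_logit) > threshold
```

With a two-way softmax, a probability above 0.5 is the same condition as the match logit being larger than the no-match logit, so at this particular threshold the filter reduces to comparing the two logits.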

During retrieval inference, softmax does not affect the ranking of ITM predictions, but it does affect their scale relative to the ITC similarity. If softmax is applied to the ITM scores, the ITC score also needs some additional scaling.
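
The scale effect can be illustrated with a small sketch. It assumes the final retrieval score is the sum of an ITM score and an ITC similarity; the two candidates and all their numbers are made up for illustration, and `match_prob` and `rank` are hypothetical helpers:

```python
import math

def match_prob(neg_logit, pos_logit):
    """Two-way softmax probability of the 'match' class (a sigmoid of the logit gap)."""
    return 1.0 / (1.0 + math.exp(neg_logit - pos_logit))

# Hypothetical candidates: (name, ITC similarity, ITM no-match logit, ITM match logit)
candidates = [("A", 0.30, -2.0, 2.0), ("B", 0.45, -1.5, 1.8)]

def rank(score_fn):
    """Return candidate names sorted from highest to lowest score."""
    return [name for name, *_ in sorted(candidates, key=score_fn, reverse=True)]

# ITM alone: the order is the same with or without softmax in this example.
itm_only_raw = rank(lambda c: c[3])
itm_only_soft = rank(lambda c: match_prob(c[2], c[3]))

# ITM + ITC: softmax squeezes the ITM scores into [0, 1], so the unscaled
# ITC similarity carries more weight and the combined order changes.
combined_raw = rank(lambda c: c[3] + c[1])
combined_soft = rank(lambda c: match_prob(c[2], c[3]) + c[1])

print(itm_only_raw, itm_only_soft, combined_raw, combined_soft)
```

In this example A ranks first on ITM alone either way, and also on raw logit plus ITC, but B overtakes it once the ITM score is a softmax probability. That is why the ITC term would need rescaling if softmax were introduced.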