Difference between code and description in the paper

Hi,

Thanks for open-sourcing your code. I am trying to reproduce the ALBEF results from your paper, but so far without success. While going through the code, I noticed that the ITM logits/probabilities are used differently than the paper describes. The paper states: "If the model score on the original text description is higher than the score on the generated negative samples, we regard it as positive output." In the code, however, only the ITM logit for the "matching" class, z[1], is used. As far as I can tell, the code never compares the scores of the positive and negative texts as the paper describes. Could you please clarify?
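
To make the difference concrete, here is a minimal sketch of the two criteria as I understand them. The function names, the two-way logit layout (index 1 = "match"), and the 0.5 threshold are my own assumptions for illustration. They are not taken from the repository, and the example logits are made up.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def match_score(itm_logits):
    """Probability of the 'match' class, assumed to sit at index 1 (z[1])."""
    return softmax(itm_logits)[1]


def correct_by_comparison(pos_logits, neg_logits):
    """Criterion as described in the paper: the original text must
    score higher than the generated negative text."""
    return match_score(pos_logits) > match_score(neg_logits)


def correct_by_single_score(itm_logits, threshold=0.5):
    """One way a single match score could be used on its own.
    The threshold is hypothetical, chosen only for illustration."""
    return match_score(itm_logits) > threshold


# Made-up ITM logits [no_match, match] for the same image.
pos = [0.2, 1.0]  # original (positive) text
neg = [0.1, 1.5]  # generated negative text

print(correct_by_comparison(pos, neg))   # compares positive against negative
print(correct_by_single_score(pos))      # looks at the positive text only
```

With these made-up numbers the two criteria disagree: the positive text passes the single-score check even though the negative text scores higher, so the reported accuracy could differ depending on which criterion is applied.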

Thanks, Ajinkya

om-ai-lab / VL-CheckList