yxoh closed this issue 2 years ago
Yes, using ITC is faster but less accurate.
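To make the trade-off concrete, here is a minimal sketch (not the code from this repo) of ranking candidates by a cheap feature similarity, with an optional second stage that re-scores only the top k candidates using a more expensive scorer. The names `cosine`, `rank_candidates`, and `rerank_score` are hypothetical and chosen for illustration; `rerank_score` stands in for something like an ITM head, and plain Python lists stand in for the embeddings.

```python
import math
from typing import Callable, List, Optional, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(
    query: Sequence[float],
    candidates: Sequence[Sequence[float]],
    k: int,
    rerank_score: Optional[Callable[[int], float]] = None,
) -> List[int]:
    """Return candidate indices ordered from best to worst.

    Stage 1 scores every candidate with the cheap similarity.
    Stage 2 (optional) re-orders only the top-k with `rerank_score`,
    a hypothetical, more expensive scorer taking a candidate index.
    """
    sims = [cosine(query, c) for c in candidates]
    order = sorted(range(len(candidates)), key=lambda i: sims[i], reverse=True)
    if rerank_score is None:
        # Similarity-only ranking: fast, one score per candidate.
        return order
    head, tail = order[:k], order[k:]
    head = sorted(head, key=rerank_score, reverse=True)
    return head + tail


if __name__ == "__main__":
    query = [1.0, 0.0]
    candidates = [[0.0, 1.0], [1.0, 0.1], [1.0, 1.0]]
    print(rank_candidates(query, candidates, k=2))
```

With `rerank_score=None` the expensive scorer is never called, which is where the speed-up comes from; the cost is that the final order depends only on the similarity score.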
Regarding "but less accurate": are there experiments that show how much the accuracy drops?
Ah, I saw the experiment in the paper:)
I downloaded the data JSON files and the pretrained ALBEF model (4M) from this repo and ran the image-text retrieval task. The zero-shot results on the Flickr30K dataset are TR (R@1: 84.9, R@5: 97.2, R@10: 99.0); IR (R@1: 68.18, R@5: 88.58, R@10: 93.02). But in the paper, the results are TR (R@1: 90.5, R@5: 98.8, R@10: 99.7); IR (R@1: 76.8, R@5: 93.7, R@10: 96.7). How can I reproduce the results reported in the paper?
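For readers comparing numbers like these, the following is a small sketch of how R@K is commonly computed from ranked retrieval results. It is an illustration, not the evaluation script from this repo; the function name `recall_at_k` and the input layout (one ranked index list and one set of correct indices per query) are assumptions.

```python
from typing import Dict, List, Sequence, Set


def recall_at_k(
    rankings: Sequence[Sequence[int]],
    relevant: Sequence[Set[int]],
    ks: Sequence[int] = (1, 5, 10),
) -> Dict[int, float]:
    """Percentage of queries with at least one correct item in the top k.

    rankings[q] lists candidate indices for query q, best first.
    relevant[q] is the set of correct candidate indices for query q.
    """
    results = {}
    for k in ks:
        hits = sum(
            1
            for ranked, rel in zip(rankings, relevant)
            if any(idx in rel for idx in ranked[:k])
        )
        results[k] = 100.0 * hits / len(rankings)
    return results


if __name__ == "__main__":
    rankings = [[2, 0, 1], [1, 2, 0]]
    relevant = [{2}, {0}]
    print(recall_at_k(rankings, relevant, ks=(1, 2, 3)))
```

Using a set per query covers the text-retrieval case, where one image can have several correct captions.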
The Flickr30K zero-shot results in the paper are obtained using the COCO-finetuned model.
That helps. Thanks :)
I saw that the original setting uses the ITM score s_itm for ranking, but it requires more computation. Is it OK to use only the feature similarity score s_itc for ranking during inference?