Open liuhl-source opened 3 years ago
I have the same problem. I can reproduce the image-text retrieval performance with "train-itm-flickr-base-16gpu-hn.json". But with "train-itm-coco-base-16gpu-hn.json", the results are worse than those reported in your paper. These results were produced on 8 V100 GPUs.
Thanks for your contribution! I ran your code and cannot reproduce the performance reported in your paper. Here are my results in two different settings:
Finetuning image-text retrieval with "train-itm-flickr-base-8gpu.json"
Finetuning image-text retrieval with "train-itm-flickr-base-16gpu-hn.json"
There is a gap between these results and those in your paper. Are there any differences between the experiments in the paper and the released code, for example in the number of training steps or the learning rate? Thanks!