ChenRocks / UNITER

Research code for ECCV 2020 paper "UNITER: UNiversal Image-TExt Representation Learning"
https://arxiv.org/abs/1909.11740

Image-text retrieval results can't be reproduced #44

Open liuhl-source opened 3 years ago

liuhl-source commented 3 years ago

Thanks for your contribution! I ran your code but could not reproduce the performance reported in your paper. Here are my results in two different settings. Fine-tuning image-text retrieval with "train-itm-flickr-base-8gpu.json":

[image]

Fine-tuning image-text retrieval with "train-itm-flickr-base-16gpu-hn.json":

[image]

There is a gap between these results and those in your paper. Are there any differences between the experiments in the paper and the released code, for example in the number of training steps or the learning rate? Thanks!

RenShuhuai-Andy commented 3 years ago

I have the same problem. I can reproduce the image-text retrieval performance with "train-itm-flickr-base-16gpu-hn.json". But for "train-itm-coco-base-16gpu-hn.json", the result is [image], which is worse than the results in your paper. The above results were produced on 8 V100 GPUs.