layumi / Image-Text-Embedding

TOMM2020 Dual-Path Convolutional Image-Text Embedding :feet: https://arxiv.org/abs/1711.05535
MIT License

CUHK results drop compared to reported numbers. #10

Open dinggd opened 5 years ago

dinggd commented 5 years ago

txt-image rank-1:0.384178 mAP:0.354052 Medr:3.000000
txt-image rank-5:0.610136 rank-10:0.703866

This is about 5-6% lower than the reported numbers. I made no changes to the parameters. Any ideas what could be causing the performance drop?

Trained on Ubuntu 16.04 LTS with MATLAB 2015b, cuDNN 5.0, and a Titan Xp. The data was preprocessed on another machine with a newer MATLAB version, for jsondecode support.
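For readers comparing the numbers above, here is a minimal sketch of how rank-k and Medr are commonly computed in retrieval evaluation. It is written in Python rather than the repository's MATLAB and assumes the usual definitions: rank-k is the fraction of queries whose first correct match appears in the top k results, and Medr is the median of those first-match ranks. The function name `retrieval_metrics` and the input format are hypothetical, mAP is left out, and the repository's own evaluation script may differ in detail.

```python
from statistics import median


def retrieval_metrics(first_hit_ranks, ks=(1, 5, 10)):
    """Summarize retrieval quality from 1-based ranks of the first correct match.

    first_hit_ranks: one integer per text query, giving the position of the
    first correct image in that query's ranked gallery (1 = top result).
    This input format is an assumption made for illustration.
    """
    if not first_hit_ranks:
        raise ValueError("need at least one query")
    n = len(first_hit_ranks)
    # rank-k: share of queries with a correct match within the top k results
    metrics = {f"rank-{k}": sum(r <= k for r in first_hit_ranks) / n for k in ks}
    # Medr: median rank of the first correct match (lower is better)
    metrics["Medr"] = median(first_hit_ranks)
    return metrics


# Hypothetical ranks for five queries, for illustration only.
example = retrieval_metrics([1, 3, 2, 12, 7])
print(example)
```

Under these definitions, a lower rank-1 together with a higher Medr, as in the results above, both point in the same direction: correct matches are landing further down the ranked list.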

layumi commented 5 years ago

Hi @gddingcs, thank you for your interest in our paper. What results do you get in Stage I?

dinggd commented 5 years ago

@layumi It seems Stage I also yielded lower performance. This is the result at epoch 180; is that the intended epoch?

txt-image rank-1:0.138077 mAP:0.112866 Medr:21.000000
txt-image rank-5:0.302632 rank-10:0.395062

layumi commented 5 years ago

Yes, it is the 180th epoch. Have you changed any code or encountered any errors?

dinggd commented 5 years ago

The only problem I may have encountered is in the preprocessing step. As instructed in the README, resize_image is run separately, which leaves imdb.rgbMean empty, so I revised that accordingly. Other than that, I did not change any code, and training ran smoothly without any errors.
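As an illustration of the kind of value that was missing, here is a minimal sketch of a per-channel RGB mean over a set of images. It is written in Python rather than the repository's MATLAB, and the nested-list image layout and the function name `rgb_mean` are hypothetical. How the repository itself fills imdb.rgbMean, and exactly what was revised here, are not shown in this thread, so treat this only as a sketch of the general idea.

```python
def rgb_mean(images):
    """Return the per-channel mean [R, G, B] over all pixels of all images.

    images: iterable of images, each a list of rows, each row a list of
    (r, g, b) tuples. This layout is a hypothetical stand-in for real
    image arrays.
    """
    totals = [0.0, 0.0, 0.0]
    count = 0
    for image in images:
        for row in image:
            for pixel in row:
                for c in range(3):
                    totals[c] += pixel[c]
                count += 1
    if count == 0:
        # An empty mean is exactly the failure mode to avoid, so fail loudly.
        raise ValueError("no pixels to average")
    return [t / count for t in totals]


# Two tiny made-up 1x2 images, for illustration only.
tiny = [
    [[(0, 0, 0), (255, 255, 255)]],
    [[(100, 50, 0), (100, 50, 200)]],
]
mean = rgb_mean(tiny)
print(mean)
```

If the mean subtracted during training differs from the one the released settings assume, the inputs are shifted, which is one place a reproduction could diverge.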

af00731 commented 5 years ago

Hi @gddingcs, did you manage to get results similar to those in the paper?

layumi commented 5 years ago

Hi @gddingcs, @af00731,

I recently reviewed the code. Have you tried changing the value at https://github.com/layumi/Image-Text-Embedding/blob/master/train_cuhk_Rankloss_shift.m#L31 to 1:1:1?
1:1:1 may work well.