Question about the scores?

Stephen-Adams commented 6 years ago

I ran your code but got a higher score, and I suspect there may be a mistake in my settings. Could you help me? Thank you.

For example, with vgg19 + s2vt without attention, my best result was:

"CIDEr": 0.381709195850067, "Bleu_4": 0.35092030557193526, "Bleu_3": 0.46800626456106637, "Bleu_2": 0.6047642387263332, "Bleu_1": 0.7574938986755618, "ROUGE_L": 0.5712265574740849, "METEOR": 0.25508078041867904

But I didn't actually change anything important in your code. I split the train_dataset downloaded from the README into 6513/497/2990 for train/val/test. The training loss is:

model_0, loss: 57.772758
model_10, loss: 44.913509
model_20, loss: 40.874763
model_30, loss: 40.119427
model_40, loss: 37.268291
model_50, loss: 33.424942
model_60, loss: 35.766853
model_70, loss: 34.876366
model_80, loss: 31.450918
model_90, loss: 29.820242
model_100, loss: 29.936274
model_110, loss: 30.059401
model_120, loss: 30.751385
model_130, loss: 28.711311
model_140, loss: 29.971272
model_150, loss: 30.382835
model_160, loss: 28.844414
model_170, loss: 26.373568
model_180, loss: 28.996819
model_190, loss: 27.722120
model_200, loss: 28.414360
model_210, loss: 25.155075
model_220, loss: 27.731709
model_230, loss: 28.479822
model_240, loss: 26.850664
model_250, loss: 26.169445
model_260, loss: 27.791225
model_270, loss: 25.879797
model_280, loss: 24.860294
model_290, loss: 24.067417
model_300, loss: 23.089293
model_310, loss: 24.369297
model_320, loss: 24.594177
model_330, loss: 24.342461
model_340, loss: 24.752075
model_350, loss: 25.322969
model_360, loss: 25.452364
model_370, loss: 22.378075
model_380, loss: 24.766953
model_390, loss: 22.536497
model_400, loss: 21.342590

I only trained the model for 400 epochs, because I found that the model around epoch 100 performs better. With "model_100" I got the best score shown above. I am new to this and don't know what is wrong. I hope you can help.
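For anyone trying to reproduce the 6513/497/2990 split mentioned above, here is a minimal sketch of one way to do it. It assumes the videos are split contiguously by index and that there are 10,000 in total (the sum of the three sizes). The function name `split_indices` is hypothetical and is not taken from the repository.

```python
def split_indices(n_total=10000, n_train=6513, n_val=497):
    """Split indices 0..n_total-1 into contiguous train/val/test ranges.

    Assumption: the split is by position, not random. Whatever is left
    after train and val goes to test.
    """
    if n_train + n_val > n_total:
        raise ValueError("train + val sizes exceed the total")
    train = range(0, n_train)
    val = range(n_train, n_train + n_val)
    test = range(n_train + n_val, n_total)
    return train, val, test


train_idx, val_idx, test_idx = split_indices()
print(len(train_idx), len(val_idx), len(test_idx))
```

If the scores being compared against were produced with a different split, the numbers are not directly comparable, so it may be worth checking this first.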

xiadingZ commented 6 years ago

This is a normal result

Stephen-Adams commented 6 years ago

> This is a normal result

Really? But I have found papers that compare their results to s2vt, and the s2vt scores they report are lower than the ones I got with your code. For example, the scores shown in one paper are roughly "CIDEr": 0.351, "Bleu_4": 0.326, "ROUGE_L": 0.561, "METEOR": 0.255.
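To put a number on the gap, here is a small sketch that computes the relative difference between the scores from this run and the example paper scores quoted above. The variable and function names are hypothetical. The paper values are the approximate ones from this comment, not from any specific publication.

```python
# Scores from the run reported in this thread (model_100).
mine = {
    "CIDEr": 0.381709195850067,
    "Bleu_4": 0.35092030557193526,
    "ROUGE_L": 0.5712265574740849,
    "METEOR": 0.25508078041867904,
}

# Example s2vt scores quoted from a paper (approximate values).
paper = {"CIDEr": 0.351, "Bleu_4": 0.326, "ROUGE_L": 0.561, "METEOR": 0.255}


def relative_gap(ours, ref):
    """Return (ours - ref) / ref for every metric present in ref."""
    return {name: (ours[name] - ref[name]) / ref[name] for name in ref}


gaps = relative_gap(mine, paper)
for name, gap in gaps.items():
    print(f"{name}: {gap:+.1%}")
```

With these numbers the gap is largest for CIDEr and Bleu_4 (roughly 8 to 9 percent relative) and under 2 percent for ROUGE_L and METEOR. Differences of that size could plausibly come from the data split, features, or evaluation setup, but that is a guess and has not been verified here.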

Stephen-Adams commented 6 years ago

> This is a normal result

CIDEr and Bleu_4 are clearly lower in those papers. 0.0

xiadingZ / video-caption.pytorch
