Closed by HXYNODE 3 years ago
Sorry, I cannot reproduce your errors. You can check the tensors' sizes step by step (5200 is a strange number). Training of the model is not very stable, so the final MSVD result in the paper is the average of three models' results.
Thanks for your tips. I tried to reproduce the project again on another machine, but the same error I met before is reported. I then debugged evaluate.py as you suggested, and the strange number 5200 does occur (see the screenshots I uploaded). The errors may be caused by the size of the parameters inside the bi_lstm, so could you share a screenshot like the second picture I uploaded, taken when you run evaluate.py in debug mode? This has confused me a lot and I really want to find the reason. I still think the core problem is that the network structure is incompatible with the MSVD checkpoint files. Thank you again for your kind and generous help!
Make sure you didn't change the size parameter in utils.opt.py, and run with the following command:
python evaluate.py --dataset=msvd --model=RMN --result_dir=results/msvd_model --use_loc --use_rel --use_func --hidden_size=512 --att_size=1024 --test_batch_size=2 --beam_size=2 --eval_metric=CIDEr
And I tried to output the size as you did, but I didn't find anything wrong:
Note that the hidden size for msvd is 512 as mentioned in the paper.
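For anyone hitting the same mismatch, the numbers in the reported error can be checked by hand. This is only a sketch, assuming encoder.bi_lstm1 is a standard torch.nn.LSTM (the model code is not shown in this thread). In that case the first dimension of weight_ih_l0 is 4 * hidden_size, because the four gates are stacked along dim 0. The function name below is hypothetical.

```python
def lstm_hidden_size(weight_ih_rows: int) -> int:
    """Recover hidden_size from the row count of an LSTM weight_ih matrix.

    Assumes the torch.nn.LSTM layout, where the input, forget, cell and
    output gate weights are stacked along dim 0 (4 * hidden_size rows).
    """
    if weight_ih_rows % 4 != 0:
        raise ValueError(f"{weight_ih_rows} rows is not a multiple of 4")
    return weight_ih_rows // 4


# Row counts taken from the error message reported in this thread.
checkpoint_rows = 2048  # shape stored in msvd_best_cider.pth
model_rows = 5200       # shape of the model built by the failing run

print(lstm_hidden_size(checkpoint_rows))  # 512
print(lstm_hidden_size(model_rows))       # 1300
```

Under that assumption the checkpoint agrees with hidden size 512, while the failing run built its model with a larger hidden size, so the checkpoint cannot be loaded into it.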
Oops... I got it wrong and you found it! The hidden size parameter in my run command, 1000 for MSR-VTT, was not replaced by 512 for MSVD. You're so cool & nice. Thanks a lot for your timely help and reply! : )
Awesome work! When I try to reproduce the results you report in this repository (i.e. a CIDEr score of 97.8 on the MSVD dataset), errors indicating a size mismatch for the whole CapModel occur when running evaluate.py with your pretrained file results/msvd_model/msvd_best_cider.pth, e.g. RuntimeError: Error(s) in loading state_dict for CapModel: size mismatch for encoder.bi_lstm1.weight_ih_l0: copying a param with shape torch.Size([2048, 1000]) from checkpoint, the shape in current model is torch.Size([5200, 1000]). size mismatch …… size mismatch …… It seems like you have modified the model without updating msvd_best_cider.pth. If so, please let me know; I would appreciate it if you could provide the new version of the .pth file so that I can reproduce the results you report in this repository. By the way, why were the final high results not published in the paper? Thanks!
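A quick way to see every mismatch at once, instead of reading the loader error line by line, is to compare parameter shapes before loading. The sketch below is plain Python over name-to-shape dictionaries; find_shape_mismatches is a hypothetical helper, not something from the repository. With PyTorch, such a dictionary could be built as {k: tuple(v.shape) for k, v in state_dict.items()}, assuming the .pth file holds a plain state dict (not confirmed here). The only entry used is the one quoted in the error, under PyTorch's parameter name weight_ih_l0.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    checkpoint_shapes: Dict[str, Shape],
    model_shapes: Dict[str, Shape],
) -> List[str]:
    """List parameters whose shapes differ or that exist on one side only."""
    problems = []
    for name in sorted(set(checkpoint_shapes) | set(model_shapes)):
        ckpt = checkpoint_shapes.get(name)
        model = model_shapes.get(name)
        if ckpt is None:
            problems.append(f"{name}: missing from checkpoint")
        elif model is None:
            problems.append(f"{name}: unexpected in checkpoint")
        elif ckpt != model:
            problems.append(f"{name}: checkpoint {ckpt} vs model {model}")
    return problems


# Shapes copied from the error message above.
checkpoint_shapes = {"encoder.bi_lstm1.weight_ih_l0": (2048, 1000)}
model_shapes = {"encoder.bi_lstm1.weight_ih_l0": (5200, 1000)}

for line in find_shape_mismatches(checkpoint_shapes, model_shapes):
    print(line)
```

If every reported parameter differs only in a dimension tied to the hidden size, the likely cause is a command-line setting rather than a changed architecture.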