tgc1997 / RMN

IJCAI2020: Learning to Discretely Compose Reasoning Module Networks for Video Captioning

Size mismatch error when using the pretrained model you provide. #15

Closed: HXYNODE closed this issue 3 years ago

HXYNODE commented 3 years ago

Awesome work! When I tried to reproduce the results reported in this repository (i.e. a CIDEr score of 97.8 on the MSVD dataset), running evaluate.py with your pretrained file results/msvd_model/msvd_best_cider.pth produced size mismatch errors across the whole CapModel, e.g.:

RuntimeError: Error(s) in loading state_dict for CapModel: size mismatch for encoder.bi_lstm1.weight_ih_l0: copying a param with shape torch.Size([2048, 1000]) from checkpoint, the shape in current model is torch.Size([5200, 1000]). size mismatch …… size mismatch ……

It seems the model was modified without updating msvd_best_cider.pth. If so, please let me know; I would appreciate it if you could provide the new .pth file so that I can reproduce the reported results. By the way, why was this final high result not published in the paper? Thanks!

tgc1997 commented 3 years ago

Sorry, I cannot reproduce your error. You can check the tensors' sizes step by step (5200 is a strange number). Training of the model is not very stable, so the final MSVD result in the paper is an average of three models' results.
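One way to do the step-by-step size check suggested above is to compare parameter shapes by name before calling load_state_dict. The following is only a minimal sketch: the helper name find_shape_mismatches and the parameter name decoder.bias are hypothetical, and the shapes are hard-coded from the error message in this thread so the snippet runs without PyTorch.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    checkpoint_shapes: Dict[str, Shape],
    model_shapes: Dict[str, Shape],
) -> List[str]:
    """Return one readable line per shared parameter whose shape differs."""
    report = []
    # Only names present on both sides can produce a size mismatch.
    for name in sorted(checkpoint_shapes.keys() & model_shapes.keys()):
        ckpt, cur = checkpoint_shapes[name], model_shapes[name]
        if ckpt != cur:
            report.append(f"{name}: checkpoint {ckpt} vs model {cur}")
    return report


# With PyTorch, the two dicts could be built roughly like this (not run here,
# and assuming the .pth file stores a plain state dict):
#   ckpt_shapes = {k: tuple(v.shape) for k, v in torch.load(path, map_location="cpu").items()}
#   model_shapes = {k: tuple(v.shape) for k, v in model.state_dict().items()}

# Shapes taken from the reported error; "decoder.bias" is a made-up entry.
example = find_shape_mismatches(
    {"encoder.bi_lstm1.weight_ih_l0": (2048, 1000), "decoder.bias": (512,)},
    {"encoder.bi_lstm1.weight_ih_l0": (5200, 1000), "decoder.bias": (512,)},
)
print("\n".join(example))
```

Listing every mismatched parameter at once makes it easier to see whether a single hyperparameter explains all of them.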

HXYNODE commented 3 years ago

Thanks for your tips. I tried to reproduce the project again on another machine, but the same error was reported. The error I met before:

[Screenshot: 2021-04-13 13-37-40]

I attempted to debug evaluate.py according to your suggestions. The strange number 5200 occurs as follows:

[Screenshot: 2021-04-13 13-33-43]

The errors may be caused by the parameters' sizes inside the bi_lstm, so could you share a screenshot like the second one I uploaded, taken when you run evaluate.py in debug mode? It confuses me a lot and I do want to find out the reason. I would argue that the core problem is still that the network structure is incompatible with the MSVD checkpoint file. Thank you again for your kind and generous help!

tgc1997 commented 3 years ago

Make sure you didn't change the size parameters in utils.opt.py, and run the following command:

python evaluate.py --dataset=msvd --model=RMN --result_dir=results/msvd_model --use_loc --use_rel --use_func --hidden_size=512 --att_size=1024 --test_batch_size=2 --beam_size=2 --eval_metric=CIDEr

I also tried printing the sizes as you did, but I didn't find anything wrong: [Image: weight_size]

tgc1997 commented 3 years ago

Note that the hidden size for msvd is 512 as mentioned in the paper.
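As a rough illustration of why the hidden size matters here: in PyTorch's nn.LSTM, weight_ih_l0 has shape (4 * hidden_size, input_size), because the four gates are stacked along the first dimension. The sketch below only does that arithmetic; the function names are hypothetical, and it assumes bi_lstm1 is a standard nn.LSTM layer, which this thread does not confirm.

```python
from typing import Tuple


def lstm_weight_ih_shape(input_size: int, hidden_size: int) -> Tuple[int, int]:
    """Shape of weight_ih_l0 for a standard PyTorch nn.LSTM layer."""
    # Input, forget, cell, and output gate weights are stacked on dim 0.
    return (4 * hidden_size, input_size)


def hidden_size_from_rows(rows: int) -> int:
    """Recover the hidden size from the first dimension of weight_ih_l0."""
    if rows % 4 != 0:
        raise ValueError(f"{rows} is not divisible by 4, so it is not an LSTM gate stack")
    return rows // 4


# The checkpoint shape in the reported error is [2048, 1000].
print(lstm_weight_ih_shape(input_size=1000, hidden_size=512))
print(hidden_size_from_rows(2048))
```

Under that assumption, the 2048 rows in the checkpoint are consistent with a hidden size of 512.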

HXYNODE commented 3 years ago

Oops... I got it wrong and you found it! The hidden size parameter in my run command, 1000 for MSR-VTT, had not been replaced by 512 for MSVD. You're so cool & nice. Thanks a lot for your timely help and reply! : )
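For anyone hitting the same problem, a small guard like the one below could catch this kind of mix-up before the checkpoint is loaded. It is only a sketch and not part of the repository: the table EXPECTED_HIDDEN_SIZE and the function check_hidden_size are hypothetical, and the values 512 (msvd) and 1000 (msr-vtt) are simply the ones mentioned in this thread.

```python
# Hidden sizes as stated in this thread; the dataset keys are illustrative.
EXPECTED_HIDDEN_SIZE = {"msvd": 512, "msr-vtt": 1000}


def check_hidden_size(dataset: str, hidden_size: int) -> None:
    """Raise early if hidden_size does not match the value used for the dataset."""
    expected = EXPECTED_HIDDEN_SIZE.get(dataset)
    if expected is not None and hidden_size != expected:
        raise ValueError(
            f"--hidden_size={hidden_size} does not match the value {expected} "
            f"used for dataset '{dataset}'; the pretrained checkpoint will not load"
        )


# Passes silently: this is the combination from the maintainer's command.
check_hidden_size("msvd", 512)

# The combination that caused this issue would be rejected.
try:
    check_hidden_size("msvd", 1000)
except ValueError as err:
    print(err)
```

Failing early with the flag name in the message points at the command line rather than at the checkpoint file.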