yurayli / image-caption-pytorch

image captioning with the Flickr8k dataset
MIT License

Tried many times to figure out the error but could not. Can you please give me some idea on this? #1

Open DineshShrestha opened 5 years ago

DineshShrestha commented 5 years ago

```
RuntimeError: Error(s) in loading state_dict for CaptionModel_B:
	size mismatch for rnn.embed.weight: copying a param with shape torch.Size([9080, 50]) from checkpoint, the shape in current model is torch.Size([8990, 50]).
	size mismatch for rnn.linear.weight: copying a param with shape torch.Size([9080, 160]) from checkpoint, the shape in current model is torch.Size([8990, 160]).
	size mismatch for rnn.linear.bias: copying a param with shape torch.Size([9080]) from checkpoint, the shape in current model is torch.Size([8990]).
```
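(For context: the vocabulary size a checkpoint expects can be read straight off its saved parameter shapes. A minimal sketch, assuming the checkpoint filename; the key names come from the error above:)

```python
import torch

# Load the checkpoint on CPU; the filename is a placeholder for
# whichever .pth file the repo saves its weights to.
state_dict = torch.load('caption_model_b.pth', map_location='cpu')

# The first dimension of the embedding weight is the vocabulary size
# the checkpoint was trained with.
vocab_size, embed_dim = state_dict['rnn.embed.weight'].shape
print(vocab_size, embed_dim)  # 9080 50 for the pretrained weights
```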

yurayli commented 5 years ago

Hi, it seems your vocabulary size does not match the pretrained model's. The vocab_size is 9080 for the pretrained model. You can first run the ipynb file to check how the vocab_size is obtained.
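(For readers: the vocab_size typically falls out of tokenizing the training captions and counting unique tokens plus special tokens. A minimal sketch of that idea; the special tokens, threshold, and function name are illustrative, not the repo's exact code:)

```python
from collections import Counter

def build_vocab(captions, min_freq=1):
    # Count token frequencies over all training captions.
    counter = Counter(tok for cap in captions for tok in cap.lower().split())
    # Keep tokens above the frequency threshold, plus special tokens.
    words = sorted(w for w, c in counter.items() if c >= min_freq)
    idx2word = ['<pad>', '<start>', '<end>', '<unk>'] + words
    return {w: i for i, w in enumerate(idx2word)}

word2idx = build_vocab(['a dog runs', 'a cat sits on a mat'])
vocab_size = len(word2idx)  # a different caption file or threshold
                            # yields a different size (e.g. 8990 vs 9080)
```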

DineshShrestha commented 5 years ago

[screenshot: size mismatch error]

DineshShrestha commented 5 years ago

How can I change the vocab size so that it matches the pretrained model's?

yurayli commented 5 years ago

If you directly set vocab_size to that of the pretrained model, you can load it. But the tokenization of the corpus may differ, so the predicted results may have problems. Your vocab size (8990) may differ because your data source is different. You can try the Flickr8k dataset on Kaggle https://www.kaggle.com/shadabhussain/flickr8k, or here https://github.com/jbrownlee/Datasets/releases.
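(If you go that route, a minimal sketch of loading with the pretrained size. The embedding dim 50 and hidden size 160 are read off the shapes in the error message; the constructor argument names, the import, and the filename are assumptions, not necessarily the repo's exact API:)

```python
import torch
# Assumed import; CaptionModel_B is defined in this repo's model code.
from model import CaptionModel_B

# Build the model with the checkpoint's vocab_size (9080) rather than
# the locally computed one (8990), so the parameter shapes match.
model = CaptionModel_B(vocab_size=9080, embed_dim=50, hidden_size=160)
model.load_state_dict(torch.load('caption_model_b.pth', map_location='cpu'))
```

Even when the weights load this way, the generated captions are only meaningful if the token-to-index mapping matches the one used during training, which is why using the same dataset source matters.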

DineshShrestha commented 5 years ago

Thank you for the suggestion. But I have another problem where I'm stuck now.

[screenshot: error]