When I tried to reproduce the results by following the instructions in the README, training completed successfully but generation failed. I ran the following commands, where trained_weights is the directory in which the trained models are saved.
Then I got the error shown below:
Traceback (most recent call last):
  File "generator.py", line 71, in <module>
    m.load_state_dict(cpt)
  File "/hb/software/apps/python/gnu-3.6.5GPU/lib/python3.6/site-packages/torch/nn/modules/module.py", line 719, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for model:
	size mismatch for emb.weight: copying a param of torch.Size([1914, 500]) from checkpoint, where the shape is torch.Size([11738, 500]) in current model.
	size mismatch for out.weight: copying a param of torch.Size([1914, 1000]) from checkpoint, where the shape is torch.Size([11738, 1000]) in current model.
	size mismatch for out.bias: copying a param of torch.Size([1914]) from checkpoint, where the shape is torch.Size([11738]) in current model.
	size mismatch for le.seqenc.lemb.weight: copying a param of torch.Size([6173, 500]) from checkpoint, where the shape is torch.Size([53343, 500]) in current model.
It seems that the vocabulary size used at generation time does not match the one used during training. I believe this is the same issue reported here, but I couldn't work out the exact solution from those comments. Could you please provide the exact command-line arguments needed to fix this? Thanks!
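For reference, here is a small sketch of how the mismatching parameters can be listed to confirm it is only the vocabulary-dependent layers that disagree; the helper function, the `m` variable (the model built in generator.py), and the checkpoint filename are placeholders of mine, not anything from the repo:

```python
import torch

def report_shape_mismatches(checkpoint_state, model_state):
    """Print every parameter whose shape differs between checkpoint and current model."""
    for name, ckpt_tensor in checkpoint_state.items():
        cur = model_state.get(name)
        if cur is not None and cur.shape != ckpt_tensor.shape:
            print(f"{name}: checkpoint {tuple(ckpt_tensor.shape)} "
                  f"vs current model {tuple(cur.shape)}")

# Hypothetical usage, assuming `m` is the model constructed in generator.py and
# the checkpoint path below stands in for the file saved under trained_weights:
# cpt = torch.load("trained_weights/checkpoint.pt", map_location="cpu")
# report_shape_mismatches(cpt, m.state_dict())
```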