Closed darsh10 closed 4 years ago
Can you please include your code?
fairseq-preprocess --source-lang es --target-lang en --trainpref $TEXT/train --validpref $TEXT/valid --testpref $TEXT/test --destdir data-bin/blah.es-en
fairseq-train data-bin/blah.es-en/ --lr 0.25 --clip-norm 0.1 --dropout 0.2 --max-tokens 4000 --arch fconv_iwslt_de_en --save-dir checkpoints/fconv --max-source-positions 44 --max-target-positions 44 --skip-invalid-size-inputs-valid-test --max-sentences 24 --max-sentences-valid 24 --eval-bleu
fairseq-generate data-bin/blah.es-en --path checkpoints/fconv/checkpoint_best.pt --batch-size 128 --beam 5 --skip-invalid-size-inputs-valid-test
Seems like a duplicate of #1903
Well, that is unclear, since the reverse direction (en-es) works perfectly with identical commands, apart from the obvious changes.
This seems to be caused by a mismatch between the state_dict saved with the pre-trained FConv model and the state_dict of the model initialized at inference time.
In fairseq/fairseq/modules/linearized_convolution.py, _linearized_weight is initialized to None, so it is not included in the state_dict of a freshly built model (see task.build_model(...) in checkpoint_utils.py). That probably causes the "lost key error" when loading from a saved checkpoint.
A simple fix would be to pass strict=False when calling load_state_dict, but are there any better solutions?
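One alternative to strict=False is to drop the cached entries from the checkpoint's state dict before loading, so that every other key is still checked strictly. Below is a minimal sketch on plain dictionaries. It assumes the only offending keys are those ending in _linearized_weight, as in the error reported in this issue. strip_cached_keys is a hypothetical helper, not a fairseq function, and this is not necessarily what the linked commit does.

```python
def strip_cached_keys(state_dict, suffix="._linearized_weight"):
    """Return a copy of state_dict without the keys that end in suffix."""
    return {k: v for k, v in state_dict.items() if not k.endswith(suffix)}


# Stand-in for a loaded checkpoint state dict. The key names are illustrative
# and the string values are placeholders for tensors.
checkpoint_state = {
    "decoder.convolutions.0.weight": "tensor",
    "decoder.convolutions.0.bias": "tensor",
    "decoder.convolutions.0._linearized_weight": "tensor",
}

cleaned = strip_cached_keys(checkpoint_state)
print(sorted(cleaned))
# ['decoder.convolutions.0.bias', 'decoder.convolutions.0.weight']
```

The cleaned dictionary could then be passed to load_state_dict with the default strict=True, which would still catch any other mismatch.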
Thank you very much for this great fix @jiangfeng1124 👍
Fixed by b2ee110c853c5effdd8d21f50a8437485bafb285
This is the error I get when running the fairseq-generate command on a trained model:
load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for FConvModel:
    Unexpected key(s) in state_dict: "decoder.convolutions.0._linearized_weight", "decoder.convolutions.1._linearized_weight", "decoder.convolutions.2._linearized_weight".
Surprisingly, translation and generation work for the en-es pair but not for the reverse direction.
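"Unexpected key(s)" in the RuntimeError means the checkpoint contains entries that the freshly built model does not have. A rough way to inspect such a mismatch is to compare the two key sets directly. This is a sketch on plain lists. diff_state_dict_keys is a hypothetical helper, and the model-side key is illustrative rather than taken from a real FConv model.

```python
def diff_state_dict_keys(checkpoint_keys, model_keys):
    """Return (unexpected, missing) key lists, both sorted.

    unexpected: keys present in the checkpoint but not in the model
    missing:    keys present in the model but not in the checkpoint
    """
    checkpoint_keys, model_keys = set(checkpoint_keys), set(model_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    missing = sorted(model_keys - checkpoint_keys)
    return unexpected, missing


# Checkpoint keys: one illustrative shared key plus the three from the error.
checkpoint_keys = [
    "decoder.convolutions.0.weight",
    "decoder.convolutions.0._linearized_weight",
    "decoder.convolutions.1._linearized_weight",
    "decoder.convolutions.2._linearized_weight",
]
# Keys of a freshly built model (illustrative).
model_keys = ["decoder.convolutions.0.weight"]

unexpected, missing = diff_state_dict_keys(checkpoint_keys, model_keys)
print(unexpected)
print(missing)
```

With a real checkpoint, the two inputs would be the keys of the saved state dict and of model.state_dict().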