Hi,
Note the above log:

```
size mismatch for encoder.embed_positions.weight: copying a param with shape torch.Size([302, 1024]) from checkpoint, the shape in current model is torch.Size([258, 1024]).
size mismatch for decoder.embed_positions.weight: copying a param with shape torch.Size([302, 1024]) from checkpoint, the shape in current model is torch.Size([258, 1024]).
```
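Those shapes follow from fairseq's sizing rule for learned positional embeddings (`num_embeddings = max_positions + padding_idx + 1`, with `padding_idx = 1`), so 302 corresponds to a position limit of 300 and 258 to a limit of 256. A tiny sketch of that arithmetic; the helper function is illustrative, not fairseq API:

```python
# Sketch of fairseq's sizing rule for learned positional embeddings;
# this helper is for illustration only, not part of the fairseq API.
def learned_pos_rows(max_positions: int, padding_idx: int = 1) -> int:
    return max_positions + padding_idx + 1

assert learned_pos_rows(300) == 302  # checkpoint trained with --max-*-positions 300
assert learned_pos_rows(256) == 258  # current model built with a limit of 256
```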
This means you should set `--max-source-positions 300 --max-target-positions 300` during training.
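For illustration, a minimal training invocation with those limits; the data path and the remaining hyperparameters are placeholders, not mRASP's actual recipe:

```
# Sketch only: data path and other hyperparameters are placeholders.
fairseq-train data-bin/my-corpus \
  --arch transformer_vaswani_wmt_en_de_big \
  --max-source-positions 300 --max-target-positions 300 \
  --optimizer adam --lr 3e-5 \
  --criterion label_smoothed_cross_entropy --max-tokens 4096
```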
Do I also have to set them during generation? I get this error after fine-tuning:
```
Traceback (most recent call last):
  File "/media/kalle/Sprachdaten/mRASP/train_environment/bin/fairseq-generate", line 8, in <module>
    sys.exit(cli_main())
  File "/media/kalle/Sprachdaten/mRASP/train_environment/lib/python3.8/site-packages/fairseq_cli/generate.py", line 199, in cli_main
    main(args)
  File "/media/kalle/Sprachdaten/mRASP/train_environment/lib/python3.8/site-packages/fairseq_cli/generate.py", line 104, in main
    hypos = task.inference_step(generator, models, sample, prefix_tokens)
  File "/media/kalle/Sprachdaten/mRASP/train_environment/lib/python3.8/site-packages/fairseq/tasks/fairseq_task.py", line 265, in inference_step
    return generator.generate(models, sample, prefix_tokens=prefix_tokens)
  File "/media/kalle/Sprachdaten/mRASP/train_environment/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 15, in decorate_context
    return func(*args, **kwargs)
  File "/media/kalle/Sprachdaten/mRASP/train_environment/lib/python3.8/site-packages/fairseq/sequence_generator.py", line 113, in generate
    return self._generate(model, sample, **kwargs)
  File "/media/kalle/Sprachdaten/mRASP/train_environment/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 15, in decorate_context
    return func(*args, **kwargs)
  File "/media/kalle/Sprachdaten/mRASP/train_environment/lib/python3.8/site-packages/fairseq/sequence_generator.py", line 376, in _generate
    cand_scores, cand_indices, cand_beams = self.search.step(
  File "/media/kalle/Sprachdaten/mRASP/train_environment/lib/python3.8/site-packages/fairseq/search.py", line 81, in step
    torch.div(self.indices_buf, vocab_size, out=self.beams_buf)
RuntimeError: Integer division of tensors using div or / is no longer supported, and in a future release div will perform true division as in Python 3. Use true_divide or floor_divide (// in Python) instead.
```
It seems it is not a model-loading problem. From the log you posted, it might be due to a Python 3.8 issue. You may check whether there is an empty line in the source you are generating from, or use Python < 3.8 to check whether the problem is caused by Python 3.8.
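For reference, this exact RuntimeError is also commonly triggered by running fairseq 0.9.0 under PyTorch >= 1.5, which dropped integer tensor division via `torch.div`; later fairseq releases changed the failing line in `fairseq/search.py` to floor division. A minimal sketch of the floor-division semantics that line needs (the tensors here are made up for illustration):

```python
import torch

# cand_indices flattens (beam, token) pairs; recovering the beam index needs
# integer floor division, which torch.div on int tensors no longer performs
# in PyTorch >= 1.5.
cand_indices = torch.tensor([7, 12, 25])
vocab_size = 10

beams = torch.floor_divide(cand_indices, vocab_size)  # tensor([0, 1, 2])
# Equivalently: cand_indices // vocab_size
```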
I can't load the pretrained `32-lang-pairs-RAS-ckp` model with the tagged fairseq version 0.9.0. The model states its architecture as `transformer_vaswani_wmt_en_de_big`. Have there been changes to the architecture? Is the architecture incompatible due to https://github.com/pytorch/fairseq/issues/2664? Thanks for your promising work!
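In case it helps to double-check, the architecture and positional-embedding shape recorded in a fairseq checkpoint can be inspected directly; the filename below is a placeholder for the downloaded checkpoint:

```python
import torch

# Placeholder path; point this at the downloaded 32-lang-pairs-RAS-ckp file.
ckpt = torch.load("32-lang-pairs-RAS-ckp.pt", map_location="cpu")

print(ckpt["args"].arch)  # e.g. transformer_vaswani_wmt_en_de_big
print(ckpt["model"]["encoder.embed_positions.weight"].shape)  # positional table size
```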