I am running unsupervised machine translation from a pretrained cross-lingual language model in Google Colab. The language model trains successfully, but the MT training breaks with the following error:
Illegal division by zero at /content/XLM/src/evaluation/multi-bleu.perl line 154, line 10.
WARNING - 04/22/20 10:51:57 - 0:02:45 - Impossible to parse BLEU score!
Looking at the hypothesis files (hyp0.?-?.valid.txt, hyp0.?-?.test.txt ... in dumped/unsupMT?-?/???????/hypotheses), which are supposed to contain the translations produced by the MT model at that point, I noticed that they are all empty, while the reference files (ref.?-?.valid.txt, ref.?-?.test.txt), which are supposed to contain the target translations, are not.
This is consistent with the error: with an empty hypothesis, $length_translation is zero, and lines 153-155 of multi-bleu.perl divide by it:
if ($length_translation<$length_reference) {
$brevity_penalty = exp(1-$length_reference/$length_translation);
}
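To make the failure mode concrete, here is a minimal Python sketch of the same brevity-penalty branch. It is not part of XLM or multi-bleu.perl; the function name `brevity_penalty` and the explicit guard are my own additions for illustration, and the return value of 1.0 for the other branch is an assumption about the surrounding script.

```python
import math


def brevity_penalty(length_translation: int, length_reference: int) -> float:
    """Illustrative re-implementation of the quoted Perl branch (hypothetical helper)."""
    if length_translation == 0:
        # An empty hypothesis file yields zero translated tokens. The Perl
        # script divides by this value without checking, hence the crash.
        raise ValueError("hypothesis is empty: brevity penalty is undefined")
    if length_translation < length_reference:
        return math.exp(1 - length_reference / length_translation)
    # Assumed default when the translation is at least as long as the reference.
    return 1.0


if __name__ == "__main__":
    print(brevity_penalty(5, 10))   # short translation is penalized
    print(brevity_penalty(10, 10))  # no penalty
```

A guard like this only turns the crash into a clearer message; the empty hypothesis files still need to be explained.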
So the question is why the hypothesis files are empty.
I've done some digging in train.py and src/trainer.py with no luck.
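For anyone hitting the same error, a quick way to confirm which hypothesis files are empty is a small script like the one below. This is only a sketch: `find_empty_files` is a hypothetical helper, the `hyp*.txt` pattern is an assumption based on the file names above, and the directory has to be passed in since the exact dump path depends on the experiment.

```python
from pathlib import Path


def find_empty_files(directory, pattern="hyp*.txt"):
    """Return the names of matching files that contain no non-whitespace text."""
    empty = []
    for path in sorted(Path(directory).glob(pattern)):
        # Treat files holding only blank lines as empty as well.
        if path.read_text(encoding="utf-8").strip() == "":
            empty.append(path.name)
    return empty


if __name__ == "__main__":
    import sys

    # Usage: python check_hyp.py <path to the hypotheses directory>
    for name in find_empty_files(sys.argv[1] if len(sys.argv) > 1 else "."):
        print("empty:", name)
```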
I've solved my problem. The cause was a badly chosen epoch_size. To keep it simple, I divided the number of examples by the batch_size.
With that change I got excellent scores.
I use my own data.
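The arithmetic described above can be sketched as follows. `compute_epoch_size` is a hypothetical helper, not an XLM function; the integer division and the floor of 1 are my assumptions about how to round, and the numbers in the example are made up.

```python
def compute_epoch_size(num_examples: int, batch_size: int) -> int:
    """Number of examples divided by the batch size (hypothetical helper)."""
    if num_examples <= 0 or batch_size <= 0:
        raise ValueError("num_examples and batch_size must be positive")
    # Integer division; never return 0 so that an epoch is not empty.
    return max(1, num_examples // batch_size)


if __name__ == "__main__":
    # Illustrative values only, not taken from the original experiment.
    print(compute_epoch_size(10000, 32))
```

The resulting value would then be passed as epoch_size when launching the MT training.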