🐛 Bug
Hello, I have downloaded the many-to-many mBART50 model and want to test it on en-fr with data from WMT. It did not work: the same word keeps being generated instead of a proper translation. Do you know why? Is the model not pretrained? Maybe I have not understood it correctly.
Here is a file showing what I get:
To Reproduce
What I did:
First, I downloaded the WMT en-fr test data with sacrebleu.
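For reference, a minimal sketch of how the test set can be fetched with the sacrebleu CLI. The report does not say which WMT edition was used, so `wmt14` and the output file names below are assumptions:

```shell
testset=wmt14   # assumption: the report only says "wmt en-fr"
pair=en-fr
src_file=$testset.$pair.src
ref_file=$testset.$pair.ref
# --echo prints one side (src or ref) of the named test set to stdout.
if command -v sacrebleu >/dev/null 2>&1; then
  sacrebleu -t "$testset" -l "$pair" --echo src > "$src_file"
  sacrebleu -t "$testset" -l "$pair" --echo ref > "$ref_file"
fi
```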
```shell
python /path/to/fairseq/examples/multilingual/data_scripts/binarize.py
```
```shell
export path_2_data=$work_dir/databin
export model=$work_dir/model.pt
export langs="ar_AR,....,sl_SI"
export source_lang="en_XX"
export target_lang="fr_XX"
```
```shell
fairseq-generate $path_2_data \
  --path $model \
  --task translation_from_pretrained_bart \
  --gen-subset test \
  -s en_XX -t fr_XX \
  --sacrebleu --remove-bpe 'sentencepiece' \
  --batch-size 32 \
  --encoder-langtok "src" \
  --decoder-langtok \
  --langs $langs
```
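To inspect the degenerate translations, the hypotheses can be pulled out of the saved generation log. fairseq-generate prefixes hypothesis lines with `H-<id>`, with the detokenized text in the third tab-separated field; `gen.out` and `hyp.txt` are placeholder names, not files from the report:

```shell
# Placeholder: gen.out is the saved fairseq-generate log (stdout redirected to it).
# Hypothesis lines look like: H-<id><TAB><score><TAB><detokenized text>
[ -f gen.out ] || printf 'H-0\t-0.5\texample output\n' > gen.out  # tiny stand-in so the sketch runs
grep '^H-' gen.out | sed 's/^H-//' | sort -n | cut -f3- > hyp.txt
```

The extracted hypotheses in `hyp.txt` can then be scored against the reference file with `sacrebleu <ref> -i hyp.txt`.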
Environment
How you installed fairseq (pip, source): from zip + `pip install --editable .`