Closed JiyangZhang closed 3 years ago
The checkpoint is perfectly fine, as its embedding weight size (`torch.Size([50005, 768])`) is correct. The issue is that you are missing the `--langs` flag, which adds 3 language tokens; 1 mask token is also added. So the embedding size becomes 50001 + 4 = 50005.
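The token arithmetic above can be sanity-checked in a few lines (a minimal sketch, assuming, as stated, that the provided dict.txt contributes 50001 entries including fairseq's built-in specials):

```python
# Vocabulary-size arithmetic from the explanation above:
# dict.txt entries + 3 language tokens (--langs) + 1 <mask> token.
base_vocab = 50001   # entries contributed by dict.txt (incl. fairseq specials)
lang_tokens = 3      # added when --langs is passed
mask_token = 1       # <mask>, added alongside the language tokens

embedding_rows = base_vocab + lang_tokens + mask_token
print(embedding_rows)  # 50005, matching torch.Size([50005, 768])
```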
As I can see, you are using `--task translation`, which is the main reason it is not working. Please read our scripts carefully. You can only use the following tasks with PLBART.
Hi,
Thanks for the quick response! I used the same version of fairseq and the packages listed in requirements.txt. However, I got this error when I tried `translation_without_lang_token`:
```
fairseq-train: error: argument --task: invalid choice: 'translation_without_lang_token' (choose from 'translation', 'multilingual_translation', 'semisupervised_translation', 'language_modeling', 'audio_pretraining', 'translation_multi_simple_epoch', 'multilingual_masked_lm', 'legacy_masked_lm', 'translation_from_pretrained_xlm', 'cross_lingual_lm', 'sentence_ranking', 'masked_lm', 'translation_from_pretrained_bart', 'denoising', 'multilingual_denoising', 'translation_lev', 'sentence_prediction', 'dummy_lm', 'dummy_masked_lm', 'dummy_mt')
```
`translation_without_lang_token` is our defined task. Setting `--user-dir $USER_DIR` should not raise the above-mentioned error. I am not sure why you are facing it.
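For context, fairseq looks tasks up in a name-to-class registry populated by `@register_task` decorators, and `--user-dir` imports extra modules so their registrations run before argument parsing. A minimal, self-contained sketch of that pattern (illustrative only, not fairseq's actual code):

```python
# Minimal sketch of fairseq-style task registration (illustrative only).
TASK_REGISTRY = {}

def register_task(name):
    """Decorator that records a task class under a CLI-visible name."""
    def wrapper(cls):
        TASK_REGISTRY[name] = cls
        return cls
    return wrapper

@register_task("translation_without_lang_token")
class TranslationWithoutLangToken:
    pass

# --task only accepts names that are in the registry, which is why the
# registering module must be imported (what --user-dir arranges) first.
print("translation_without_lang_token" in TASK_REGISTRY)  # True
```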
Thanks! That makes things clear!
Could I ask another question? If I want to use PLBART for a translation task where the generated text should include some special, self-defined tokens, what I plan to do is:
Is that correct? Please correct me if something is wrong.
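One common way to expose self-defined tokens to fairseq is to extend a copy of dict.txt (one `<token> <count>` pair per line) before binarizing the data. A hedged sketch, where the directory layout and the token names are made-up placeholders:

```python
import os
import tempfile

# Hedged sketch: append self-defined special tokens to a copy of dict.txt.
# The token names below are illustrative, not part of PLBART.
special_tokens = ["<CUSTOM_SEP>", "<HOLE>"]

work_dir = tempfile.mkdtemp()
dict_path = os.path.join(work_dir, "dict.txt")
# In practice, copy the repository's sentencepiece/dict.txt here first.
open(dict_path, "w", encoding="utf-8").close()

with open(dict_path, "a", encoding="utf-8") as f:
    for tok in special_tokens:
        # counts are used for frequency thresholding; IDs follow file order
        f.write(f"{tok} 100\n")
```

Note that growing the dictionary changes the vocabulary size, so the pre-trained embedding matrix (`torch.Size([50005, 768])` above) would need to be resized to match before loading the checkpoint.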
Thank you again!
If you are done, please close the issue.
Thanks!
Hi, thanks for your great work!
When I tried to load the pre-trained checkpoints and fine-tune, I came across a size mismatch problem. It seems that the dict.txt you provided does not match the checkpoints.
Here is the error message:
This is the script I used to get the checkpoints: https://github.com/wasiahmad/PLBART/blob/main/pretrain/download.sh
This is the dict.txt I used: https://github.com/wasiahmad/PLBART/blob/main/sentencepiece/dict.txt
Here is the command I used to fine-tune:
```bash
fairseq-train $PATH_2_DATA \
    --user-dir $USER_DIR --truncate-source \
    --arch mbart_base --layernorm-embedding \
    --task translation \
    --source-lang $SOURCE --target-lang $TARGET \
    --criterion label_smoothed_cross_entropy --label-smoothing 0.1 \
    --batch-size $BATCH_SIZE --update-freq $UPDATE_FREQ --max-epoch 30 \
    --optimizer adam --adam-eps 1e-06 --adam-betas '(0.9, 0.98)' \
    --lr-scheduler polynomial_decay --lr 5e-05 --min-lr -1 \
    --warmup-updates 500 --max-update 100000 \
    --dropout 0.1 --attention-dropout 0.1 --weight-decay 0.0 \
    --seed 1234 --log-format json --log-interval 100 \
    ${restore} \
    --eval-bleu --eval-bleu-detok space --eval-tokenized-bleu \
    --eval-bleu-remove-bpe sentencepiece --eval-bleu-args '{"beam": 5}' \
    --best-checkpoint-metric bleu --maximize-best-checkpoint-metric \
    --no-epoch-checkpoints --patience 5 \
    --ddp-backend no_c10d --save-dir $SAVE_DIR 2>&1 | tee ${OUTPUT_FILE};
```