Open nikhiljaiswal opened 3 years ago
Hi @jaspock, thanks for the response. I went through that answer and tried with `transformer_wmt_en_de_big`, but I am still getting the error that the architectures do not match. Please help.
I tried the following:

```shell
fairseq-train $path_2_data --finetune-from-model $pretrained_model \
    --max-epoch 500 --ddp-backend=legacy_ddp \
    --task translation_multi_simple_epoch --lang-pairs de-en,en-de \
    --arch transformer_wmt_en_de_big --share-decoder-input-output-embed \
    --optimizer adam --adam-betas '(0.9, 0.98)' \
    --lr 0.0005 --lr-scheduler inverse_sqrt --warmup-updates 4000 --warmup-init-lr '1e-07' \
    --label-smoothing 0.1 --criterion label_smoothed_cross_entropy \
    --dropout 0.3 --weight-decay 0.0001 --max-tokens 4000 --update-freq 8
```
See my new answer.
This issue has been automatically marked as stale. If this issue is still affecting you, please leave any comment (for example, "bump"), and we'll keep it open. We are sorry that we haven't been able to prioritize it yet. If you have any new additional information, please include it with your comment!
Hi,
I want to finetune the m2m model on my dataset, which contains en and de on the source side and the corresponding de and en on the target side. In other words, I want to perform joint training of en-de and de-en. When I try to finetune, what parameters do I need to pass, especially for the task and arch? I tried the following, but the architectures do not match:
```shell
fairseq-train $path_2_data \
    --finetune-from-model $pretrained_model \
    --encoder-normalize-before --decoder-normalize-before \
    --arch transformer --layernorm-embedding \
    --task translation_multi_simple_epoch \
    --sampling-method "temperature" \
    --sampling-temperature 1.5 \
    --encoder-langtok "src" \
    --decoder-langtok \
    --lang-pairs "$lang_pairs" \
    --criterion label_smoothed_cross_entropy --label-smoothing 0.2 \
    --optimizer adam --adam-eps 1e-06 --adam-betas '(0.9, 0.98)' \
    --lr-scheduler inverse_sqrt --lr 3e-05 --warmup-updates 2500 --max-update 40000 \
    --dropout 0.3 --attention-dropout 0.1 --weight-decay 0.0 \
    --max-tokens 1024 --update-freq 2 \
    --save-interval 1 --save-interval-updates 5000 --keep-interval-updates 10 --no-epoch-checkpoints \
    --seed 222 --log-format simple --log-interval 2
```
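One way to debug this kind of mismatch is to inspect which `--arch` the pretrained checkpoint was actually saved with, and then pass that exact value to `fairseq-train`. A minimal sketch follows; the `checkpoint_arch` helper is hypothetical (not part of fairseq), and the checkpoint key layout varies across fairseq versions, so it checks both the newer `cfg` location and the older `args` location:

```python
from argparse import Namespace

def checkpoint_arch(state):
    """Return the --arch recorded in a loaded fairseq checkpoint dict, or None.

    Newer fairseq checkpoints store a structured config under "cfg" (often an
    OmegaConf DictConfig, which supports both dict-style and attribute access);
    older ones store an argparse Namespace under "args".
    """
    cfg = state.get("cfg")
    if cfg is not None:
        model_cfg = cfg["model"] if isinstance(cfg, dict) else cfg.model
        if isinstance(model_cfg, dict):
            return model_cfg.get("arch")
        return getattr(model_cfg, "arch", None)
    args = state.get("args")
    if args is not None:
        return getattr(args, "arch", None)
    return None

# Usage (requires torch; path is a placeholder for your downloaded model):
#   import torch
#   state = torch.load("/path/to/pretrained_model.pt", map_location="cpu")
#   print(checkpoint_arch(state))
```

Whatever string this prints is the architecture the checkpoint expects; passing a different `--arch` (e.g. plain `transformer` for a big-variant checkpoint) is one common cause of the "architectures do not match" error when using `--finetune-from-model`.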