Closed tjshu closed 2 years ago
The problem occurs with --update-freq 1; training with --update-freq 8 can get a normal BLEU.
May I ask why you use 44000 as the vocabulary size rather than the 37000 used in "Attention Is All You Need"? I tried both vocabulary sizes: 44000 is slightly better than 37000, and a 37000 vocabulary can also reach the BLEU reported in "Attention Is All You Need".
❓ Questions and Help
What is your question?
As in the title; the question is the same as #4477.
Code
Binarize the dataset
```shell
fairseq-preprocess \
    --source-lang en --target-lang de \
    --trainpref $TEXT/train --validpref $TEXT/valid --testpref $TEXT/test \
    --destdir data-bin/wmt14_en_de --thresholdtgt 0 --thresholdsrc 0 \
    --nwordssrc 44000 --nwordstgt 44000 \
    --joined-dictionary --workers 20
```
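One way to sanity-check the vocabulary size that actually results from the flags above is to count the entries in the dictionary file that fairseq-preprocess writes. A minimal sketch; the path `data-bin/wmt14_en_de/dict.en.txt` is an assumption based on the `--destdir` and `--joined-dictionary` flags above, and the demo below uses a fake dictionary file so it runs on its own:

```python
# Count entries in a fairseq dictionary file (one "token count" pair per line).
# The real path (e.g. data-bin/wmt14_en_de/dict.en.txt) is an assumption based
# on the --destdir flag in the preprocessing command above.
from pathlib import Path

def vocab_size(dict_path: str) -> int:
    """Return the number of token entries in a fairseq dict.*.txt file."""
    with open(dict_path, encoding="utf-8") as f:
        return sum(1 for line in f if line.strip())

# Tiny self-contained demo with a fake three-entry dictionary file:
demo = Path("demo_dict.txt")
demo.write_text("the 1000\nhello 42\nworld 7\n", encoding="utf-8")
print(vocab_size("demo_dict.txt"))  # 3
demo.unlink()
```

Note that fairseq adds a few special symbols (padding, BOS, EOS, unknown) on top of the entries in this file, so the model's embedding table is slightly larger than the count reported here.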
```shell
PYTHONIOENCODING=utf-8 fairseq-train \
    data-bin/wmt14_en_de \
    --arch transformer_wmt_en_de --share-all-embeddings \
    --optimizer adam --adam-betas '(0.9, 0.98)' --clip-norm 0.0 \
    --lr 5e-4 --lr-scheduler inverse_sqrt --warmup-updates 4000 \
    --dropout 0.3 --weight-decay 0.0001 \
    --criterion label_smoothed_cross_entropy --label-smoothing 0.1 \
    --max-tokens 4096 \
    --max-tokens-valid 4096 \
    --update-freq 1 \
    --eval-bleu \
    --eval-bleu-args '{"beam": 5, "max_len_a": 1.2, "max_len_b": 10}' \
    --eval-bleu-detok moses \
    --eval-bleu-remove-bpe \
    --eval-bleu-print-samples \
    --best-checkpoint-metric bleu --maximize-best-checkpoint-metric \
    --save-dir checkpoints/wmt14_en_de/transformer/ckpt \
    --log-format json \
    --keep-last-epochs 5 \
    --max-epoch 30 \
    --fp16
```
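For context on the --update-freq question: in fairseq, --update-freq is gradient accumulation, so the effective tokens per optimizer step are roughly max-tokens × update-freq × number of GPUs. On a single GPU (my assumption here), --update-freq 8 with --max-tokens 4096 approximates the ~25k-token batches of "Attention Is All You Need", while --update-freq 1 trains with much smaller effective batches, which can hurt BLEU. A rough arithmetic sketch:

```python
# Effective tokens per optimizer step in fairseq:
#   max_tokens (per GPU) * update_freq (gradient accumulation steps) * num_gpus
# Single-GPU training is an assumption here.
def effective_batch_tokens(max_tokens: int, update_freq: int, num_gpus: int = 1) -> int:
    return max_tokens * update_freq * num_gpus

print(effective_batch_tokens(4096, 1))  # 4096 tokens/step
print(effective_batch_tokens(4096, 8))  # 32768 tokens/step, roughly the ~25k
                                        # tokens/batch of the original paper
```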
Evaluate

```shell
fairseq-generate data-bin/wmt14_en_de \
    --path checkpoints/wmt14_en_de/transformer/ckpt/checkpoint_best.pt \
    --batch-size 128 --beam 5 --remove-bpe --scoring sacrebleu
```
What have you tried?
Added --scoring sacrebleu, changed --nwordssrc 44000 --nwordstgt 44000 to --nwordssrc 32768 --nwordstgt 32768 (#3807), and tried compound_split_bleu.sh. But there is still a large gap between the validation BLEU (27.21) and the evaluation BLEU (24.37, with --scoring sacrebleu).
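For reference, the "compound split" tokenized BLEU reported in many WMT En-De papers splits hyphenated compounds into separate tokens before scoring, which alone can shift BLEU noticeably relative to sacrebleu on detokenized output. A minimal sketch of that kind of splitting, assuming a simple hyphen rule; the exact regex in compound_split_bleu.sh may differ:

```python
import re

def compound_split(line: str) -> str:
    """Split hyphenated compounds: 'state-of-the-art' -> 'state - of - the - art'.
    A sketch of the preprocessing applied before tokenized BLEU in some
    WMT En-De evaluations; the real script's rule may differ."""
    prev = None
    while prev != line:  # repeat so chains like 'a-b-c' are fully split
        prev = line
        line = re.sub(r"(\S)-(\S)", r"\1 - \2", line)
    return line

print(compound_split("state-of-the-art"))  # state - of - the - art
```

Comparing validation BLEU (computed with --eval-bleu-detok moses on subsampled beams) against test-set sacrebleu is not apples-to-apples, so part of the 27.21 vs 24.37 gap can come purely from the scoring pipeline rather than the model.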
What's your environment?
How you installed fairseq (pip, source): source