facebookresearch / fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
MIT License

How to reproduce these non-autoregressive machine translation results? #3116

Closed speedcell4 closed 3 years ago

speedcell4 commented 3 years ago

Hi @MultiPath. Thank you for providing these training scripts, but I cannot reproduce the BLEU scores.

What I have tried

Prepare dataset

I downloaded the distillation dataset and preprocessed it with

fairseq-preprocess \
 --source-lang en --target-lang de \
 --trainpref wmt14_ende_distill/train.en-de \
 --validpref wmt14_ende_distill/valid.en-de \
 --testpref wmt14_ende_distill/test.en-de \
 --destdir data-bin/wmt14_ende_distill \
 --thresholdtgt 0 --thresholdsrc 0 --workers 20 --joined-dictionary
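The --joined-dictionary flag makes fairseq-preprocess build one shared vocabulary over both languages (which --share-all-embeddings later requires at training time). A rough illustrative sketch of the idea, not fairseq's actual dictionary code:

```python
from collections import Counter

def build_joined_dictionary(source_lines, target_lines):
    """Illustrative sketch of what --joined-dictionary does: build a single
    vocabulary over the union of source and target tokens, ranked by
    frequency, instead of one dictionary per language."""
    counts = Counter()
    for line in source_lines + target_lines:
        counts.update(line.split())
    # Most frequent tokens first; ties broken alphabetically for determinism.
    return [tok for tok, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]

vocab = build_joined_dictionary(["ein Haus", "ein Hund"], ["a house", "a dog"])
print(vocab[:2])  # ['a', 'ein'] -- both appear twice, so they rank first
```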

Train the NAT-CRF model

I first tried the following command from this page:

fairseq-train \
    data-bin/wmt14_ende_distill \
    --save-dir checkpoints \
    --ddp-backend=no_c10d \
    --task translation_lev \
    --criterion nat_loss \
    --arch nacrf_transformer \
    --noise full_mask \
    --share-all-embeddings \
    --optimizer adam --adam-betas '(0.9,0.98)' \
    --lr 0.0005 --lr-scheduler inverse_sqrt \
    --stop-min-lr '1e-09' --warmup-updates 10000 \
    --warmup-init-lr '1e-07' --label-smoothing 0.1 \
    --dropout 0.3 --weight-decay 0.01 \
    --decoder-learned-pos \
    --encoder-learned-pos \
    --pred-length-offset \
    --length-loss-factor 0.1 \
    --word-ins-loss-factor 0.5 \
    --crf-lowrank-approx 32 \
    --crf-beam-approx 64 \
    --apply-bert-init \
    --log-format 'simple' --log-interval 100 \
    --fixed-validation-seed 7 \
    --max-tokens 8000 \
    --save-interval-updates 10000 \
    --max-update 300000
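For reference, the --lr / --warmup-updates / --warmup-init-lr flags above configure fairseq's inverse_sqrt scheduler: linear warmup to the peak LR, then decay proportional to 1/sqrt(step). A simplified sketch of that schedule (not fairseq's exact implementation):

```python
def inverse_sqrt_lr(num_updates, lr=0.0005, warmup_updates=10000, warmup_init_lr=1e-7):
    """Sketch of the inverse_sqrt schedule: linear warmup from
    warmup_init_lr to lr over warmup_updates steps, then decay
    proportional to 1 / sqrt(num_updates)."""
    if num_updates < warmup_updates:
        # Linear warmup toward the peak learning rate.
        step = (lr - warmup_init_lr) / warmup_updates
        return warmup_init_lr + num_updates * step
    # After warmup: lr * sqrt(warmup_updates / num_updates).
    return lr * (warmup_updates / num_updates) ** 0.5

print(inverse_sqrt_lr(10000))  # peak: 0.0005
print(inverse_sqrt_lr(40000))  # half the peak: 0.00025
```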

But I got an error saying there is no option "--stop-min-lr", so I removed it.

Evaluation

fairseq-generate data-bin/wmt14_ende_distill \
 --gen-subset test --task translation_lev \
 --path checkpoints/nat-crf/checkpoint_best.pt \
 --beam 1 --remove-bpe --print-step --batch-size 400 \
 --results-path .

grep ^T generate-test.txt | cut -f2- > ref.txt
grep ^H generate-test.txt | cut -f3- > sys.txt
fairseq-score --sys sys.txt --ref ref.txt
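The grep/cut pipeline above pulls the reference (T-) and hypothesis (H-) lines out of generate-test.txt; an equivalent sketch in Python, assuming fairseq-generate's tab-separated layout where H lines carry a score in the second field:

```python
def split_generate_output(lines):
    """Split fairseq-generate output into references and hypotheses,
    mirroring the grep/cut pipeline: T-lines are 'T-<id>\ttext',
    H-lines are 'H-<id>\t<score>\ttext'."""
    refs, hyps = [], []
    for line in lines:
        if line.startswith("T-"):
            refs.append(line.split("\t", 1)[1])   # drop the T-<id> field
        elif line.startswith("H-"):
            hyps.append(line.split("\t", 2)[2])   # drop H-<id> and the score
    return refs, hyps

sample = [
    "S-0\tein Haus",
    "T-0\ta house",
    "H-0\t-0.12\ta house",
]
refs, hyps = split_generate_output(sample)
print(refs, hyps)  # ['a house'] ['a house']
```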

But the BLEU score is very low:

BLEU4 = 1.81, 30.4/5.4/1.2/0.3 (BP=0.652, ratio=0.701, syslen=45189, reflen=64506)
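The score line itself shows what is happening: the output is only 0.701 of the reference length, so the brevity penalty alone cuts the score by a third. The reported numbers can be reproduced from the standard BLEU formula (geometric mean of the n-gram precisions times the brevity penalty; this is generic BLEU arithmetic, not fairseq-specific code):

```python
import math

def bleu_from_stats(precisions_pct, syslen, reflen):
    """Recompute BLEU4 and the brevity penalty from the reported
    n-gram precisions (in percent) and the system/reference lengths."""
    # Brevity penalty: exp(1 - reflen/syslen) when the output is too short.
    bp = 1.0 if syslen > reflen else math.exp(1 - reflen / syslen)
    # Geometric mean of the four n-gram precisions.
    log_mean = sum(math.log(p / 100) for p in precisions_pct) / len(precisions_pct)
    return bp * math.exp(log_mean) * 100, bp

bleu, bp = bleu_from_stats([30.4, 5.4, 1.2, 0.3], syslen=45189, reflen=64506)
print(round(bp, 3), round(bleu, 2))  # 0.652 1.81 -- matches the reported line
```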

Could you please tell me which step went wrong? Thank you~

mily33 commented 3 years ago

You can run "fairseq-train --help" to see the available options. In some fairseq versions, "--stop-min-lr" is instead named "--min-lr".
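If you maintain a wrapper script across fairseq versions, one way to tolerate the rename is to accept both spellings and map them to a single destination. This is an illustrative argparse sketch, not fairseq code:

```python
import argparse

# Illustrative: accept either spelling of the flag and store it in one
# attribute, mirroring the option rename between fairseq versions.
parser = argparse.ArgumentParser()
parser.add_argument("--stop-min-lr", "--min-lr", dest="stop_min_lr",
                    type=float, default=-1.0,
                    help="stop training once the LR decays below this value")

args = parser.parse_args(["--min-lr", "1e-09"])
print(args.stop_min_lr)  # 1e-09
```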

stale[bot] commented 3 years ago

This issue has been automatically marked as stale. If this issue is still affecting you, please leave any comment (for example, "bump"), and we'll keep it open. We are sorry that we haven't been able to prioritize it yet. If you have any new additional information, please include it with your comment!