hannlp / SimpleNMT

A simple and readable neural machine translation system
MIT License

Reproducing results #2

Open kkeleve opened 2 years ago

kkeleve commented 2 years ago

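Hello, could you share the hyperparameters you used when training with fairseq? My best result on the WMT 1M or 300K datasets is 21, and I would like to see what is wrong with my hyperparameters.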

hannlp commented 2 years ago

Hello, these are the hyperparameters I used at the time. I ran them on the Chinese-English news corpus mentioned in the README.

CUDA_VISIBLE_DEVICES=0,1 fairseq-train ~/datasets/news-v15/data-bin \
    --arch transformer --source-lang zh --target-lang en \
    --optimizer adam --lr 0.001 --adam-betas '(0.9, 0.98)' \
    --lr-scheduler inverse_sqrt --max-tokens 4096 --dropout 0.1 \
    --criterion label_smoothed_cross_entropy --label-smoothing 0.1 \
    --max-update 100000 --warmup-updates 4000 --warmup-init-lr '1e-07' \
    --keep-last-epochs 10 --save-dir ~/Experiment/news-v15/fairseq
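
For anyone comparing their own settings against the command above, it can help to see what learning rate these flags produce at a given update. The following is a minimal sketch of the usual inverse square root schedule with linear warmup, assuming fairseq's `inverse_sqrt` scheduler behaves this way (linear ramp from `--warmup-init-lr` to `--lr` over `--warmup-updates`, then decay proportional to the inverse square root of the update number). The function name `inverse_sqrt_lr` is hypothetical and is not part of fairseq.

```python
import math


def inverse_sqrt_lr(step, peak_lr=1e-3, warmup_updates=4000, warmup_init_lr=1e-7):
    """Approximate learning rate at a given update (hypothetical helper).

    Defaults mirror --lr 0.001, --warmup-updates 4000 and
    --warmup-init-lr 1e-07 from the command above.
    """
    if step < warmup_updates:
        # Linear warmup from warmup_init_lr up to peak_lr.
        return warmup_init_lr + step * (peak_lr - warmup_init_lr) / warmup_updates
    # After warmup: decay proportionally to 1 / sqrt(step),
    # scaled so that the rate equals peak_lr at step == warmup_updates.
    return peak_lr * math.sqrt(warmup_updates / step)


if __name__ == "__main__":
    for step in (0, 2000, 4000, 16000, 100000):
        print(f"update {step:>6}: lr = {inverse_sqrt_lr(step):.6g}")
```

Under these assumptions the rate peaks at 0.001 at update 4000 and has fallen to about 0.0002 by update 100000. Note also that `--max-tokens` is normally applied per GPU, so with two visible devices and no gradient accumulation the command likely trains on roughly 8192 tokens per update; a single-GPU run with the same flags would see half that unless it compensates, for example with a larger `--max-tokens`.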