
Multilingual NMT with Knowledge Distillation on Fairseq

This is the implementation of *Multilingual Neural Machine Translation with Knowledge Distillation* (ICLR 2019) by Xu Tan, Yi Ren, Di He, Tao Qin, Zhou Zhao, and Tie-Yan Liu.

This code is based on Fairseq.

Preparation

  1. `pip install -r requirements.txt`
  2. `cd data/iwslt/raw; bash prepare-iwslt14.sh`
  3. `python setup.py install`
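
For convenience, the three steps above can be chained into a single shell script. This is a minimal sketch that only repeats the documented commands and assumes it is run from the repository root:

```bash
#!/usr/bin/env bash
# Minimal preparation sketch; assumes the repository root as working directory.
set -e

pip install -r requirements.txt                 # 1. install Python dependencies
(cd data/iwslt/raw && bash prepare-iwslt14.sh)  # 2. download and preprocess IWSLT14
python setup.py install                         # 3. install this package
```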

Run Multilingual NMT with Knowledge Distillation

Train Experts

  1. Run `data_dir=iwslt exp_name=train_expert_LNG1 targets="LNG1" hparams=" --save-output --share-all-embeddings" bash runs/train.sh`.
  2. Replace LNG1 with the other languages (LNG2, LNG3, ...) to train all the experts; see the loop sketched after this list.
  3. The top-k output binary files will be produced in $data/data-bin after steps 1 and 2.
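
As an illustration of step 2, the expert training command can be wrapped in a loop over the language codes. The codes below (de, es, it, nl) are placeholders, not the exact IWSLT language list; substitute the languages prepared in your data directory:

```bash
# Hypothetical sketch: train one expert per language.
# The language codes are placeholders; use the codes produced by the
# data preparation step in data/iwslt.
for lng in de es it nl; do
    data_dir=iwslt exp_name=train_expert_${lng} targets="${lng}" \
        hparams=" --save-output --share-all-embeddings" bash runs/train.sh
done
```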

Train Multilingual Student

  1. Run `exp_name=train_kd_multilingual targets="LNG1,LNG2,..." hparams=" --share-all-embeddings" bash runs/train_distill.sh` (fill targets with all of the languages) to train the distilled multilingual student model. BLEU scores are printed to the console every 3 epochs.
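
A concrete invocation might look like the sketch below; the language codes are placeholders and should match the experts trained above:

```bash
# Hypothetical sketch: distill the per-language experts into one multilingual
# student. The language codes are placeholders and must match the experts
# trained in the previous step.
exp_name=train_kd_multilingual targets="de,es,it,nl" \
    hparams=" --share-all-embeddings" bash runs/train_distill.sh
```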