ictnlp / HMT

Source code for ICLR 2023 spotlight paper "Hidden Markov Transformer for Simultaneous Machine Translation"

Reproducing the results of the paper, but there may be something wrong with my pre-processing steps #1

Open lingxiaoxue opened 9 months ago

lingxiaoxue commented 9 months ago

I used the L=3, K=6 Transformer-Base configuration on WMT15 De-En to reproduce the results of the paper, but my BLEU score is about 1.64 points lower than the one reported.

The result reported in the paper is 29.29 BLEU, while my reproduced result is 27.65 BLEU. The training and inference steps are the same as those on GitHub, so I suspect something is wrong with my pre-processing steps.

Referring to prepare-wmt14en2de.sh (https://github.com/ictnlp/HMT/blob/main/examples/translation/prepare-wmt14en2de.sh), I changed it to WMT15 and deleted lines 114-118. Following the paper, I use newstest2013 (3,000 pairs) as the validation set and newstest2015 (2,169 pairs) as the test set; a sketch of how I extracted them is below.
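This is roughly how I prepared the valid/test sets. It is my own sketch, not the authors' script: it assumes mosesdecoder is cloned as in prepare-wmt14en2de.sh, that $orig, $tmp, $src and $tgt are the variables defined there (with src=de, tgt=en for the De→En direction), and that the WMT15 SGM files have been downloaded; the exact archive and file names may differ on your setup.

```bash
SCRIPTS=mosesdecoder/scripts
TOKENIZER=$SCRIPTS/tokenizer/tokenizer.perl

# newstest2013 ships as plain text inside the WMT dev archive
for l in $src $tgt; do
    cat $orig/dev/newstest2013.$l | \
        perl $TOKENIZER -threads 8 -a -l $l > $tmp/valid.$l
done

# newstest2015 ships as SGM; strip the <seg> markup the same way the
# original script does for newstest2014, then tokenize
for l in $src $tgt; do
    if [ "$l" == "$src" ]; then t="src"; else t="ref"; fi
    grep '<seg id' $orig/dev/newstest2015-deen-$t.$l.sgm | \
        sed -e 's/<seg id="[0-9]*">\s*//g' -e 's/\s*<\/seg>\s*//g' | \
        perl $TOKENIZER -threads 8 -a -l $l > $tmp/test.$l
done
```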

I also set BPE_TOKENS=32000; the rest of the steps remain the same:
python $BPEROOT/learn_bpe.py -s $BPE_TOKENS < $TRAIN > $BPE_CODE
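Spelled out, the BPE step as I ran it (identical to the original script except for the merge count); $BPEROOT, $TRAIN, $BPE_CODE, $tmp, $src and $tgt are the variables defined earlier in prepare-wmt14en2de.sh:

```bash
BPE_TOKENS=32000

# learn a joint BPE code on the concatenated tokenized training data
python $BPEROOT/learn_bpe.py -s $BPE_TOKENS < $TRAIN > $BPE_CODE

# apply the learned code to train/valid/test on both language sides
for L in $src $tgt; do
    for f in train.$L valid.$L test.$L; do
        python $BPEROOT/apply_bpe.py -c $BPE_CODE < $tmp/$f > $tmp/bpe.$f
    done
done
```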


Finally, perform length filtering:
perl $CLEAN -ratio 1.5 $tmp/bpe.train $src $tgt $prep/train 1 250
perl $CLEAN -ratio 1.5 $tmp/bpe.valid $src $tgt $prep/valid 1 250
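After the filtering, this is roughly what I do to finish up. Copying the test set unfiltered follows the original script; the fairseq-preprocess call is only my assumption of how the binarized data should be built, and its flags may not match what the authors used:

```bash
# the test set is copied without length filtering, as in the original script
for L in $src $tgt; do
    cp $tmp/bpe.test.$L $prep/test.$L
done

# my assumed binarization step; the destdir name and flags are placeholders
fairseq-preprocess --source-lang de --target-lang en \
    --trainpref $prep/train --validpref $prep/valid --testpref $prep/test \
    --destdir data-bin/wmt15_de_en --joined-dictionary --workers 20
```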

Before pre-processing: [screenshot]
After pre-processing: [screenshot]

Can you help analyze the problem, or provide a pre-processing script?