tensorflow / nmt

TensorFlow Neural Machine Translation Tutorial
Apache License 2.0

Tgt language problem #452

Open hkitna opened 4 years ago

hkitna commented 4 years ago

Hi all,

This is my first time studying NMT. Is there any difference between using "cn" and "zh" as the tgt language?

I have prepared my own training corpus plus dev and test sets (I checked that all of them are Simplified Chinese). Before training, I followed the script "wmt16_en_de.sh" to tokenize, clean, learn shared BPE, and create the vocabulary on my prepared corpus. (I want to translate from en to zh.)

But after training, both output_dev and output_test score BLEU 0.0.
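One sanity check that might help narrow this down is to measure how much of the target-side data is covered by the vocabulary. The sketch below is only an illustration: the helper names `load_vocab` and `coverage` are hypothetical, and it assumes the vocabulary file has one token per line (taking the first whitespace-separated field, in case a count follows the token).

```python
def load_vocab(lines):
    """Build a token set from vocabulary lines.

    Assumption: one token per line; if extra fields (e.g. counts) follow,
    only the first whitespace-separated field is used.
    """
    vocab = set()
    for line in lines:
        fields = line.split()
        if fields:
            vocab.add(fields[0])
    return vocab


def coverage(sentences, vocab):
    """Return the fraction of whitespace-separated tokens found in vocab."""
    total = 0
    known = 0
    for sentence in sentences:
        for token in sentence.split():
            total += 1
            if token in vocab:
                known += 1
    return known / total if total else 0.0


def file_coverage(text_path, vocab_path):
    """Compute coverage for a tokenized text file against a vocab file."""
    with open(vocab_path, encoding="utf-8") as f:
        vocab = load_vocab(f)
    with open(text_path, encoding="utf-8") as f:
        return coverage(f, vocab)
```

A coverage close to zero on the zh side would suggest the vocabulary and the data were not produced by the same preprocessing run; a high coverage would point the search elsewhere.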

Here are the options I used to run nmt:

python3 -m nmt.nmt \
    --src=en --tgt=zh \
    --vocab_prefix=/home/usr/data/vocab.bpe.32000 \
    --train_prefix=/home/usr/data/train.tok.clean.bpe.32000 \
    --dev_prefix=/home/usr/data/dev2010.tok.bpe.32000 \
    --test_prefix=/home/usr/data/test2014.tok.bpe.32000 \
    --out_dir=/home/usr/model \
    --num_train_steps=12000 \
    --steps_per_stats=100 \
    --num_layers=2 \
    --num_units=128 \
    --dropout=0.2 \
    --metrics=bleu
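For reference, the tutorial's README pairs each prefix with the --src and --tgt values as file suffixes (for example, a vocab prefix of "vocab" with --src=vi --tgt=en corresponds to vocab.vi and vocab.en). Assuming the same convention applies here, the following sketch lists the files the command above would expect and reports any that are missing. The helper names `expected_files` and `missing_files` are hypothetical, not part of nmt.

```python
import os


def expected_files(src, tgt, prefixes):
    """Return the paths formed by appending .<src> and .<tgt> to each prefix.

    Assumption: language codes are used purely as file suffixes.
    """
    return [f"{prefix}.{lang}" for prefix in prefixes for lang in (src, tgt)]


def missing_files(paths):
    """Return the subset of paths that do not exist as regular files."""
    return [path for path in paths if not os.path.isfile(path)]


if __name__ == "__main__":
    # Prefixes copied from the command above.
    prefixes = [
        "/home/usr/data/vocab.bpe.32000",
        "/home/usr/data/train.tok.clean.bpe.32000",
        "/home/usr/data/dev2010.tok.bpe.32000",
        "/home/usr/data/test2014.tok.bpe.32000",
    ]
    for path in missing_files(expected_files("en", "zh", prefixes)):
        print("missing:", path)
```

Under that assumption, "cn" versus "zh" would only matter in that the suffix has to match the actual file names on disk.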

Any comments or ideas are very welcome. Thanks, all.