tensorflow / nmt

TensorFlow Neural Machine Translation Tutorial
Apache License 2.0

English-German test and dev BLEU metrics show 0.0 #292

Open mohammedayub44 opened 6 years ago

mohammedayub44 commented 6 years ago

My test and dev BLEU metrics both show as 0.0. I replicated the whole setup on a Windows machine, using the pre-trained 8-layer model. I'm not sure whether something is missing from the setup doc or whether I'm passing the wrong files to the training arguments. I followed this tutorial: https://www.tensorflow.org/tutorials/seq2seq

This is the command I'm running:

python -m nmt.nmt --src=de --tgt=en --hparams_path=standard_hparams\wmt16_gnmt_8_layer.json --out_dir=D:\Projects\nmt_model\German_English_test --vocab_prefix=D:\Projects\nmt_data\german_english\vocab.bpe.32000 --train_prefix=D:\Projects\nmt_data\german_english\train.tok.clean.bpe.32000 --dev_prefix=D:\Projects\nmt_data\german_english\newstest2014.tok.bpe.32000 --test_prefix=D:\Projects\nmt_data\german_english\newstest2016.tok.bpe.32000

Attached are my dev and test output files (most of the output comes out as "UNK" for some weird reason):
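To put a number on how much of the output is unknown tokens, something like the sketch below could be run over the attached files. It is only an illustration: the helper name `unk_fraction` is made up here, and the default marker `<unk>` is an assumption about how the unknown token is spelled in the output, so pass a different `unk_token` if the files use another spelling.

```python
def unk_fraction(lines, unk_token="<unk>"):
    """Return the share of whitespace-separated tokens equal to unk_token.

    `lines` is any iterable of strings (an open file works).
    `unk_token` is an assumed spelling; adjust it to match the output files.
    """
    total = 0
    unknown = 0
    for line in lines:
        for token in line.split():
            total += 1
            if token == unk_token:
                unknown += 1
    # Avoid dividing by zero on an empty file.
    return unknown / total if total else 0.0


# Example usage (file name taken from the attachment above):
# with open("output_dev.txt", encoding="utf-8") as f:
#     print(unk_fraction(f))
```

A value close to 1.0 would suggest that the data or vocab files do not line up with what the model expects, rather than a problem with the BLEU script itself.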

output_dev.txt output_test.txt

Here are my vocab files (both are basically the same file, duplicated as suggested in the setup link. I had to remove duplicate entries because I was getting a hash table error, and I had to open and re-save the files as UTF-8 because of decode errors from Python): vocab.bpe.32000.de.txt vocab.bpe.32000.en.txt
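For anyone repeating the vocab cleanup described above, here is a minimal sketch of one way to do it in Python instead of a text editor. The function names `dedupe_vocab` and `rewrite_vocab` are hypothetical, not part of the nmt package. It keeps the first occurrence of each token and preserves line order, since the position of a line in the vocab file determines the token's id. Reading with `utf-8-sig` is a precaution in case a Windows editor added a byte-order mark when re-saving; whether these particular files contain one is not known.

```python
def dedupe_vocab(entries):
    """Drop repeated and empty entries, keeping first occurrences in order."""
    seen = set()
    unique = []
    for entry in entries:
        token = entry.rstrip("\r\n")  # strip only the line ending
        if token and token not in seen:
            seen.add(token)
            unique.append(token)
    return unique


def rewrite_vocab(src_path, dst_path, src_encoding="utf-8-sig"):
    """Write a de-duplicated copy of src_path to dst_path as UTF-8 without BOM.

    Returns the number of entries written. "utf-8-sig" reads files with or
    without a byte-order mark; change src_encoding if the source file uses
    a different encoding.
    """
    with open(src_path, encoding=src_encoding) as src:
        unique = dedupe_vocab(src)
    with open(dst_path, "w", encoding="utf-8", newline="\n") as dst:
        dst.write("\n".join(unique) + "\n")
    return len(unique)


# Example usage (paths are placeholders):
# rewrite_vocab("vocab.bpe.32000", "vocab.bpe.32000.de")
```

Note that removing entries shifts the ids of every token after them, so a vocab edited this way may no longer match the one a pre-trained checkpoint was built with.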

Here is the log file: log_1522787979.txt

Any help appreciated. Thanks!

tuvuumass commented 6 years ago

Using a smaller learning rate might help.