How to train the translation model?

yapingzhao commented 6 years ago

Hi, I use the following commands for model training:

    mkdir /tmp/nmt_model
    python -m nmt.nmt \
        --src=mn --tgt=zh \
        --vocab_prefix=/tmp/nmt_data/vocab \
        --train_prefix=/tmp/nmt_data/train \
        --dev_prefix=/tmp/nmt_data/test \
        --test_prefix=/tmp/nmt_data/test \
        --out_dir=/tmp/nmt_model \
        --num_train_steps=12000 \
        --steps_per_stats=100 \
        --num_layers=2 \
        --num_units=128 \
        --dropout=0.2 \
        --metrics=bleu

However, it fails with this error message:

    ValueError: vocab_file '/tmp/nmt_data/vocab.mn' does not exist.

I am new to nmt and do not know which command generates the source-language and target-language vocabularies. I look forward to your advice or answers.

Best regards,

yapingzhao
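The error message suggests that each prefix flag is joined with the language suffix, so `--vocab_prefix=/tmp/nmt_data/vocab` with `--src=mn` resolves to `/tmp/nmt_data/vocab.mn`. A minimal sketch for listing the files the command above would look for, and checking which ones are missing, might look like this. The helper names `expected_files` and `missing_files` are hypothetical and not part of nmt, and applying the same `prefix.suffix` rule to the train, dev, and test prefixes is an assumption here.

```python
import os


def expected_files(src, tgt, prefixes):
    """Return the paths implied by the prefix flags (prefix + '.' + language).

    Hypothetical helper, not part of nmt. Duplicates are dropped while
    keeping the original order (dev and test share a prefix in the command).
    """
    paths = [f"{prefix}.{lang}" for prefix in prefixes for lang in (src, tgt)]
    return list(dict.fromkeys(paths))


def missing_files(paths):
    """Return the subset of paths that do not exist as regular files."""
    return [p for p in paths if not os.path.isfile(p)]
```

For the command in this thread, `missing_files(expected_files("mn", "zh", ["/tmp/nmt_data/vocab", "/tmp/nmt_data/train", "/tmp/nmt_data/test"]))` would list every file that still has to be created before training, including `vocab.mn` and `vocab.zh`.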

ptamas88 commented 6 years ago

Use this script: build_vocab.zip, with the following command:

    python path_to_script/build_vocab.py --data=path_to_corpus/corpus_name --save_vocab=save_path/vocab_file_name --size=50000

You can change the vocab size to anything you want. The script creates a vocabulary from the n most common words in the corpus and adds the tags.
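For readers who cannot open the attachment, the idea can be sketched roughly as follows. This is not the attached build_vocab.py, whose contents are not shown in this thread. It assumes a whitespace-tokenized corpus with one sentence per line, and it assumes the "tags" are the special tokens `<unk>`, `<s>`, and `</s>` placed at the top of the file. The function names are hypothetical.

```python
from collections import Counter

# Assumption: these are the special tokens ("tags") expected at the top
# of the vocabulary file.
SPECIAL_TOKENS = ["<unk>", "<s>", "</s>"]


def build_vocab(lines, size):
    """Return the special tokens followed by the `size` most common words."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())  # whitespace tokenization (assumption)
    for token in SPECIAL_TOKENS:
        counts.pop(token, None)  # avoid listing a special token twice
    return SPECIAL_TOKENS + [word for word, _ in counts.most_common(size)]


def write_vocab(corpus_path, vocab_path, size=50000):
    """Read a corpus file and write one vocabulary entry per line."""
    with open(corpus_path, encoding="utf-8") as corpus:
        vocab = build_vocab(corpus, size)
    with open(vocab_path, "w", encoding="utf-8") as out:
        out.write("\n".join(vocab) + "\n")
```

With the paths from the original question, this would be run once per language, for example `write_vocab("/tmp/nmt_data/train.mn", "/tmp/nmt_data/vocab.mn")` and `write_vocab("/tmp/nmt_data/train.zh", "/tmp/nmt_data/vocab.zh")`, so that both vocabulary files exist.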

yapingzhao commented 6 years ago

Thank you very much!

kadlaon commented 5 years ago

Hi @yapingzhao, were you able to solve this error? I am getting an AssertionError when I use build_vocab.py to generate the vocab.

tensorflow / nmt

How to train the translation model? #299