tensorflow / nmt

TensorFlow Neural Machine Translation Tutorial
Apache License 2.0

Question about subword option "bpe" #190

Open lemo2012 opened 6 years ago

lemo2012 commented 6 years ago

Should I convert the "train", "dev", "vocab", and "test" datasets to BPE format first if I want to set "--subword_option=bpe"?

oahziur commented 6 years ago

@lemo2012 Yes, you should convert all of them to BPE format before training.

See this script for how to convert data to the format and generate the vocab file.
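For readers without the script at hand, here is a rough Python sketch of what "BPE format" means for the data files. It is not the repository's script. It assumes the text has already been segmented by a BPE tool that marks non-final subword pieces with a trailing `@@` (the subword-nmt convention), and that the vocab file starts with the special tokens `<unk>`, `<s>`, `</s>`. The helper names `build_vocab` and `merge_bpe` are made up for illustration.

```python
import re
from collections import Counter

# Assumption: the vocab file is expected to begin with these special tokens.
SPECIAL_TOKENS = ["<unk>", "<s>", "</s>"]


def build_vocab(segmented_lines):
    """Collect subword tokens from already BPE-segmented lines.

    Tokens are ordered by descending frequency, with ties broken
    alphabetically, and the special tokens are placed first.
    """
    counts = Counter(tok for line in segmented_lines for tok in line.split())
    ordered = sorted(counts, key=lambda tok: (-counts[tok], tok))
    return SPECIAL_TOKENS + ordered


def merge_bpe(line):
    """Undo BPE segmentation by removing the '@@' continuation markers."""
    return re.sub(r"(@@ )|(@@ ?$)", "", line)


if __name__ == "__main__":
    # Hypothetical segmented corpus; real data would come from a BPE tool.
    corpus = ["low@@ er low@@ est", "new@@ er"]
    for token in build_vocab(corpus):
        print(token)
    print(merge_bpe("low@@ er low@@ est"))
```

The point is that train, dev, and test files should all contain the same kind of subword tokens, and the vocab file should list those subword tokens rather than whole words. Otherwise most tokens in the data would map to `<unk>`.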