hunkim / word-rnn-tensorflow

Multi-layer Recurrent Neural Networks (LSTM, RNN) for word-level language models in Python using TensorFlow.
MIT License

Add an --input_encoding argument to train.py #56

Closed by luser 7 years ago

luser commented 7 years ago

I was trying to use these scripts with a dataset I had downloaded that was encoded in UTF-8. Python 3 opens text files with the system's default character encoding, which on en-US Windows is CP1252, so reading the input failed. This patch adds an --input_encoding argument to train.py, so I can simply run train.py --input_encoding=utf8.
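For readers who have not seen the patch, here is a minimal sketch of how such an option might be wired up. It is not the actual diff: the helper names `build_parser` and `read_text`, the help wording, and the use of `io.open` are assumptions made for illustration.

```python
import argparse
import io


def build_parser():
    # Hypothetical parser; the real train.py defines many more options.
    parser = argparse.ArgumentParser()
    parser.add_argument('--input_encoding', type=str, default=None,
                        help='character encoding of the input file, e.g. utf8; '
                             'if omitted, the platform default is used')
    return parser


def read_text(path, encoding=None):
    # Hypothetical helper. io.open accepts an explicit encoding; passing
    # None falls back to the locale's preferred encoding (CP1252 on an
    # en-US Windows install).
    with io.open(path, 'r', encoding=encoding) as f:
        return f.read()


if __name__ == '__main__':
    args = build_parser().parse_args()
    print('input encoding: {}'.format(args.input_encoding))
```

Passing the parsed value straight through to the file-opening call keeps the default behaviour unchanged for anyone who does not supply the flag.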

I also fixed the help text for --gpu_mem: it contained bare percent signs, which broke train.py --help.
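The percent-sign problem comes from argparse applying %-style formatting to help strings (so that placeholders such as `%(default)s` work), which means a literal percent sign has to be written as `%%`. The sketch below illustrates the escaped form; the help wording and the default value are hypothetical, not the text used in train.py.

```python
import argparse

parser = argparse.ArgumentParser(prog='train.py')
# Hypothetical help text and default. A bare '%' here would be treated as
# the start of a format specifier when the help is rendered, so the
# literal sign is escaped as '%%'.
parser.add_argument('--gpu_mem', type=float, default=0.5,
                    help='fraction of GPU memory to use, e.g. 0.5 means 50%% '
                         '(default: %(default)s)')

# format_help() renders the same text that --help would print.
help_text = parser.format_help()
print(help_text)
```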

Thanks for publishing this repository, it's really helpful!

hunkim commented 7 years ago

Looks good. Thanks!