coomt closed this issue 7 years ago
Hi, this program is actually designed to process any text (which is one advantage of Char RNN). I have used it on some Chinese text before and the results were pretty fun :)
You just need to specify the encoding of the text using the --encoding argument. This is also noted in the Readme.
Note: train.py assumes the data file uses utf-8 encoding by default. Use --encoding=your-encoding to specify the encoding if your data file cannot be decoded as utf-8.
Chinese text is usually encoded as utf-8 or gb2312.
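If you are not sure which of the two your file uses, a quick check like the one below may help. This is only a sketch and not part of the repository: `guess_encoding` and the candidate list are names I made up for illustration, and it simply tries each codec in order using Python's standard `bytes.decode`.

```python
import sys


def guess_encoding(data, candidates=("utf-8", "gb2312")):
    """Return the first candidate codec that decodes `data` without error.

    Hypothetical helper for illustration; returns None if no candidate works.
    """
    for name in candidates:
        try:
            data.decode(name)  # strict decoding raises on invalid bytes
            return name
        except UnicodeDecodeError:
            continue
    return None


if __name__ == "__main__":
    # Usage: python guess_encoding.py path/to/data.txt
    with open(sys.argv[1], "rb") as f:  # read raw bytes, no decoding yet
        raw = f.read()
    print(guess_encoding(raw))
```

Whatever it prints is the value to pass to --encoding. Note that utf-8 is tried first on purpose: plain ASCII text decodes under both codecs, and utf-8 is already the default for train.py. A successful decode is not proof that the guess is right, so it is worth opening the file with that encoding and looking at a few lines before training.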
It seems that this program is designed for processing English text, but I have some Chinese text to train on. How can I modify it?