Closed ronxin closed 8 years ago
self.chars is just a list of characters, and that's what's being stored in the vocab_file. Check this: https://github.com/sherjilozair/char-rnn-tensorflow/blob/master/utils.py#L34
I'm not sure why you are getting this error. Are you sure you haven't changed something that is causing this problem?
No, I did not change anything. I will let you know shortly how to replicate the error.
@sherjilozair I found out why.
Normally, during preprocessing, two vocab.pkl files are created, one in data_dir and the other in save_dir. The two files are different.
https://github.com/sherjilozair/char-rnn-tensorflow/blob/master/utils.py#L34 https://github.com/sherjilozair/char-rnn-tensorflow/blob/master/train.py#L48
In my experiment I set --data_dir and --save_dir to be the same folder, resulting in one vocab.pkl overwriting the other. When I interrupt the training from the keyboard and try to restart it, the content of the remaining vocab.pkl (generated by train.py) does not match what is expected (generated by utils.py).
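For anyone who wants to see the collision in isolation, here is a minimal sketch. It assumes, based on the descriptions in this thread, that preprocessing pickles a plain list of chars and that training pickles a (chars, dict) tuple under the same file name. The helper names below are hypothetical stand-ins, not the repository's actual functions.

```python
import os
import pickle
import tempfile


def write_preprocess_vocab(data_dir, chars):
    # Hypothetical stand-in for the preprocessing step: stores only the char list.
    with open(os.path.join(data_dir, "vocab.pkl"), "wb") as f:
        pickle.dump(chars, f)


def write_training_vocab(save_dir, chars):
    # Hypothetical stand-in for the training step: stores a (chars, mapping) tuple.
    vocab = {c: i for i, c in enumerate(chars)}
    with open(os.path.join(save_dir, "vocab.pkl"), "wb") as f:
        pickle.dump((chars, vocab), f)


def load_vocab(data_dir):
    # Reads back whatever is currently stored in vocab.pkl.
    with open(os.path.join(data_dir, "vocab.pkl"), "rb") as f:
        return pickle.load(f)


def demo():
    chars = ["a", "b", "c"]
    # Using one directory for both roles mimics --data_dir == --save_dir.
    with tempfile.TemporaryDirectory() as shared_dir:
        write_preprocess_vocab(shared_dir, chars)
        before = load_vocab(shared_dir)
        write_training_vocab(shared_dir, chars)  # silently overwrites the first file
        after = load_vocab(shared_dir)
    return before, after


before, after = demo()
print(type(before).__name__, len(before))
print(type(after).__name__, len(after))
```

After the second write, the loader gets a two-element tuple where it expected the char list, which is the mismatch described above. Giving the two files different names, or keeping the two directories separate, avoids the overwrite.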
Whoops. Quite a nasty bug. I'll fix this later today. Thanks for figuring this out.
I got this error when trying to train on a preprocessed input.
The error comes from here: https://github.com/sherjilozair/char-rnn-tensorflow/blob/master/utils.py#L42
self.chars is a tuple of (1) a list of chars and (2) a dict mapping each char to an integer. I changed the function just to get it running: