sherjilozair / char-rnn-tensorflow

Multi-layer Recurrent Neural Networks (LSTM, RNN) for character-level language models in Python using Tensorflow
MIT License

load_preprocessed error - TypeError: unhashable type: 'dict' #3

Closed ronxin closed 8 years ago

ronxin commented 8 years ago

I got this error when trying to train on a preprocessed input.

The error comes from here: https://github.com/sherjilozair/char-rnn-tensorflow/blob/master/utils.py#L42

After loading, self.chars is a tuple containing (1) a list of chars and (2) a dict mapping each char to an integer. I changed the function as follows just to get it running:

def load_preprocessed(self, vocab_file, tensor_file):
    # Workaround: index into the unpickled object to reach the list of chars.
    with open(vocab_file) as f:
        self.chars = cPickle.load(f)[0][0]
    print self.chars
    self.vocab_size = len(self.chars)
    # Rebuild the char -> index mapping from the recovered list.
    self.vocab = dict(zip(self.chars, range(len(self.chars))))
    self.tensor = np.load(tensor_file)
    self.num_batches = self.tensor.size / (self.batch_size * self.seq_length)

sherjilozair commented 8 years ago

self.chars is just a list of characters, and that's what's being stored in the vocab_file. Check this: https://github.com/sherjilozair/char-rnn-tensorflow/blob/master/utils.py#L34

I'm not sure why you are getting this error. Are you sure you haven't changed something that is causing this problem?

ronxin commented 8 years ago

No, I did not change anything. I will let you know shortly how to replicate the error.

ronxin commented 8 years ago

@sherjilozair I found out why.

Normally, during preprocessing, two vocab.pkl files are created, one in data_dir and the other in save_dir. The contents of the two files are different.

https://github.com/sherjilozair/char-rnn-tensorflow/blob/master/utils.py#L34
https://github.com/sherjilozair/char-rnn-tensorflow/blob/master/train.py#L48

In my experiment I set --data_dir and --save_dir to the same folder, so one vocab.pkl overwrote the other. When I interrupted training from the keyboard and tried to restart it, the contents of the remaining vocab.pkl (generated by train.py) did not match what the loader expects (the file generated by utils.py).
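
The overwrite described above can be reproduced outside the repo with a minimal sketch. It makes several assumptions: it uses Python 3's pickle in place of cPickle, the character values are made up, build_vocab is a hypothetical helper that only mirrors the dict(zip(...)) line in load_preprocessed, and the (chars, vocab) tuple layout for the file written by train.py is inferred from the tuple I saw when loading.

```python
import os
import pickle
import tempfile

# Stand-ins for the real objects: names follow the thread, values are made up.
chars = ("a", "b", "c")
vocab = dict(zip(chars, range(len(chars))))


def build_vocab(loaded):
    """Hypothetical helper: rebuild char -> index the way load_preprocessed does."""
    return dict(zip(loaded, range(len(loaded))))


with tempfile.TemporaryDirectory() as shared_dir:
    # data_dir and save_dir are the same folder, so both writers use this path.
    vocab_file = os.path.join(shared_dir, "vocab.pkl")

    # Step 1: preprocessing stores only the characters.
    with open(vocab_file, "wb") as f:
        pickle.dump(chars, f)

    # Step 2: training stores a (chars, vocab) tuple at the same path,
    # silently replacing the file from step 1.
    with open(vocab_file, "wb") as f:
        pickle.dump((chars, vocab), f)

    # Step 3: on restart, the loader reads the file expecting only characters.
    with open(vocab_file, "rb") as f:
        loaded = pickle.load(f)

# The second element of the loaded tuple is a dict, which cannot be a dict key.
try:
    build_vocab(loaded)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

With separate folders for --data_dir and --save_dir, step 2 writes to a different path and the file from step 1 survives, so the loader gets the plain character sequence it expects.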

sherjilozair commented 8 years ago

Whoops. Quite a nasty bug. I'll fix this later today. Thanks for figuring this out.