sherjilozair / char-rnn-tensorflow

Multi-layer Recurrent Neural Networks (LSTM, RNN) for character-level language models in Python using TensorFlow
MIT License

Feeding same `initial_state` to all layers #115

Open vsuarezpaniagua opened 6 years ago

vsuarezpaniagua commented 6 years ago

In the training phase, `self.initial_state` is set to `cell.zero_state`, and the `last_state` of the last layer is kept:

```python
self.initial_state = cell.zero_state(args.batch_size, tf.float32)
outputs, last_state = legacy_seq2seq.rnn_decoder(inputs, self.initial_state, cell,
                                                 loop_function=loop if not training else None,
                                                 scope='rnnlm')
self.final_state = last_state
```
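
For context, `cell` in this repo is a `MultiRNNCell`, so `zero_state` returns a nested tuple with one state per layer rather than a single tensor. A minimal TF 1.x sketch (the layer count and sizes here are illustrative, not the repo's defaults):

```python
import tensorflow as tf

# Two stacked LSTM layers, mirroring the repo's MultiRNNCell setup.
cells = [tf.nn.rnn_cell.LSTMCell(128) for _ in range(2)]
cell = tf.nn.rnn_cell.MultiRNNCell(cells)

init = cell.zero_state(batch_size=32, dtype=tf.float32)
# init is (LSTMStateTuple(c, h), LSTMStateTuple(c, h)):
# one (c, h) pair per layer, each tensor of shape (32, 128).
```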

However, in the testing phase (`def sample()`), it seems that all the layers are fed with just the state of the last layer from the previous step, `self.final_state`:

```python
feed = {self.input_data: x, self.initial_state: state}
[probs, state] = sess.run([self.probs, self.final_state], feed)
```
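
Note that `sess.run` accepts a nested tuple of tensors as a feed key, so if `self.final_state` were the full per-layer tuple, this feed would restore each layer's own state. A minimal sketch of the sampling loop under that assumption (`num_chars` and `next_input_from` are hypothetical names, not from the repo):

```python
# Thread the full nested state tuple through the sampling steps.
state = sess.run(self.initial_state)       # start from the zero state
for _ in range(num_chars):                 # num_chars: hypothetical length
    feed = {self.input_data: x, self.initial_state: state}
    probs, state = sess.run([self.probs, self.final_state], feed)
    x = next_input_from(probs)             # hypothetical sampling helper
```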

If I'm not wrong, I think the state of each layer must be kept and then fed back into its corresponding layer on the following steps, rather than feeding the last one to all the layers.
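
One way to implement that proposal would be to keep the whole tuple of per-layer states and feed each component back explicitly. A minimal sketch, assuming `self.initial_state` and `state` are matching tuples of `LSTMStateTuple(c, h)`, one per layer:

```python
# Feed every layer its own (c, h) from the previous step.
feed = {self.input_data: x}
for layer_init, layer_prev in zip(self.initial_state, state):
    feed[layer_init.c] = layer_prev.c   # this layer's cell state
    feed[layer_init.h] = layer_prev.h   # this layer's hidden state
probs, state = sess.run([self.probs, self.final_state], feed)
```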