hunkim / word-rnn-tensorflow

Multi-layer Recurrent Neural Networks (LSTM, RNN) for word-level language models in Python using TensorFlow.
MIT License

rnn_decoder initial_state #21

Open bwang482 opened 8 years ago

bwang482 commented 8 years ago

Thanks so much for your code hunkim! It is very helpful!

Can I ask a quick question, please? Am I right in thinking that, within one batch, every time you feed model.initial_state: state to the model, it overrides self.initial_state = cell.zero_state(args.batch_size, tf.float32)? Since this zero_state initialisation happens in the model's __init__(), I am not sure whether it can actually be overridden.
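
For concreteness, here is a minimal TF 1.x sketch (hypothetical shapes, not the repo's code) of the behaviour in question: feed_dict can override the value of any feedable tensor, not only placeholders, which is why feeding model.initial_state replaces the zero_state default.

```python
import numpy as np
import tensorflow as tf

# Minimal sketch (TF 1.x, hypothetical shapes): feed_dict can override
# any feedable tensor, not just tf.placeholder tensors.
zero_init = tf.zeros([2, 3])   # stands in for cell.zero_state(...)
doubled = zero_init * 2.0

with tf.Session() as sess:
    # Not fed: zero_init evaluates to its graph-defined value (zeros).
    print(sess.run(doubled))                               # all zeros
    # Fed: the supplied array replaces zero_init for this run() call only,
    # analogous to feeding model.initial_state: state during training.
    ones = np.ones([2, 3], dtype=np.float32)
    print(sess.run(doubled, feed_dict={zero_init: ones}))  # all twos
```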

Thanks!

sunxiaobiu commented 8 years ago

The value of initial_state is updated after processing each batch of words:

outputs, last_state = seq2seq.rnn_decoder(inputs, self.initial_state, cell, loop_function=loop if infer else None, scope='rnnlm')

The zero_state in the model's __init__() initializes the memory state; last_state holds the state of the memory after each batch of words.
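
To illustrate the wiring, here is a rough sketch of the relevant part of the model (TF 1.x; the shapes are made up, and the import path for rnn_decoder varies across TF 1.x versions):

```python
import tensorflow as tf
from tensorflow.contrib import legacy_seq2seq as seq2seq

# Sketch of the model wiring (names loosely follow model.py, shapes
# are hypothetical). zero_state only defines the DEFAULT value of
# initial_state; whatever is fed for it at run time takes precedence.
batch_size, rnn_size, seq_length = 2, 4, 3
cell = tf.contrib.rnn.BasicLSTMCell(rnn_size)
initial_state = cell.zero_state(batch_size, tf.float32)

# One [batch_size x rnn_size] input per time step, as rnn_decoder expects.
inputs = [tf.placeholder(tf.float32, [batch_size, rnn_size])
          for _ in range(seq_length)]
outputs, last_state = seq2seq.rnn_decoder(inputs, initial_state, cell,
                                          scope='rnnlm')
# last_state is fetched after each batch and fed back in as initial_state.
```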

Songweiping commented 7 years ago

Hi @sunxiaobiu @hunkim, I am confused by initial_state too. After each batch, you feed feed[c] = state[i].c and feed[h] = state[i].h to the model. Does this mean the last state of the previous batch is used as the initial state of the current batch? In

outputs, last_state = seq2seq.rnn_decoder(inputs, self.initial_state, cell, loop_function=loop if infer else None, scope='rnnlm')

self.initial_state would no longer be the zero state after the first batch, am I right? Thanks!
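
For reference, the training loop follows this pattern (a sketch assuming sess, model, args, and data_loader are set up as in train.py; saving and logging omitted):

```python
# Rough sketch of the loop in train.py (variable names assumed from
# the repo's code).
for e in range(args.num_epochs):
    # Reset to zeros at the start of each epoch...
    state = sess.run(model.initial_state)
    for b in range(data_loader.num_batches):
        x, y = data_loader.next_batch()
        feed = {model.input_data: x, model.targets: y}
        # ...then carry the previous batch's final state forward:
        # feeding these tensors overrides the zero_state default.
        for i, (c, h) in enumerate(model.initial_state):
            feed[c] = state[i].c
            feed[h] = state[i].h
        train_loss, state, _ = sess.run(
            [model.cost, model.final_state, model.train_op], feed)
```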