Marsan-Ma-zz / tf_chatbot_seq2seq_antilm

Seq2seq chatbot with attention and an anti-language model to suppress generic responses, with an option for further improvement via deep reinforcement learning.

Maintaining state between predictions #14

Open · gidim opened this issue 7 years ago

gidim commented 7 years ago

Hi, any plans on adding state to the encoder/decoder? The idea is that realistically you want to predict P(answer_n | question_n, answer_{n-1}, question_{n-1}, ...), rather than one turn at a time as the original translation model does.
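Spelled out (notation mine, not from the thread or the repo), the contrast is between the turn-by-turn objective of the plain translation model and a dialogue-aware objective that conditions each answer on the full history:

```latex
% Turn-by-turn objective of the original translation model:
P(a_n \mid q_n)
% Dialogue-aware objective described above:
P(a_n \mid q_n, a_{n-1}, q_{n-1}, \ldots, a_1, q_1)
```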

Marsan-Ma-zz commented 7 years ago

That's an interesting idea. How do we make this model remember facts from the previous dialogue? I guess a neural Turing machine might be a good candidate.

gidim commented 7 years ago

There are many ways to maintain some memory of the sequence of inputs, but the easiest is simply to keep the LSTM/GRU state between calls to model.step() instead of resetting it.
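For concreteness, here is a minimal sketch of that pattern in tf.keras, not this repo's seq2seq code: an explicit GRU state is threaded through successive calls instead of being re-initialised, which is the same idea one would apply around model.step(). The helper name `encode_turn`, the sizes, and the token ids are made up for illustration.

```python
# Minimal sketch (not this repo's actual API): keep the recurrent state alive
# across dialogue turns instead of resetting it, so turn n is encoded in the
# context of turns 1..n-1.
import tensorflow as tf

VOCAB_SIZE, HIDDEN = 10000, 256          # illustrative sizes
embed = tf.keras.layers.Embedding(VOCAB_SIZE, HIDDEN)
cell = tf.keras.layers.GRUCell(HIDDEN)

def encode_turn(token_ids, state):
    """Feed one turn's tokens through the GRU, starting from `state`."""
    for emb in tf.unstack(embed(token_ids), axis=0):
        _, state = cell(tf.expand_dims(emb, 0), state)  # carry state forward
    return state

state = tf.zeros([1, HIDDEN])            # fresh state only at conversation start

# Turn 1: after this call, `state` summarises the first exchange.
state = encode_turn(tf.constant([12, 57, 301]), state)

# Turn 2: do NOT reset `state`; feeding it back in makes the encoding of the
# new question depend on the whole dialogue so far, not just the last input.
state = encode_turn(tf.constant([8, 44, 922, 5]), state)
```

The same pattern applies to an LSTM cell, where the carried state is the (h, c) pair rather than a single hidden vector.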