Choco31415 / Attention_Network_With_Keras

An example attention network with a simple dataset.

h, _, c = at_LSTM(context, initial_state=[h, c]) #5

Open z595054650 opened 4 years ago

z595054650 commented 4 years ago

Why not feed the output of the previous time step, together with the context, as the input to the next time step?

Choco31415 commented 3 years ago

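That would be technically correct too, but it is more idiomatic to keep the previous output separate from the true input, which is the context. In this call the previous output `h` already reaches the LSTM through `initial_state=[h, c]`.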
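To make the difference concrete, here is a minimal sketch of both variants. It is not the repository's code: it uses a hand-written LSTM step on plain Python lists instead of Keras, and every name in it (`lstm_step`, `init_params`, `decode`, `feed_prev_output`) is hypothetical. The weights are random, so it only illustrates how the data flows, not trained behaviour.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(row, vec):
    return sum(w * v for w, v in zip(row, vec))

def lstm_step(x, h_prev, c_prev, params):
    """One step of a standard LSTM cell.

    params maps each gate ("i", "f", "g", "o") to (W, U, b):
    W multiplies the input x, U multiplies the previous output h_prev.
    """
    pre = {
        gate: [dot(W[j], x) + dot(U[j], h_prev) + b[j] for j in range(len(b))]
        for gate, (W, U, b) in params.items()
    }
    i = [sigmoid(v) for v in pre["i"]]    # input gate
    f = [sigmoid(v) for v in pre["f"]]    # forget gate
    g = [math.tanh(v) for v in pre["g"]]  # candidate cell values
    o = [sigmoid(v) for v in pre["o"]]    # output gate
    c = [f[j] * c_prev[j] + i[j] * g[j] for j in range(len(c_prev))]
    h = [o[j] * math.tanh(c[j]) for j in range(len(c))]
    return h, c

def init_params(input_dim, units, rng):
    """Random weights for illustration only (assumption: no training)."""
    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    return {gate: (mat(units, input_dim), mat(units, units), [0.0] * units)
            for gate in "ifgo"}

def decode(contexts, units, params, feed_prev_output=False):
    """Run one LSTM step per context vector, carrying (h, c) between steps."""
    h, c = [0.0] * units, [0.0] * units
    outputs = []
    for context in contexts:
        # Variant from the question: append the previous output to the input.
        # Default variant: the input is the context alone, and the previous
        # output enters only through the recurrent state (the U matrices).
        x = context + h if feed_prev_output else context
        h, c = lstm_step(x, h, c, params)
        outputs.append(h)
    return outputs

rng = random.Random(0)
context_dim, units = 3, 2
contexts = [[rng.uniform(-1, 1) for _ in range(context_dim)] for _ in range(4)]

# Input = context only, as in the call from the issue title.
separate = decode(contexts, units, init_params(context_dim, units, rng))

# Input = context + previous output; the input dimension grows by `units`.
combined = decode(contexts, units,
                  init_params(context_dim + units, units, rng),
                  feed_prev_output=True)
```

In both variants the previous output influences the next step. Concatenating it to the input only adds a second path (and extra input weights) for information the recurrent state already carries.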

Does this answer your question? c: