wojciechz / learning_to_execute

Learning to Execute
Apache License 2.0

Back Propagation #12

Closed ekinakyurek closed 8 years ago

ekinakyurek commented 8 years ago

In the article you say: "Our LSTM has two layers and is unrolled for 50 steps in both experiments. It has 400 cells per layer and its parameters are initialized uniformly in [−0.08, 0.08]." I don't understand how you backpropagate and compute the cross-entropy loss over the 50 unrolled steps when, while the input program is still being read in, the network produces no output. Could you help me?
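For what it's worth, the usual way to handle this is to unroll over all steps but mask the loss: cross-entropy is accumulated only at the steps where a target exists (the output phase), while the input-phase steps contribute no loss term. Gradients still flow back through the input phase via the recurrent connections, so those steps are trained too. Here is a minimal NumPy sketch with a tiny vanilla RNN (not the paper's 2-layer, 400-cell LSTM; all sizes, names, and the `T_in` split are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny illustrative sizes (the paper uses 2 layers x 400 cells, 50 steps).
V, H, T = 5, 8, 10          # vocab size, hidden size, unrolled steps
T_in = 6                    # assumed: first T_in steps are the input phase

# Parameters initialized uniformly in [-0.08, 0.08], as in the paper.
Wxh = rng.uniform(-0.08, 0.08, (H, V))
Whh = rng.uniform(-0.08, 0.08, (H, H))
Why = rng.uniform(-0.08, 0.08, (V, H))

x = rng.integers(0, V, T)   # token ids fed at each step
y = rng.integers(0, V, T)   # target ids (only used for t >= T_in)

# ---- forward: unroll the RNN over all T steps ----
hs = {-1: np.zeros(H)}
loss, probs = 0.0, {}
for t in range(T):
    xt = np.zeros(V); xt[x[t]] = 1.0
    hs[t] = np.tanh(Wxh @ xt + Whh @ hs[t - 1])
    if t >= T_in:                        # loss only on output steps
        z = Why @ hs[t]
        p = np.exp(z - z.max()); p /= p.sum()
        probs[t] = p
        loss += -np.log(p[y[t]])

# ---- backward: BPTT through all T steps ----
dWxh, dWhh, dWhy = np.zeros_like(Wxh), np.zeros_like(Whh), np.zeros_like(Why)
dh_next = np.zeros(H)
for t in reversed(range(T)):
    dh = dh_next
    if t >= T_in:                        # gradient from the masked loss
        dz = probs[t].copy(); dz[y[t]] -= 1.0
        dWhy += np.outer(dz, hs[t])
        dh = dh + Why.T @ dz
    dtanh = (1 - hs[t] ** 2) * dh        # back through the tanh
    xt = np.zeros(V); xt[x[t]] = 1.0
    dWxh += np.outer(dtanh, xt)
    dWhh += np.outer(dtanh, hs[t - 1])
    dh_next = Whh.T @ dtanh              # carries gradient into input phase
```

Note that `dWhh` and `dWxh` accumulate contributions even for `t < T_in`, because the gradient flows backwards through the hidden states into the input phase, even though no loss is attached there.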