karpathy / char-rnn

Multi-layer Recurrent Neural Networks (LSTM, GRU, RNN) for character-level language models in Torch

Question About train.lua #148

Open jxchen01 opened 8 years ago

jxchen01 commented 8 years ago

I really appreciate the nice implementation of RNN. I came across a question while trying to understand the training procedure.

In the function feval(x), the variable "grad_params" is initialized to zeros. After that, the only operation on it is grad_params:clamp(). How does this variable get updated during training?

Thanks!

MTSranger commented 8 years ago

When :backward() is called on the network (e.g. on those LSTM modules), grad_params is updated, because the gradient tensors inside the modules point into sub-arrays of grad_params. This sharing is set up on line 178:

params, grad_params = model_utils.combine_all_parameters(protos.rnn)
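
A minimal sketch of the same storage-sharing idea, using a plain nn.Linear and the standard :getParameters() call instead of char-rnn's model_utils.combine_all_parameters (the module, sizes, and criterion here are illustrative, not taken from train.lua):

```lua
require 'nn'

local net = nn.Linear(3, 2)
-- getParameters() flattens all weights and gradients into two tensors that
-- *share storage* with net's internal weight/gradWeight tensors, analogous
-- to what combine_all_parameters does across the cloned RNN modules.
local params, grad_params = net:getParameters()

grad_params:zero()
print(grad_params:sum())   -- 0, nothing has touched the gradients yet

local input  = torch.randn(3)
local target = torch.randn(2)
local criterion = nn.MSECriterion()

local output = net:forward(input)
local loss   = criterion:forward(output, target)
-- backward() writes into net.gradWeight / net.gradBias, which are views
-- into grad_params, so grad_params changes without being assigned directly.
net:backward(input, criterion:backward(output, target))
print(grad_params:norm())  -- now non-zero
```

So in feval, zeroing grad_params and then calling :backward() through the unrolled clones is enough for the gradients to accumulate into grad_params; clamp() is only applied afterwards to limit their magnitude.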