Open jxchen01 opened 8 years ago
I really appreciate this nice RNN implementation. I came across a question while trying to understand the training procedure.

In the function feval(x), the variable grad_params is initialized to zeros. After that, the only operation applied to it is grad_params:clamp(). How is this variable updated during training?

Thanks!

When :backward() is called on the network (e.g. on the LSTM modules), grad_params is updated, because the gradient tensors inside those modules point to sub-arrays of grad_params. This aliasing was set up on line 178:

params, grad_params = model_utils.combine_all_parameters(protos.rnn)
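The aliasing at the heart of the answer can be sketched with a NumPy analogy. This is only an illustration of the flattening idea, not the actual Torch combine_all_parameters code: a single flat gradient vector is allocated, and each "module" holds a view into it, so in-place gradient writes by the modules show up in the flat vector.

```python
import numpy as np

# One flat gradient vector, analogous to grad_params after
# combine_all_parameters has flattened all module gradients.
grad_params = np.zeros(6)

# Each "module" gets a view into the flat vector, not a copy
# (hypothetical layout: a 2x2 weight and a 2-element bias).
module_a_grad = grad_params[0:4].reshape(2, 2)
module_b_grad = grad_params[4:6]

# Simulate what :backward() does: write gradients into the
# per-module tensors in place.
module_a_grad += 1.0
module_b_grad += 2.0

# Because the views alias grad_params, the flat vector changed too,
# even though grad_params itself was never assigned to directly.
print(grad_params)  # -> [1. 1. 1. 1. 2. 2.]
```

This is why feval only needs to zero grad_params once per call and clamp it at the end: every :backward() call in between fills it in through the module views.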