Which gradient descent method does clstm use: SGD, AdaGrad, NAG, RMSProp, or Adam?
I want to speed up training.
If clstm does not use an adaptive learning rate algorithm, I would also like to ask whether the following method can change the learning rate dynamically, so that I could implement an adaptive learning rate algorithm myself: