Open mratsim opened 6 years ago
@mratsim I'm curious about adding momentum to SGD (largely to avoid doing any actual work in my own Nim projects, ha). Would you want to do it the same way as PyTorch/TensorFlow? Both libraries provide a single "SGD" optimizer with a momentum parameter: momentum > 0 applies momentum, and momentum == 0 falls back to plain SGD. They also take a nesterov boolean, where true applies Nesterov momentum instead of classical momentum. Or do you envision a different implementation?
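For reference, here is a minimal sketch of what that single-optimizer approach could look like in plain Nim (not Arraymancer's tensor types or actual API; the `SgdState` / `step` names and `seq[float]` parameters are just placeholders for illustration): momentum == 0 gives plain SGD, momentum > 0 adds the velocity term, and nesterov = true switches to the look-ahead update.

```nim
# Hypothetical sketch only: a single SGD optimizer with optional classical or
# Nesterov momentum, following the PyTorch-style update rule.
type SgdState = object
  lr, momentum: float
  nesterov: bool
  velocity: seq[float]   # one entry per parameter, initialised to zero

proc step(s: var SgdState, params: var seq[float], grads: seq[float]) =
  for i in 0 ..< params.len:
    if s.momentum == 0.0:
      # plain SGD: p <- p - lr * g
      params[i] -= s.lr * grads[i]
    else:
      # v <- momentum * v + g
      s.velocity[i] = s.momentum * s.velocity[i] + grads[i]
      if s.nesterov:
        # Nesterov: look ahead along the updated velocity
        params[i] -= s.lr * (grads[i] + s.momentum * s.velocity[i])
      else:
        params[i] -= s.lr * s.velocity[i]

when isMainModule:
  var st = SgdState(lr: 0.1, momentum: 0.9, nesterov: false,
                    velocity: @[0.0, 0.0])
  var params = @[1.0, -2.0]
  let grads = @[0.5, -0.25]
  st.step(params, grads)
  echo params
```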
Currently only stochastic gradient descent is supported; at the very minimum it would be nice to support: