qfgaohao / pytorch-ssd

MobileNetV1, MobileNetV2, and VGG based SSD/SSD-lite implementation in PyTorch 1.0 / PyTorch 0.4. Out-of-the-box support for retraining on the Open Images dataset. ONNX and Caffe2 support. Experimental ideas like CoordConv.
https://medium.com/@smallfishbigsea/understand-ssd-and-implement-your-own-caa3232cd6ad
MIT License
1.39k stars, 530 forks

Could you add Nesterov momentum to SGD? #190

Open georgekasa opened 1 year ago

georgekasa commented 1 year ago

Hello, I saw that you are using SGD with momentum (default 0.9). Could you add an option to enable Nesterov momentum?

Lines 375-376, with `nesterov=args.nesterov` added as the last argument:

`optimizer = torch.optim.SGD(params, lr=args.lr, momentum=args.momentum, weight_decay=args.weight_decay, nesterov=args.nesterov)`
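For `args.nesterov` to exist, the training script would also need a matching command-line flag. Here is a minimal sketch of one way to wire it up, assuming the script parses its options with `argparse`. The `--nesterov` flag, the helper names `build_parser` and `sgd_kwargs`, and the default values for `--lr` and `--weight_decay` are my own placeholders, not taken from the repository. The check mirrors the fact that `torch.optim.SGD` raises a `ValueError` when `nesterov=True` is combined with zero momentum.

```python
import argparse


def build_parser():
    # Illustrative subset of options; defaults other than momentum are placeholders.
    parser = argparse.ArgumentParser(description="SGD options (sketch)")
    parser.add_argument("--lr", type=float, default=1e-3)
    parser.add_argument("--momentum", type=float, default=0.9)
    parser.add_argument("--weight_decay", type=float, default=5e-4)
    # Hypothetical flag: off by default, so existing runs behave as before.
    parser.add_argument("--nesterov", action="store_true",
                        help="use Nesterov momentum in SGD")
    return parser


def sgd_kwargs(args):
    # torch.optim.SGD rejects nesterov=True unless momentum > 0 (and dampening == 0),
    # so fail early with a message that names the command-line flags.
    if args.nesterov and args.momentum <= 0:
        raise ValueError("--nesterov requires --momentum > 0")
    return dict(lr=args.lr, momentum=args.momentum,
                weight_decay=args.weight_decay, nesterov=args.nesterov)


args = build_parser().parse_args(["--nesterov"])
kwargs = sgd_kwargs(args)
# With PyTorch installed, the optimizer would then be built as:
#   optimizer = torch.optim.SGD(params, **kwargs)
```

Using `action="store_true"` keeps the default at `False`, so anyone who does not pass the flag gets the same optimizer as today.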

As Karpathy says in CS231n: "Nesterov Momentum is a slightly different version of the momentum update that has recently been gaining popularity. It enjoys stronger theoretical convergence guarantees for convex functions and in practice it also consistently works slightly better than standard momentum."
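To make the difference concrete, here is a small pure-Python sketch of a single parameter update with and without the Nesterov correction. It follows the update rule as described in the PyTorch `SGD` documentation (with dampening 0 and no weight decay), which is formulated slightly differently from the CS231n notes. The function name `sgd_step` and the toy objective are my own choices for illustration.

```python
def sgd_step(p, grad, buf, lr, mu, nesterov=False):
    """One scalar SGD-with-momentum step (dampening = 0, no weight decay).

    p: parameter, grad: gradient at p, buf: momentum buffer from the last step.
    Returns the updated (p, buf).
    """
    buf = mu * buf + grad          # accumulate velocity
    if nesterov:
        update = grad + mu * buf   # look-ahead: gradient plus scaled velocity
    else:
        update = buf               # classical momentum: velocity only
    return p - lr * update, buf


def run(steps, nesterov, p=1.0, lr=0.1, mu=0.9):
    # Toy objective f(p) = 0.5 * p**2, whose gradient is simply p.
    buf = 0.0
    for _ in range(steps):
        p, buf = sgd_step(p, grad=p, buf=buf, lr=lr, mu=mu, nesterov=nesterov)
    return p


print("standard:", run(20, nesterov=False))
print("nesterov:", run(20, nesterov=True))
```

The only difference between the two variants is the `update` line, which is why enabling it in the optimizer call is a one-argument change.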

Thank you in advance.