The README lists Calibrating the Adaptive Learning Rate to Improve Convergence of ADAM by Tong, Liang, and Bi (2019) as the source paper accompanying the `Ranger`, `RangerQH`, and `RangerVA` codes. However, this paper seems to describe only the addition of softplus to `Adam` (SAdam) and `AMSGrad` (SAMSGrad), implemented here: https://github.com/neilliang90/Sadam; it makes no mention of the LookAhead or RAdam techniques. Therefore it makes sense to credit `RangerVA` to this paper, but `Ranger` and `RangerQH` should not use this reference.

The original `Ranger` optimizer is a combination of the LookAhead, Rectified Adam, and Gradient Centralization papers, and is described in a blog post. `RangerQH` uses quasi-hyperbolic momentum, introduced by Ma and Yarats (2018), on top of the regular `Ranger` optimizer, so I believe that paper should be the reference.
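For concreteness, here is a minimal, self-contained sketch of the two ingredients referred to above: the softplus calibration of the Adam denominator from Tong, Liang, and Bi (2019), which is the piece relevant to `RangerVA`, and the quasi-hyperbolic momentum rule from Ma and Yarats (2018), which is what `RangerQH` layers on top of `Ranger`. This is not the repository's code; the function names, `beta_softplus`, `nu`, and the default values are my own illustration of how I understand the two papers.

```python
import math


def softplus_denominator(v_hat, beta_softplus=50.0, threshold=20.0):
    """SAdam/SAMSGrad-style calibration (Tong, Liang & Bi, 2019):
    replace Adam's `sqrt(v_hat) + eps` denominator with a softplus of
    sqrt(v_hat), which smoothly lower-bounds the denominator.
    Reverts to the linear function above the threshold for stability.
    """
    x = math.sqrt(v_hat)
    if beta_softplus * x > threshold:
        return x
    return math.log1p(math.exp(beta_softplus * x)) / beta_softplus


def qh_momentum_step(param, grad, momentum_buf, lr=0.1, beta=0.9, nu=0.7):
    """One quasi-hyperbolic momentum (QHM) step (Ma & Yarats, 2018):
    the update direction is a weighted average of the raw gradient and
    the momentum buffer; nu=0 gives plain SGD, nu=1 gives (normalized)
    momentum SGD.
    """
    momentum_buf = beta * momentum_buf + (1.0 - beta) * grad
    update = (1.0 - nu) * grad + nu * momentum_buf
    return param - lr * update, momentum_buf


if __name__ == "__main__":
    # Toy 1-D example: minimize f(x) = x^2, whose gradient is 2x.
    x, buf = 5.0, 0.0
    for _ in range(50):
        x, buf = qh_momentum_step(x, 2.0 * x, buf)
    print(f"x after 50 QHM steps: {x:.4f}")
    print(f"softplus-calibrated denominator for v_hat=1e-8: "
          f"{softplus_denominator(1e-8):.4f}")
```

The point of the sketch is only to show that the 2019 paper's contribution is the denominator calibration (first function), which is orthogonal to LookAhead, RAdam, and the QHM combination (second function); the actual hyperparameter defaults in the repository may differ.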
I would propose the following references: