jettify / pytorch-optimizer

torch-optimizer -- collection of optimizers for Pytorch
Apache License 2.0

Wrong paper references for Ranger optimizer variants #244

Closed. jwuphysics closed this issue 3 years ago.

jwuphysics commented 3 years ago

The README lists Calibrating the Adaptive Learning Rate to Improve Convergence of ADAM by Tong, Liang, and Bi (2019) as the source paper for the Ranger, RangerQH, and RangerVA implementations. However, that paper appears to describe only the addition of a softplus calibration to Adam (SAdam) and AMSGrad (SAMSGrad), implemented here: https://github.com/neilliang90/Sadam; it makes no mention of the LookAhead or RAdam techniques. It therefore makes sense to credit RangerVA to this paper, but Ranger and RangerQH should not use this reference.
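For reference, the calibration that paper proposes is a softplus applied to the adaptive-learning-rate denominator. A minimal sketch of the idea (the `beta` value below is illustrative only, not a default from the paper or this library):

```python
import torch
import torch.nn.functional as F

# Adam-style denominator:          sqrt(v_t) + eps
# Softplus-calibrated denominator: softplus(sqrt(v_t)) with a sharpness beta,
# which acts as a smooth lower bound instead of a hard eps shift.
v_t = torch.tensor([1e-8, 1e-4, 1e-2, 1.0])  # toy second-moment estimates
beta = 50.0  # illustrative sharpness, not necessarily the paper's or library's default
adam_denom = v_t.sqrt() + 1e-8
sadam_denom = F.softplus(v_t.sqrt(), beta=beta)
print(adam_denom)
print(sadam_denom)
```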

The original Ranger optimizer combines the techniques from the LookAhead, Rectified Adam (RAdam), and Gradient Centralization papers, and is described in a blog post.
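For context, that combination can be approximated with pieces already in this library. A rough sketch, assuming the Lookahead and RAdam classes exported by torch_optimizer (the built-in Ranger class additionally applies gradient centralization and uses its own defaults):

```python
import torch
import torch_optimizer as optim

model = torch.nn.Linear(10, 2)

# Rectified Adam as the inner optimizer, wrapped by Lookahead:
radam = optim.RAdam(model.parameters(), lr=1e-3)
ranger_like = optim.Lookahead(radam, k=6, alpha=0.5)

loss = model(torch.randn(4, 10)).sum()
loss.backward()
ranger_like.step()
```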

RangerQH applies quasi-hyperbolic momentum, introduced by Ma and Yarats (2018), on top of the regular Ranger optimizer, so I believe that paper should be the reference.
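A quick usage sketch of RangerQH from this library; I am assuming the `nus` argument is where the quasi-hyperbolic discount terms from Ma and Yarats (2018) come in (the values below are illustrative, not recommendations):

```python
import torch
import torch_optimizer as optim

model = torch.nn.Linear(10, 2)

# `nus` (if I read the implementation correctly) are the quasi-hyperbolic
# discount factors; k and alpha are the usual Lookahead parameters.
opt = optim.RangerQH(model.parameters(), lr=1e-3, nus=(0.7, 1.0), k=6, alpha=0.5)

loss = model(torch.randn(4, 10)).sum()
loss.backward()
opt.step()
```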

I would propose the following references:

- Ranger: the LookAhead, Rectified Adam, and Gradient Centralization papers, plus the accompanying blog post
- RangerQH: Ma and Yarats (2018), quasi-hyperbolic momentum
- RangerVA: Tong, Liang, and Bi (2019), Calibrating the Adaptive Learning Rate to Improve Convergence of ADAM

jettify commented 3 years ago

Would you like to submit a PR with the proposed changes?