Status: Closed (liiicon closed this issue 9 months ago)
optimizer = Adan(pg0, lr=1e-3, betas=(0.98, 0.92, 0.99), eps=1.0e-08, weight_decay=0.02)
Black is SGD, pink is Adan.
What lr are you using for SGD? Also, you could try setting beta3 to 0.999, and increasing both the lr and the number of warmup steps.
The lr for SGD is 0.01. I will try your suggestions. Thank you very much for your help!
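For anyone landing here later, a minimal sketch of what the suggestion (beta3 = 0.999, a larger lr, more warmup steps) might look like. The thread does not say how much to raise the lr or how many warmup steps to use, so the lr here is left at the original value and `warmup_lr` is a hypothetical helper, not part of Adan. Linear warmup is also an assumption; any warmup schedule would do.

```python
# Keyword arguments from the original post, with only beta3 changed to 0.999
# as suggested. lr is left at the original 1e-3 because the thread gives no
# concrete larger value; raise it as part of your own tuning.
adan_kwargs = dict(
    lr=1e-3,
    betas=(0.98, 0.92, 0.999),
    eps=1e-8,
    weight_decay=0.02,
)


def warmup_lr(step, base_lr, warmup_steps):
    """Hypothetical helper: linear warmup to base_lr, then constant.

    step is the 0-based optimizer step; warmup_steps is how many steps
    the ramp lasts (a value you choose, not given in this thread).
    """
    if warmup_steps <= 0:
        return base_lr
    return base_lr * min(1.0, (step + 1) / warmup_steps)
```

In a PyTorch training loop, you would build the optimizer as `Adan(pg0, **adan_kwargs)` and, before each step, write `warmup_lr(step, adan_kwargs["lr"], warmup_steps)` into `group["lr"]` for every entry in `optimizer.param_groups`.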