PaddlePaddle / Paddle

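PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (core framework of 『飞桨』 (PaddlePaddle): high-performance single-machine and distributed training, and cross-platform deployment, for deep learning and machine learning)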
http://www.paddlepaddle.org/
Apache License 2.0

Implement AdaScale SGD #27088

Closed: guru4elephant closed this issue 3 years ago

guru4elephant commented 4 years ago

ICML 2020 paper AdaScale SGD: A User-Friendly Algorithm for Distributed Training

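In distributed training, synchronous training across multiple machines and multiple GPUs usually produces a large total batch size, and as the number of nodes changes, final convergence suffers unless the learning rate is tuned carefully. AdaScale SGD, proposed at ICML 2020, addresses this problem well. Paddle should give users an out-of-the-box configuration for it.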
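To make the request concrete, here is a minimal sketch of the core quantity in AdaScale as I understand it from the paper: a per-step gain ratio r = (σ² + μ²) / (σ²/S + μ²), estimated from the S per-worker gradients, which then multiplies the base learning rate. The function name `adascale_gain` and the plain-list gradient format are hypothetical and only for illustration. This is not a Paddle API, and it omits the moving-average smoothing and the scale-invariant iteration counting that the paper also uses.

```python
def adascale_gain(worker_grads):
    """Estimate the AdaScale gain ratio from per-worker gradients.

    worker_grads: list of S gradients, each a flat list of floats
    (hypothetical format, chosen to keep the sketch dependency-free).
    Without smoothing, the estimate reduces to
        mean_i ||g_i||^2 / ||mean_i g_i||^2,
    clamped to the range [1, S].
    """
    num_workers = len(worker_grads)
    if num_workers == 0:
        raise ValueError("need at least one worker gradient")
    dim = len(worker_grads[0])

    # Average gradient across workers (what all-reduce would produce).
    mean_grad = [sum(g[j] for g in worker_grads) / num_workers for j in range(dim)]

    # Mean of squared norms estimates sigma^2 + mu^2.
    mean_sq_norm = sum(sum(x * x for x in g) for g in worker_grads) / num_workers
    # Squared norm of the mean estimates sigma^2 / S + mu^2.
    sq_norm_of_mean = sum(x * x for x in mean_grad)

    if sq_norm_of_mean == 0.0:
        # Pure noise gives the maximum gain; all-zero gradients give no gain.
        return float(num_workers) if mean_sq_norm > 0.0 else 1.0

    gain = mean_sq_norm / sq_norm_of_mean
    return min(max(gain, 1.0), float(num_workers))


base_lr = 0.1  # learning rate tuned for a single worker (assumed value)
grads = [[1.0, 0.0], [0.0, 1.0]]  # two workers with orthogonal gradients
scaled_lr = adascale_gain(grads) * base_lr
```

When all workers agree, the gain is 1 and the learning rate is unchanged. When their gradients are dominated by noise, the gain approaches the number of workers, which matches linear learning-rate scaling.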

danleifeng commented 4 years ago

Received. We will evaluate this. Thank you very much!

JZ-LIANG commented 4 years ago

Received. We will schedule this for development.