Open kayuksel opened 3 years ago
It would be cool to use AGC from the following paper with any optimizer in this package: https://arxiv.org/abs/2102.06171
Here is an implementation for Adaptive Gradient Clipping: https://github.com/vballoli/nfnets-pytorch/blob/main/nfnets/agc.py
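For reference, the clipping rule itself is small. Below is a minimal pure-Python sketch, not the linked implementation and not part of this package: the names `unit_norm` and `agc_clip` are hypothetical, and weights and gradients are represented as plain lists of rows (one row per output unit) rather than tensors. It rescales a unit's gradient whenever the ratio of its norm to the unit's weight norm exceeds a threshold `clipping`, with `eps` keeping zero-initialized weights from being clipped to zero.

```python
import math


def unit_norm(row):
    """Euclidean norm of one unit (one row of a weight matrix)."""
    return math.sqrt(sum(x * x for x in row))


def agc_clip(weights, grads, clipping=0.01, eps=1e-3):
    """Return unit-wise clipped copies of `grads` (hypothetical helper).

    weights, grads: lists of rows with matching shapes.
    clipping: maximum allowed ratio of gradient norm to weight norm.
    eps: lower bound on the weight norm used in the ratio.
    """
    clipped = []
    for w_row, g_row in zip(weights, grads):
        w_norm = max(unit_norm(w_row), eps)
        g_norm = unit_norm(g_row)
        max_norm = clipping * w_norm
        if g_norm > max_norm:
            # Rescale so the gradient norm equals clipping * weight norm.
            scale = max_norm / g_norm
            clipped.append([g * scale for g in g_row])
        else:
            # Gradient is already small relative to the weights.
            clipped.append(list(g_row))
    return clipped


weights = [[3.0, 4.0], [0.0, 0.0]]
grads = [[0.3, 0.4], [0.0, 0.0]]
clipped = agc_clip(weights, grads, clipping=0.01)
```

In a training loop, a tensor version of this would presumably run on each parameter's `.grad` after `loss.backward()` and before `optimizer.step()`, which is what would make it usable with any optimizer.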
Would you like to create a PR?