jettify / pytorch-optimizer

torch-optimizer -- collection of optimizers for PyTorch
Apache License 2.0
3.02k stars 297 forks

Include the Sophia Optimizer #501

Open bratao opened 1 year ago

bratao commented 1 year ago

This new paper, "Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training", proposes an optimizer that the authors claim can speed up LLM training by up to 2x. Could it be added to this collection?

https://arxiv.org/abs/2305.14342