Open bratao opened 1 year ago
This new paper, "Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training", proposes a new optimizer that the authors claim can speed up LLM pre-training by up to 2x.
https://arxiv.org/abs/2305.14342