Liuhong99 / Sophia

The official implementation of “Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training”
MIT License
931 stars, 52 forks

updated to Jan 2024 experiments #47

Liuhong99 closed 7 months ago