Liuhong99/Sophia
The official implementation of “Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training”
MIT License
931 stars, 52 forks
updated to Jan 2024 experiments #47
Closed
Liuhong99 closed this 7 months ago