Liuhong99 / Sophia

The official implementation of “Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training”
MIT License
937 stars 54 forks

Fix "NameError: name 'rho' is not defined" #8

Closed by nalzok 1 year ago