Liuhong99 / Sophia

The official implementation of “Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training”
MIT License

Issue 37: When running without DDP, get_batch didn't work because it requires ddp… #41

Closed attesaarela closed 1 year ago

attesaarela commented 1 year ago

Quick fix for issue 37:

Running train_sophiag.py without DDP failed because get_batch uses ddp_rank even when running locally, where that variable is never set. With this fix, ddp_rank is set to zero when running without DDP.
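The idea behind the fix can be sketched roughly as follows. This is a minimal illustration, not the actual diff: `resolve_ddp_rank` is a hypothetical helper, and it assumes the script detects DDP from the `RANK` environment variable that launchers such as torchrun export.

```python
import os


def resolve_ddp_rank(env=None):
    """Return (ddp, ddp_rank). Hypothetical helper, not taken from the repository."""
    env = os.environ if env is None else env
    # Assumption: a missing RANK variable means a plain single-process run.
    ddp = int(env.get("RANK", -1)) != -1
    # Fall back to rank 0 locally so code that reads ddp_rank
    # (such as get_batch) still has a defined value.
    ddp_rank = int(env["RANK"]) if ddp else 0
    return ddp, ddp_rank


ddp, ddp_rank = resolve_ddp_rank()
```

With a default like this, a local run behaves as a single process with rank 0, so any rank-based indexing in get_batch can stay unchanged.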