bigcode-project / Megatron-LM

Ongoing research training transformer models at scale

fix distributed optimizer #44

Closed lvwerra closed 1 year ago

lvwerra commented 1 year ago

Fixes an issue that occurs when the distributed optimizer is used together with args.load.