FlagOpen / FlagScale

FlagScale is a large model toolkit based on open-sourced projects.

Add async save_checkpoint impl #152

Closed Caozhou1995 closed 2 months ago

Caozhou1995 commented 2 months ago

This PR adds FlagScale's Flash Checkpoint feature. [TBD]
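
For readers unfamiliar with the idea, here is a minimal sketch of what an asynchronous checkpoint save generally looks like. It is not the code from this PR. The class name `AsyncCheckpointer` and its methods are hypothetical, and it uses only the Python standard library rather than any FlagScale, Megatron, or DLRover API. The assumption is that the training state can be copied in memory cheaply, so only the copy blocks the training loop while the disk write runs in a background thread.

```python
import copy
import os
import pickle
import tempfile
import threading


class AsyncCheckpointer:
    """Hypothetical helper: snapshot state synchronously, persist it in the background."""

    def __init__(self):
        self._thread = None

    def save(self, state, path):
        # Wait for any previous write so at most one save is in flight.
        self.wait()
        # Blocking part: copy the state so the caller can keep mutating the original.
        snapshot = copy.deepcopy(state)
        self._thread = threading.Thread(target=self._write, args=(snapshot, path))
        self._thread.start()

    @staticmethod
    def _write(snapshot, path):
        # Write to a temp file in the same directory, then rename it into place,
        # so a crash never leaves a half-written checkpoint at `path`.
        directory = os.path.dirname(os.path.abspath(path))
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "wb") as f:
            pickle.dump(snapshot, f)
        os.replace(tmp_path, path)

    def wait(self):
        # Block until the in-flight write (if any) has finished.
        if self._thread is not None:
            self._thread.join()
            self._thread = None
```

In a training loop one would call `save(state, path)` at each checkpoint interval and `wait()` once before exit. Real implementations typically copy GPU tensors to host memory instead of using `deepcopy`, and coordinate the write across ranks.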

Caozhou1995 commented 2 months ago

Closing this PR: going forward we will focus on Megatron's distributed checkpoint (dist ckpt) functionality and build our optimizations on top of it. In the meantime, we use DLRover as a backup; see https://github.com/FlagOpen/FlagScale/pull/155 for details.