bigscience-workshop / Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2

add OnDevice and remove zero-inference #316

Closed jeffra closed 2 years ago

jeffra commented 2 years ago

For now, this requires https://github.com/microsoft/DeepSpeed/pull/2083 to run, until that PR is merged on the DeepSpeed side.