bigscience-workshop / Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2

How to run generation? #314

Closed mayank31398 closed 2 years ago

mayank31398 commented 2 years ago

How can I run generation for the models?

Hugging Face: I understand that I can run generation through Hugging Face. Does HF support the 175B BLOOM model, and does it need 8 GPUs for tensor parallelism?

DeepSpeed: Can I run generation directly from a DeepSpeed checkpoint, or do I need to convert it to a vanilla Megatron/transformers checkpoint before running generation?
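For the Hugging Face route, a minimal sketch of what generation could look like, assuming the checkpoint has already been converted to the HF `transformers` format (the model name, dtype, and prompt below are illustrative, not confirmed by this thread):

```python
# Hedged sketch: generation for a BLOOM checkpoint via Hugging Face transformers.
# Assumes a converted HF checkpoint; names/parameters here are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


def run_generation(model, tokenizer, prompt, max_new_tokens=50):
    """Greedy generation for any HF causal LM, regardless of model size."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    # device_map="auto" lets accelerate shard the weights across all visible
    # GPUs (and CPU, if needed), so no explicit tensor-parallel launcher is
    # required. For the full 176B model this still needs several large GPUs.
    name = "bigscience/bloom"  # or a smaller variant for testing
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(
        name, device_map="auto", torch_dtype=torch.bfloat16
    )
    print(run_generation(model, tokenizer, "DeepSpeed is"))
```

Note this is plain pipeline-free HF loading, not the Megatron-DeepSpeed inference path; whether a DeepSpeed checkpoint can be used without conversion is exactly the open question above.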

mayank31398 commented 2 years ago

#308 is the issue tracking this. Can't wait for the bloom-inference branch to get merged into main

😄