microsoft / DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
https://www.deepspeed.ai/
Apache License 2.0

[REQUEST] An example of Hybrid Parallelism #2906

Open KimmiShi opened 1 year ago

KimmiShi commented 1 year ago

Does DeepSpeed support hybrid parallelism, e.g. data parallelism + pipeline parallelism + tensor parallelism? Could you show me an example of how to use these parallelisms together?

tjruwase commented 1 year ago

@ashe-shi, thanks for your question. Yes, DeepSpeed supports combining those three forms of parallelism, a.k.a. 3D parallelism. Please see the following:

https://www.microsoft.com/en-us/research/blog/deepspeed-extreme-scale-model-training-for-everyone/
https://huggingface.co/blog/bloom-megatron-deepspeed#megatron-deepspeed
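To make the 3D-grid idea from those posts concrete: with 3D parallelism, every GPU rank gets a coordinate on a (data, pipeline, tensor) grid. The sketch below is a stdlib-only illustration of one common layout (tensor-parallel ranks innermost, so adjacent ranks share a TP group, following the Megatron-style convention) — it is not DeepSpeed's API, and the actual axis ordering is configurable in the library.

```python
def rank_to_coords(world_size, dp, pp, tp):
    """Map each global rank to (data, pipe, tensor) grid coordinates.

    Illustrative only: assumes tensor-parallel ranks are innermost,
    then pipeline, then data parallel. DeepSpeed's real topology
    objects allow other orderings.
    """
    assert dp * pp * tp == world_size, "parallel degrees must multiply to world size"
    coords = {}
    for rank in range(world_size):
        t = rank % tp                # tensor-parallel index (innermost)
        p = (rank // tp) % pp        # pipeline stage
        d = rank // (tp * pp)        # data-parallel replica
        coords[rank] = (d, p, t)
    return coords

# 8 GPUs split as dp=2, pp=2, tp=2:
grid = rank_to_coords(8, dp=2, pp=2, tp=2)
for rank, (d, p, t) in grid.items():
    print(f"rank {rank}: data={d} pipe={p} tensor={t}")
```

With this layout, ranks 0 and 1 form one tensor-parallel group, ranks 0 and 2 sit on consecutive pipeline stages, and ranks 0 and 4 are data-parallel replicas of each other.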

jacklanda commented 1 year ago

Does DeepSpeed support 3D parallelism for fine-tuning Hugging Face models (e.g., GPT-J, LLaMA)? Does anyone know a simple way to implement this with DeepSpeed? Thanks!
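For context on the config side: a DeepSpeed JSON config mainly governs the data-parallel/batching dimension; pipeline and tensor parallel degrees are set up in code (e.g. via the Megatron-DeepSpeed scripts linked above). A minimal config sketch with purely illustrative values — here 8 GPUs split as tp=2, pp=2, dp=2, so the global batch must satisfy micro_batch × grad_accum × dp = 4 × 4 × 2 = 32:

```json
{
  "train_batch_size": 32,
  "train_micro_batch_size_per_gpu": 4,
  "gradient_accumulation_steps": 4,
  "fp16": {
    "enabled": true
  }
}
```

DeepSpeed checks this batch-size identity at startup, so the three values must stay consistent with the data-parallel world size whenever the parallel degrees change.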