microsoft / tutel

Tutel MoE: An Optimized Mixture-of-Experts Implementation

Can tutel support Pipeline Parallel? #233

Closed · xcwanAndy closed this issue 3 months ago

xcwanAndy commented 3 months ago

Hi all. After walking through the examples, I gather that tutel currently supports data/tensor parallelism for its MoE layer module. Is that correct? If so, what should I do if I want the entire model's training to support pipeline parallelism?
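
For context, my current usage looks roughly like the sketch below (adapted from tutel's README-style examples; the dimensions are placeholders). Each rank holds its own shard of experts, which is why the expert parameters are flagged to skip the data-parallel all-reduce:

```python
import torch
import torch.nn.functional as F
from tutel import moe as tutel_moe

# Placeholder dimensions -- adjust for the real model.
model_dim, hidden_size, num_local_experts = 1024, 4096, 2

# One tutel MoE layer: each rank keeps `num_local_experts` experts locally,
# and tutel routes tokens between ranks with an all-to-all.
moe = tutel_moe.moe_layer(
    gate_type={'type': 'top', 'k': 2},
    model_dim=model_dim,
    experts={
        'type': 'ffn',
        'count_per_node': num_local_experts,
        'hidden_size_per_expert': hidden_size,
        'activation_fn': lambda x: F.relu(x),
    },
    # Expert weights are sharded across ranks rather than replicated,
    # so they are marked to be excluded from the data-parallel all-reduce.
    scan_expert_func=lambda name, param: setattr(param, 'skip_allreduce', True),
)

x = torch.randn(8, 512, model_dim)  # (batch, seq_len, model_dim)
y = moe(x)                          # output has the same shape as x
```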

Alternatively, can tutel be used together with Megatron or DeepSpeed? If so, I could configure hybrid parallelism following those frameworks' configuration conventions; a rough sketch of what I have in mind follows.
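
Here is an untested sketch of that idea: since tutel's MoE layer is an ordinary `torch.nn.Module`, it could in principle be placed inside a DeepSpeed pipeline stage. `TutelMoEBlock` is my own hypothetical wrapper, not part of either library, and the layer split is purely illustrative:

```python
import torch.nn as nn
from deepspeed.pipe import PipelineModule, LayerSpec
from tutel import moe as tutel_moe

model_dim = 1024

class TutelMoEBlock(nn.Module):
    # Hypothetical wrapper so LayerSpec can construct the MoE layer
    # lazily, on whichever ranks end up owning this pipeline stage.
    def __init__(self, model_dim):
        super().__init__()
        self.moe = tutel_moe.moe_layer(
            gate_type={'type': 'top', 'k': 2},
            model_dim=model_dim,
            experts={'type': 'ffn', 'count_per_node': 2,
                     'hidden_size_per_expert': 4096},
            # NOTE (assumption): under pipeline parallelism, the MoE
            # all-to-all would likely need to be restricted to this
            # stage's ranks via moe_layer's `group=` argument.
        )

    def forward(self, x):
        return self.moe(x)

# Illustrative 4-layer model cut into 2 pipeline stages by DeepSpeed.
# Requires deepspeed.init_distributed() before construction.
pipe_model = PipelineModule(
    layers=[
        LayerSpec(nn.Linear, model_dim, model_dim),
        LayerSpec(TutelMoEBlock, model_dim),
        LayerSpec(nn.Linear, model_dim, model_dim),
        LayerSpec(TutelMoEBlock, model_dim),
    ],
    num_stages=2,
)
```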

xcwanAndy commented 3 months ago

I believe this answer is exactly what I needed. I'll close this issue now.