How is MoE parallelism implemented?

deepseek-ai / DeepSeek-MoE

DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

MIT License

982 stars 48 forks

Open YunxinLi opened 7 months ago

YunxinLi commented 7 months ago

How is MLP parallelism for the MoE model implemented on top of DeepSpeed?

zwd003 commented 6 months ago

It is implemented with our in-house framework. We have also implemented parallel inference code in vLLM.