
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

How is MoE parallelism implemented? #31

Open YunxinLi opened 7 months ago

YunxinLi commented 7 months ago

How is the MLP parallelism of the MoE model implemented on top of DeepSpeed?

zwd003 commented 6 months ago

It is implemented with our in-house framework. We have also implemented parallel-inference code in vLLM.
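
The thread does not show the actual code, but as a rough illustration of what expert parallelism for an MoE FFN layer can look like, here is a minimal PyTorch sketch. It is not DeepSeek's in-house framework nor vLLM's implementation; all names (`Expert`, `ExpertParallelMoE`) are hypothetical. It assumes one expert per rank, top-1 routing, and an NCCL backend: each rank routes its tokens, dispatches them to the rank that owns the selected expert via `all_to_all_single`, runs its local expert, and sends the results back.

```python
# Hypothetical sketch of expert parallelism for an MoE FFN layer.
# Assumptions: one expert per rank (n_experts == world_size), top-1 routing,
# NCCL backend. In real training the router weights would be kept in sync
# across ranks; that is omitted here for brevity.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.distributed as dist


class Expert(nn.Module):
    """A single feed-forward expert (gate/up/down structure simplified)."""
    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.up = nn.Linear(d_model, d_ff)
        self.down = nn.Linear(d_ff, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down(F.silu(self.up(x)))


class ExpertParallelMoE(nn.Module):
    """Toy MoE layer: expert i lives on rank i."""
    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.world = dist.get_world_size()
        self.local_expert = Expert(d_model, d_ff)            # this rank's expert
        self.router = nn.Linear(d_model, self.world, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: [tokens, d_model]; top-1 routing picks the owning rank per token.
        dst = self.router(x).argmax(dim=-1)                  # [tokens]
        order = torch.argsort(dst)                           # group tokens by rank
        inverse = torch.argsort(order)                       # to restore order later
        send_counts = torch.bincount(dst, minlength=self.world)

        # 1) exchange how many tokens each rank will receive
        recv_counts = torch.empty_like(send_counts)
        dist.all_to_all_single(recv_counts, send_counts)

        # 2) dispatch tokens to the ranks that host their experts
        recv = x.new_empty(int(recv_counts.sum().item()), x.shape[-1])
        dist.all_to_all_single(
            recv, x[order].contiguous(),
            output_split_sizes=recv_counts.tolist(),
            input_split_sizes=send_counts.tolist(),
        )

        # 3) run the local expert on every token this rank received
        out = self.local_expert(recv)

        # 4) send the results back to the originating ranks
        back = x.new_empty(x.shape[0], x.shape[-1])
        dist.all_to_all_single(
            back, out,
            output_split_sizes=send_counts.tolist(),
            input_split_sizes=recv_counts.tolist(),
        )
        return back[inverse]                                 # undo the sort


if __name__ == "__main__":
    # e.g. torchrun --nproc_per_node=2 moe_ep_sketch.py (needs 2 GPUs / NCCL)
    dist.init_process_group("nccl")
    torch.cuda.set_device(dist.get_rank())
    moe = ExpertParallelMoE(d_model=16, d_ff=64).cuda()
    tokens = torch.randn(8, 16, device="cuda")
    print(dist.get_rank(), moe(tokens).shape)
    dist.destroy_process_group()
```

Production systems (including vLLM's MoE kernels) fuse and batch this dispatch much more aggressively and support multiple experts per rank with top-k routing; the sketch only shows the basic token-exchange pattern behind expert parallelism.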