deepseek-ai / DeepSeek-MoE

DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

How is expert parallelism configured? Is there configuration code? #38

Open ninglonglong opened 2 months ago
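The repository itself does not ship an expert-parallel training config, so the following is only a minimal sketch of how expert parallelism is commonly set up with DeepSpeed's `deepspeed.moe.layer.MoE` wrapper, which exposes an `ep_size` argument. The `hidden_size`, `num_experts`, `ep_size`, and `k` values below are illustrative assumptions, not DeepSeek-MoE's actual settings.

```python
# Sketch: expert parallelism via DeepSpeed's MoE layer (assumption -- this is
# generic DeepSpeed usage, not code from the DeepSeek-MoE repo).
# Must be launched under a distributed launcher, e.g.:
#   deepspeed --num_gpus=8 this_script.py
import torch
import deepspeed
from deepspeed.moe.layer import MoE

# Expert-parallel process groups are created during MoE construction,
# so the default process group has to exist first.
deepspeed.init_distributed()

hidden_size = 2048  # illustrative model dimension

# A single expert is an ordinary feed-forward block; DeepSpeed instantiates
# num_experts copies of it and shards them across the expert-parallel group.
expert = torch.nn.Sequential(
    torch.nn.Linear(hidden_size, 4 * hidden_size),
    torch.nn.GELU(),
    torch.nn.Linear(4 * hidden_size, hidden_size),
)

# ep_size is the expert-parallel degree: with 64 experts and ep_size=8,
# each of the 8 GPUs in an expert-parallel group holds 64 / 8 = 8 experts,
# and tokens are routed between GPUs with all-to-all communication.
moe_layer = MoE(
    hidden_size=hidden_size,
    expert=expert,
    num_experts=64,
    ep_size=8,
    k=2,  # top-2 token routing
)
```

Under this scheme, `ep_size` must evenly divide both `num_experts` and the world size; any GPUs beyond one expert-parallel group replicate the experts and act as data-parallel replicas.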