arcee-ai / mergekit

Tools for merging pretrained large language models.
GNU Lesser General Public License v3.0

Support for fine-grained experts in MoE models #363

Open misdelivery opened 4 months ago

misdelivery commented 4 months ago

Are there any plans to support fine-grained experts in the future?

Fine-grained expert segmentation is a technique adopted in models such as Qwen MoE and DeepSeek MoE, and it has shown promising results. The approach partitions a single FFN into several segments, each of which becomes an expert, allowing a larger number of experts without increasing the overall parameter count (see the sketch below).
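
To make the idea concrete, here is a minimal sketch (not mergekit code) of how a dense SwiGLU-style FFN could be sliced along its intermediate dimension into several smaller expert FFNs. The weight names (`gate_proj`, `up_proj`, `down_proj`) follow common Llama-style conventions and are assumptions for illustration only.

```python
# Illustrative sketch: split one FFN's intermediate dimension into N equal
# segments so that each segment forms a smaller "fine-grained" expert.
# The total parameter count across all segments equals the original FFN.
import torch


def split_ffn_into_experts(
    gate_proj: torch.Tensor,  # shape [intermediate, hidden]
    up_proj: torch.Tensor,    # shape [intermediate, hidden]
    down_proj: torch.Tensor,  # shape [hidden, intermediate]
    num_segments: int,
):
    """Partition a single FFN into `num_segments` smaller expert FFNs."""
    intermediate = gate_proj.shape[0]
    assert intermediate % num_segments == 0, "intermediate size must divide evenly"
    seg = intermediate // num_segments

    experts = []
    for i in range(num_segments):
        sl = slice(i * seg, (i + 1) * seg)
        experts.append(
            {
                "gate_proj": gate_proj[sl, :],  # rows of the gate projection
                "up_proj": up_proj[sl, :],      # matching rows of the up projection
                "down_proj": down_proj[:, sl],  # matching columns of the down projection
            }
        )
    return experts


# Example: hidden=4096, intermediate=11008 split into 4 experts of size 2752 each.
gate = torch.randn(11008, 4096)
up = torch.randn(11008, 4096)
down = torch.randn(4096, 11008)
experts = split_ffn_into_experts(gate, up, down, num_segments=4)
print(len(experts), experts[0]["gate_proj"].shape)
```

A router would then select among these smaller experts per token; since each expert is a slice of the original FFN, the expert count grows without growing the total parameter count.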