PKU-YuanGroup / MoE-LLaVA

Mixture-of-Experts for Large Vision-Language Models
https://arxiv.org/abs/2401.15947
Apache License 2.0
1.9k stars, 121 forks

[Discussion] Is there a relationship between experts that share the same index in different layers? If not, what is the role of Figures 4, 5 and 6 in the paper? #83

Status: Open (meteorlium opened 3 months ago)

meteorlium commented 3 months ago

Discussion

No response