PKU-YuanGroup / MoE-LLaVA

Mixture-of-Experts for Large Vision-Language Models
https://arxiv.org/abs/2401.15947
Apache License 2.0

[Question] Inconsistency on MoE Layer Number in paper and model config #80

Open QAQdev opened 4 months ago

QAQdev commented 4 months ago

Question

[screenshot of the paper excerpt on MoE layer placement]

The paper says StableLM has 32 hidden layers and that half of them are used as MoE layers, which would be 16. But when I checked the model you open-sourced, the config says it has only 24 layers, so by that logic there should be 12 MoE layers.

[screenshot of the released model config showing 24 hidden layers]
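For reference, a minimal sketch of how the released config's layer count could be checked with Hugging Face `transformers`; the checkpoint ID and the `num_hidden_layers` field name below are assumptions on my part, not taken from this issue:

```python
from transformers import AutoConfig

# Hypothetical checkpoint ID for the StableLM-based MoE-LLaVA release.
config = AutoConfig.from_pretrained(
    "LanguageBind/MoE-LLaVA-StableLM-1.6B-4e",
    trust_remote_code=True,
)

num_layers = config.num_hidden_layers   # the released config appears to report 24
expected_moe_layers = num_layers // 2   # "half the layers are MoE" -> 12, not the 16 implied by 32 layers
print(num_layers, expected_moe_layers)
```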

I wonder if there is a slight error in the paper. Thanks in advance for answering.