PKU-YuanGroup / MoE-LLaVA

Mixture-of-Experts for Large Vision-Language Models
https://arxiv.org/abs/2401.15947
Apache License 2.0
1.9k stars, 121 forks

[Question] How did you use 768x768 resolution? #53

Open lucasjinreal opened 6 months ago

lucasjinreal commented 6 months ago

Question

Can you share some code showing how to change the input resolution to 768x768, starting from a base ViT encoder whose native resolution might be 384?
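For context on what such a change usually involves: one common way to run a ViT at a higher resolution than it was pretrained on is to resize its learned position embeddings to the larger patch grid. The sketch below illustrates that general technique in plain Python. It is not confirmed to be what MoE-LLaVA does, and the patch size of 16 and the helper names `num_patches_per_side` and `resize_pos_embed` are assumptions made for illustration only.

```python
from typing import List

# A square grid of position embeddings: grid[row][col] is one embedding vector.
Grid = List[List[List[float]]]


def num_patches_per_side(image_size: int, patch_size: int) -> int:
    """Number of patches along one side of a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    return image_size // patch_size


def resize_pos_embed(grid: Grid, new_size: int) -> Grid:
    """Bilinearly resize a square grid of position embeddings.

    Corner embeddings are preserved (the 'align corners' convention);
    interior embeddings are weighted averages of their four neighbours.
    """
    old_size = len(grid)
    dim = len(grid[0][0])
    # Map each new index onto a fractional position in the old grid.
    scale = (old_size - 1) / (new_size - 1) if new_size > 1 else 0.0
    out: Grid = []
    for i in range(new_size):
        y = i * scale
        y0 = min(int(y), old_size - 1)
        y1 = min(y0 + 1, old_size - 1)
        wy = y - y0
        row = []
        for j in range(new_size):
            x = j * scale
            x0 = min(int(x), old_size - 1)
            x1 = min(x0 + 1, old_size - 1)
            wx = x - x0
            row.append([
                (1 - wy) * ((1 - wx) * grid[y0][x0][d] + wx * grid[y0][x1][d])
                + wy * ((1 - wx) * grid[y1][x0][d] + wx * grid[y1][x1][d])
                for d in range(dim)
            ])
        out.append(row)
    return out


# Assumed patch size of 16: 384 px gives a 24x24 grid, 768 px gives 48x48.
old_side = num_patches_per_side(384, 16)
new_side = num_patches_per_side(768, 16)

# Toy embeddings (dimension 2) standing in for pretrained weights.
toy = [[[float(r), float(c)] for c in range(old_side)] for r in range(old_side)]
resized = resize_pos_embed(toy, new_side)
print(len(resized), len(resized[0]), len(resized[0][0]))
```

In a PyTorch codebase the same step is typically done with `torch.nn.functional.interpolate` on the reshaped embedding tensor, with any class-token embedding kept aside and re-attached afterwards. The image processor's target size also has to change to match, and the model usually needs further training at the new resolution.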