PKU-YuanGroup / MoE-LLaVA

Mixture-of-Experts for Large Vision-Language Models
https://arxiv.org/abs/2401.15947
Apache License 2.0

[Question] Scale down further to support IoT use cases? #50

Open kinchahoy opened 6 months ago

kinchahoy commented 6 months ago

Question

I'm trying to see what can run on an 8GB Raspberry Pi 5, and it occurs to me that your approach might scale down really well. Any tips for replicating what you did with something like TinyLlama, or for trying an 8-bit quantization of LLaVA-Phi? I'd love to try training some sort of student model as an experiment from the more successful models you've trained.

kinchahoy commented 6 months ago

For what it's worth, 4-bit quantizations of LLaVA 1.6 work quite well even in the limited context of a Raspberry Pi. I'll try quantizing MoE-LLaVA soon. Let me know if this is interesting.
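For anyone curious what 4-bit weight quantization actually does to the numbers, here's a minimal, library-free sketch of symmetric round-to-nearest 4-bit quantization (the scheme and the example weights are illustrative; real toolchains like llama.cpp or bitsandbytes use more elaborate group-wise variants):

```python
# Symmetric per-tensor 4-bit quantization: each weight is mapped to one
# of 16 integer levels (-8..7) via a single scale, then dequantized back
# to an approximation of the original value for inference.
def quantize_4bit(weights):
    scale = max(abs(w) for w in weights) / 7  # map max |w| to level 7
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize_4bit(q, scale):
    return [v * scale for v in q]

weights = [0.12, -0.53, 0.97, -0.08, 0.33]
q, scale = quantize_4bit(weights)
deq = dequantize_4bit(q, scale)

# Reconstruction error is bounded by half a quantization step.
assert all(abs(w - d) <= scale / 2 + 1e-9 for w, d in zip(weights, deq))
```

The memory win is what matters on an 8GB Pi: 4 bits per weight instead of 16 cuts model weights to roughly a quarter of their fp16 size (plus a small overhead for the scales).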