paperswithlove / papers-we-read


CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts #32

Open · runhani opened this issue 4 months ago

runhani commented 4 months ago

MoE has brought large performance gains to LLMs, so how should MoE be applied to LMMs (large multimodal models)?
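CuMo's answer is to "co-upcycle" the pre-trained dense MLP blocks in both the CLIP vision encoder and the vision-language MLP connector into sparse Top-K MoE blocks, initializing every expert from the dense weights before further training. Below is a minimal PyTorch sketch of that sparse-upcycling step; the class and parameter names (`UpcycledMoE`, `num_experts`, `top_k`) are illustrative assumptions, not CuMo's actual code.

```python
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F


class UpcycledMoE(nn.Module):
    """Sparse Top-K MoE block whose experts are all initialized from one
    pre-trained dense MLP ("upcycling"). Illustrative sketch only."""

    def __init__(self, dense_mlp: nn.Module, num_experts: int = 4,
                 top_k: int = 2, hidden_dim: int = 768):
        super().__init__()
        # Each expert starts as an exact copy of the dense MLP weights,
        # so the upcycled block initially behaves like the dense model.
        self.experts = nn.ModuleList(
            copy.deepcopy(dense_mlp) for _ in range(num_experts)
        )
        self.router = nn.Linear(hidden_dim, num_experts)
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, hidden_dim)
        logits = self.router(x)                         # (T, num_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)  # route each token
        weights = F.softmax(weights, dim=-1)            # renormalize top-k
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k:k + 1] * expert(x[mask])
        return out


# Usage: upcycle a dense vision-to-LLM MLP connector into an MoE connector.
dense_connector = nn.Sequential(nn.Linear(768, 768), nn.GELU(),
                                nn.Linear(768, 768))
moe_connector = UpcycledMoE(dense_connector, num_experts=4, top_k=2)
tokens = torch.randn(16, 768)
print(moe_connector(tokens).shape)  # torch.Size([16, 768])
```

Because every expert is cloned from the same dense MLP, the MoE block reproduces the dense model's outputs at initialization (up to the router's mixing), which is the usual motivation for upcycling: the subsequent MoE fine-tuning starts from a strong pre-trained point instead of random experts.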


Model Zoo

The CuMo model weights are open-sourced on Hugging Face:

| Model | Base LLM | Vision Encoder | MLP Connector | Download |
| --- | --- | --- | --- | --- |
| CuMo-7B | Mistral-7B-Instruct-v0.2 | CLIP-MoE | MLP-MoE | HF ckpt |
| CuMo-8x7B | Mixtral-8x7B-Instruct-v0.1 | CLIP-MoE | MLP-MoE | HF ckpt |