haotian-liu / LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
https://llava.hliu.cc
Apache License 2.0

[Question] is model parallelism supported for training? #811

fredshentu commented 9 months ago

Question

Say I have a cluster with 8 GPUs but only 12 GB of VRAM each; can I still train LLaVA? It seems that DeepSpeed supports various kinds of model parallelism (tensor parallelism, pipeline parallelism, etc.). Is that supported in LLaVA?

anonymous-atom commented 9 months ago

Yes, they already use DeepSpeed in the training scripts. Check out scripts/v1_5/pretrain.sh.
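
For reference, here is a minimal sketch of what that launch looks like. The model name, data paths, and output directory below are illustrative placeholders rather than a copy of the repo's script; the essential parts are the deepspeed launcher and the --deepspeed flag pointing at a ZeRO config (the repo ships configs such as scripts/zero2.json and scripts/zero3.json):

```bash
#!/bin/bash
# Minimal sketch of a DeepSpeed launch for LLaVA pretraining.
# Model name, data paths, and output dir are illustrative placeholders;
# see scripts/v1_5/pretrain.sh in the repo for the actual values.
# --deepspeed selects the ZeRO config that controls memory sharding.
deepspeed llava/train/train_mem.py \
    --deepspeed ./scripts/zero2.json \
    --model_name_or_path lmsys/vicuna-13b-v1.5 \
    --version plain \
    --data_path /path/to/pretrain_data.json \
    --image_folder /path/to/images \
    --vision_tower openai/clip-vit-large-patch14-336 \
    --output_dir ./checkpoints/llava-pretrain
```

One caveat: the shipped configs use DeepSpeed ZeRO stages, which shard optimizer states (and, at stage 3, the parameters themselves) across data-parallel ranks; that is memory sharding rather than tensor or pipeline parallelism. For 12 GB cards, a ZeRO-3 config (possibly with CPU offload) is the relevant option, but whether a given model size fits at your batch size is something to verify empirically.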