How to deploy multimodal models across multiple GPUs

modelscope / ms-swift

Use PEFT or Full-parameter to finetune 300+ LLMs or 80+ MLLMs. (Qwen2, GLM4v, Internlm2.5, Yi, Llama3.1, Llava-Video, Internvl2, MiniCPM-V-2.6, Deepseek, Baichuan2, Gemma2, Phi3-Vision, ...)

https://swift.readthedocs.io/zh-cn/latest/Instruction/index.html

Apache License 2.0

3.39k stars, 289 forks

How to deploy multimodal models across multiple GPUs #1236

Closed. AlbertBJ closed this issue 2 weeks ago

AlbertBJ commented 2 months ago

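The documentation on deploying multimodal models only covers single-GPU deployment. If a single GPU does not have enough memory, how can I deploy with tensor parallelism across multiple GPUs?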

AlbertBJ commented 2 months ago

I tested with qwen-vl-chat and made two GPUs visible. The model runs, but memory usage differs between the two GPUs. Is that because of the ViT?
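One way to put numbers on the imbalance is to read per-GPU memory use from nvidia-smi. The following is a minimal sketch, not part of ms-swift; the helper names `parse_memory_used` and `query_memory_used` are hypothetical, and it assumes `nvidia-smi` is on the PATH.

```python
import subprocess


def parse_memory_used(csv_text):
    """Parse 'index, memory.used' CSV rows into {gpu_index: MiB used}."""
    usage = {}
    for line in csv_text.strip().splitlines():
        index, used = (field.strip() for field in line.split(","))
        usage[int(index)] = int(used)
    return usage


def query_memory_used():
    """Ask nvidia-smi for per-GPU memory use (needs an NVIDIA driver)."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=index,memory.used",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_memory_used(result.stdout)

# Usage on a GPU machine (not executed here):
#   for gpu, mib in sorted(query_memory_used().items()):
#       print(f"GPU {gpu}: {mib} MiB used")
```

Running this before and after the model loads shows how much memory each GPU actually received. It does not by itself say whether the ViT is the cause.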

tastelikefeet commented 2 weeks ago

Consider deploying with lmdeploy: --infer_backend lmdeploy
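The suggestion above could be launched roughly as follows. This is a sketch under assumptions: only `--infer_backend lmdeploy` comes from the reply. The `swift deploy` subcommand, `--model_type`, and `--tp` (tensor-parallel degree) are taken to match the ms-swift command-line parameters and should be checked against the installed version, as should lmdeploy support for the chosen model. The helper names `build_deploy_command` and `deploy` are hypothetical.

```python
import os
import subprocess


def build_deploy_command(model_type, gpu_ids):
    """Return (argv, env) for a tensor-parallel `swift deploy` launch."""
    argv = [
        "swift", "deploy",
        "--model_type", model_type,
        "--infer_backend", "lmdeploy",  # flag quoted in the reply
        "--tp", str(len(gpu_ids)),      # assumption: one shard per visible GPU
    ]
    # Restrict the process to the chosen GPUs.
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(g) for g in gpu_ids)
    return argv, env


def deploy(model_type, gpu_ids):
    """Launch the server; blocks until the process exits."""
    argv, env = build_deploy_command(model_type, gpu_ids)
    return subprocess.run(argv, env=env, check=True)

# Usage (not executed here): deploy("qwen-vl-chat", [0, 1])
```

Building the argument list separately from running it makes it easy to print and inspect the exact command before starting the server.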