Use PEFT or full-parameter training to fine-tune 350+ LLMs or 100+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL, Phi3.5-Vision, ...)
3.96k stars · 349 forks
load qwen110B model using get_vllm_engine throws error #1081
Closed
phoenixbai closed 3 months ago
Describe the bug
I tried to load the qwen110B model with the code below for batch inference, but it throws an error:
Your hardware and system info
8 A100 GPU cards
Additional context
error.log
The error log from the run is shown below; the detailed log is also attached: