EvolvingLMMs-Lab / lmms-eval

Accelerating the development of large multimodal models (LMMs) with lmms-eval
https://lmms-lab.github.io/

[MultiGPU Evaluation Error]: model on different device #79

Closed hxhcreate closed 1 month ago

hxhcreate commented 1 month ago
accelerate launch --num_machines 1 --mixed_precision no --dynamo_backend no --num_processes=4 \
    -m lmms_eval --model llava --model_args pretrained=liuhaotian/llava-v1.5-7b --tasks mme --batch_size 1 --log_samples --log_samples_suffix llava_v1.5_7b_mme --output_path ./logs/

I encountered the following errors:

05-10 21:54:46 [lmms_eval/models/llava.py:407] ERROR Error Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1! (when checking argument for argument weight in method wrapper_CUDA__cudnn_convolution) in generating
Model Responding: 2%|▏ | 47/2374 [00:04<02:54, 13.37it/s]
05-10 21:54:46 [lmms_eval/models/llava.py:407] ERROR Error Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:3! (when checking argument for argument weight in method wrapper_CUDA__cudnn_convolution) in generating

kcz358 commented 1 month ago

This is a duplicate of #67. Set device_map="" and refer to that issue for details.
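
For anyone landing here later, a minimal sketch of what the adjusted command might look like. It assumes device_map is passed as one more comma-separated key in --model_args, next to pretrained; the thread itself does not spell out where the setting goes, so check #67 before relying on this. The MODEL_ARGS variable and the leading echo (a dry run) are only for illustration.

```shell
# Assumption: device_map is accepted as a key inside --model_args,
# alongside pretrained. This is not confirmed in the thread itself.
# The shell strips the empty quotes, so the value arrives as an empty string.
MODEL_ARGS=pretrained=liuhaotian/llava-v1.5-7b,device_map=""

# Dry run: print the command for inspection.
# Remove the leading "echo" to actually launch the evaluation.
echo accelerate launch --num_machines 1 --mixed_precision no --dynamo_backend no --num_processes=4 \
    -m lmms_eval --model llava --model_args "$MODEL_ARGS" \
    --tasks mme --batch_size 1 --log_samples --log_samples_suffix llava_v1.5_7b_mme --output_path ./logs/
```

Everything other than the extra device_map key is the same as the command in the original post.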

hxhcreate commented 1 month ago

Thanks, that works for me. I'll close this issue.