Oryx-mllm / Oryx

MLLM for On-Demand Spatial-Temporal Understanding at Arbitrary Resolution
https://oryx-mllm.github.io

Error when evaluating with lmms-eval on multi-gpus #13

Closed yuanrr closed 5 days ago

yuanrr commented 6 days ago

Hello, I followed the scripts to evaluate the model on the nextqa dataset using lmms-eval. Following lmms-eval's instructions, I changed the command to run on multiple GPUs, but I get the error: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0! (when checking argument for argument weight in method wrapper_CUDA__cudnn_convolution). Here are the commands:

export LOWRES_RESIZE=384x32
export VIDEO_RESIZE="0x64"
export HIGHRES_BASE="0x32"
export MAXRES=1536
export MINRES=0
export VIDEO_MAXRES=480
export VIDEO_MINRES=288
CUDA_VISIBLE_DEVICES=2,3 python \
-m lmms_eval \
--model oryx \
--model_args pretrained='/model/Oryx-7b/',device_map=auto,max_frames_num=64,mm_resampler_type="dynamic_compressor" \
--tasks nextqa_mc_test \
--batch_size 1 \
--log_samples \
--log_samples_suffix aa \
--output_path /code/Oryx/log/ \
--verbosity=DEBUG

Please help me solve this... thanks a lot

dongyh20 commented 5 days ago

Can you provide more details on the computation resources you are using? If you are using A100 GPUs, you can directly follow our scripts and use accelerate to control the processes for multi-GPU evaluation.
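
For reference, the accelerate-based multi-GPU launch described here would look roughly like the sketch below: one process per visible GPU, with each process holding its own full copy of the model. Dropping device_map=auto is an assumption of this sketch (it only makes sense if the 7B checkpoint fits in a single GPU's memory, as on A100s); paths, environment variables, and model_args are copied from the command above, and this is not the official script.

export LOWRES_RESIZE=384x32
export VIDEO_RESIZE="0x64"
export HIGHRES_BASE="0x32"
export MAXRES=1536
export MINRES=0
export VIDEO_MAXRES=480
export VIDEO_MINRES=288
# One process per GPU; each process keeps a full model copy on its own device,
# so device_map=auto (which shards the model across GPUs) is not used here.
CUDA_VISIBLE_DEVICES=2,3 accelerate launch --num_processes=2 \
-m lmms_eval \
--model oryx \
--model_args pretrained='/model/Oryx-7b/',max_frames_num=64,mm_resampler_type="dynamic_compressor" \
--tasks nextqa_mc_test \
--batch_size 1 \
--log_samples \
--log_samples_suffix aa \
--output_path /code/Oryx/log/ \
--verbosity=DEBUG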

yuanrr commented 5 days ago

Sure, I am using 4090 GPUs. If I follow the scripts directly, it runs out of memory... Could you please share the scripts for multi-GPU evaluation? Am I making a mistake somewhere?

yuanrr commented 5 days ago

I tried the following script, but it still fails with the same error: "Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0! (when checking argument for argument weight in method wrapper_CUDA__cudnn_convolution)"

export LOWRES_RESIZE=384x32
export VIDEO_RESIZE="0x64"
export HIGHRES_BASE="0x32"
export MAXRES=1536
export MINRES=0
export VIDEO_MAXRES=480
export VIDEO_MINRES=288
CUDA_VISIBLE_DEVICES=2,3 accelerate launch --num_processes=1 \
-m lmms_eval \
--model oryx \
--model_args pretrained='/home/mcc_yss/data/yuanbw/model/Oryx-7b/',device_map=auto,max_frames_num=64,mm_resampler_type="dynamic_compressor" \
--tasks nextqa_mc_test \
--batch_size 1 \
--log_samples \
--log_samples_suffix llava_next \
--output_path /home/mcc_yss/data/yuanbw/code/Oryx/log1/ \
--verbosity=DEBUG

dongyh20 commented 5 days ago

I see. For multi-GPU evaluation, you can refer to the evaluation scripts for the 34B model. You may also add "export EVAL_LARGE=1" and see if that solves the problem.
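
In other words, the suggested change amounts to one extra environment variable before the same launch (a sketch only; EVAL_LARGE comes from the 34B evaluation scripts, and what it changes internally depends on the Oryx code):

export EVAL_LARGE=1
# then re-run the same exports and the
# "CUDA_VISIBLE_DEVICES=2,3 accelerate launch --num_processes=1 -m lmms_eval ..." command from the previous comment.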

yuanrr commented 5 days ago

Thank you so much! It works for me : )