Closed by yuanrr 5 days ago
Can you provide more details on the compute resources you are using? If you are using A100 GPUs, you can follow our scripts directly and use accelerate to manage the processes for multi-GPU evaluation.
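One way to collect those details is to query the driver. This is only a sketch: it assumes an NVIDIA driver with `nvidia-smi` on the PATH, and the variable name `gpu_info` is made up for illustration.

```shell
# List index, model name, and memory for each GPU the driver can see.
# Falls back to a short message when nvidia-smi is missing or fails.
if command -v nvidia-smi >/dev/null 2>&1; then
    gpu_info=$(nvidia-smi --query-gpu=index,name,memory.total,memory.used --format=csv) \
        || gpu_info="nvidia-smi failed"
else
    gpu_info="nvidia-smi not found"
fi
echo "$gpu_info"
```

Pasting this output into the issue shows at a glance how much memory each card has, which is what decides whether the model fits on one GPU.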
Sure. I am using 4090 GPUs, and if I follow the scripts as-is, I run out of memory. Could you please share the scripts for multiple GPUs? Am I making a mistake somewhere?
I tried the following script, but it still gives the same error: "Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0! (when checking argument for argument weight in method wrapper_CUDA__cudnn_convolution)"
export LOWRES_RESIZE=384x32
export VIDEO_RESIZE="0x64"
export HIGHRES_BASE="0x32"
export MAXRES=1536
export MINRES=0
export VIDEO_MAXRES=480
export VIDEO_MINRES=288
CUDA_VISIBLE_DEVICES=2,3 accelerate launch --num_processes=1 \
-m lmms_eval \
--model oryx \
--model_args pretrained='/home/mcc_yss/data/yuanbw/model/Oryx-7b/',device_map=auto,max_frames_num=64,mm_resampler_type="dynamic_compressor" \
--tasks nextqa_mc_test \
--batch_size 1 \
--log_samples \
--log_samples_suffix llava_next \
--output_path /home/mcc_yss/data/yuanbw/code/Oryx/log1/ \
--verbosity=DEBUG
I see. For multiple GPUs, you can refer to the evaluation scripts for the 34B model. You may add "export EVAL_LARGE=1" and see whether that solves the problem.
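As a rough sketch, the script posted earlier in the thread with that one line added might look like the following. It assumes every other setting stays unchanged, which the thread does not confirm. What `EVAL_LARGE` does internally is not described here, and `MODEL_PATH` and `OUTPUT_PATH` are placeholder names and paths introduced for illustration.

```shell
# Sketch: the script from earlier in the thread plus the suggested EVAL_LARGE=1.
# The effect of EVAL_LARGE inside the Oryx code is not described in this thread.
export EVAL_LARGE=1
export LOWRES_RESIZE=384x32
export VIDEO_RESIZE="0x64"
export HIGHRES_BASE="0x32"
export MAXRES=1536
export MINRES=0
export VIDEO_MAXRES=480
export VIDEO_MINRES=288
export CUDA_VISIBLE_DEVICES=2,3

# Placeholder paths (assumptions): point these at your checkpoint and log directory.
MODEL_PATH=/path/to/Oryx-7b
OUTPUT_PATH=/path/to/logs

# Dry run: prints the command. Remove the leading "echo" to launch it.
echo accelerate launch --num_processes=1 \
    -m lmms_eval \
    --model oryx \
    --model_args "pretrained=${MODEL_PATH},device_map=auto,max_frames_num=64,mm_resampler_type=dynamic_compressor" \
    --tasks nextqa_mc_test \
    --batch_size 1 \
    --log_samples \
    --log_samples_suffix llava_next \
    --output_path "${OUTPUT_PATH}" \
    --verbosity=DEBUG
```

Printing the command first makes it easy to check the paths and GPU list before committing to a full evaluation run.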
Thank you so much! It works for me : )
Hello, I followed the scripts to evaluate the model with lmms-eval on the nextqa dataset. Following lmms-eval's instructions, I changed the command to run on multiple GPUs, and I get this error:
Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0! (when checking argument for argument weight in method wrapper_CUDA__cudnn_convolution)
And here are the commands:
Please help me solve this. Thanks a lot.