Closed: tzjtatata closed this issue 7 months ago
Hi, @tzjtatata
Thank you for your interest in our work.
To evaluate a 34b model, you can use tensor parallelism to achieve this on 2 A100 40G GPUs.
Using llava 1.6 34b as an example, you can pass `--model_args "pretrained=liuhaotian/llava-v1.6-34b,conv_template=mistral_direct,device_map=auto"` and start your job without using `accelerate launch`.
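For reference, a minimal sketch of the full single-process invocation, assuming the `llava` model type, the `mme` task, and an lm-eval-harness-style entry point (the task and output path here are placeholders, not from this thread; adjust to your setup):

```bash
# One process only: device_map=auto shards the 34b weights across
# both A100 40G GPUs (tensor parallel), so no accelerate launch.
python -m lmms_eval \
    --model llava \
    --model_args "pretrained=liuhaotian/llava-v1.6-34b,conv_template=mistral_direct,device_map=auto" \
    --tasks mme \
    --batch_size 1 \
    --output_path ./logs/
```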
Thank you for the careful reply. Can I use accelerate with 4 processes x 2 GPUs each?
@tzjtatata, unfortunately it might not be possible to do so, since currently the number of processes has to be 1 to activate tensor parallelism. We will try to work on features that support batch sizes larger than 1 in the future.
Thank you!
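For contrast, a hedged sketch of the multi-process data-parallel mode that `accelerate launch` does support, again with placeholder model and task names; each process holds a full model replica on its own GPU, which is why this mode cannot be combined with the 2-GPU tensor-parallel sharding above:

```bash
# Data parallel: one process per GPU, each with a full model copy.
# Works for models that fit on a single A100 40G (e.g. a 13b checkpoint);
# a sharded 34b model must instead use the single-process command above.
accelerate launch --num_processes=8 -m lmms_eval \
    --model llava \
    --model_args "pretrained=liuhaotian/llava-v1.5-13b" \
    --tasks mme \
    --batch_size 1
```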
https://github.com/EvolvingLMMs-Lab/lmms-eval/pull/4
Yes, it's addressed in this PR.
Thank you for your great work. I want to evaluate llava-next 34b, but I only have A100 40G GPUs, which is not enough to run inference on a 34b model on a single GPU. Can you give me some advice?