Your code in the block works for me. It also prints a message showing that two processes are being used, since my machine has two GPUs. I'm not sure how you are passing num_processes, but you can try:
accelerate launch --num_processes 2 -m lm_eval \
--model hf \
--model_args pretrained=mistralai/Mistral-7B-v0.3 \
--tasks winogrande \
--num_fewshot 0 \
--batch_size 16
Or explicitly specify the GPUs with:
export CUDA_VISIBLE_DEVICES=0,1
Oh, thank you for the kind and quick response. It only works when I set num_processes to 2. I'm not sure why, but it seems to be an issue with the environment I'm using. On another server, I can set num_processes to 8 and it utilizes all 8 GPUs just fine.
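If the GPU count differs between servers, one option (a sketch, assuming nvidia-smi is available on the machine) is to derive --num_processes from the reported GPU count instead of hard-coding it:
# Count the GPUs reported by the driver (note: this ignores CUDA_VISIBLE_DEVICES).
NUM_GPUS=$(nvidia-smi --list-gpus | wc -l)
accelerate launch --num_processes "$NUM_GPUS" -m lm_eval \
--model hf \
--model_args pretrained=mistralai/Mistral-7B-v0.3 \
--tasks winogrande \
--num_fewshot 0 \
--batch_size 16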
I tried using accelerate to speed up evaluation for more efficient testing, but it does not work. I thought my script was wrong, so I tried adding 'parallelize=True' or 'num_processes=9' and using other models, but it stops running after building contexts, as shown below. I couldn't find any related issue, so I think I'm doing something wrong. Can anybody help me?
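The two shapes I tried look roughly like this (a simplified sketch, reusing the Mistral-7B-v0.3 / winogrande arguments from earlier in this thread):
# Data-parallel run: accelerate starts num_processes evaluation workers (normally one per GPU).
accelerate launch --num_processes 9 -m lm_eval \
--model hf \
--model_args pretrained=mistralai/Mistral-7B-v0.3 \
--tasks winogrande \
--batch_size 16
# Model-sharding run: parallelize=True splits a single copy of the model
# across the visible GPUs inside one process, without accelerate launch.
lm_eval --model hf \
--model_args pretrained=mistralai/Mistral-7B-v0.3,parallelize=True \
--tasks winogrande \
--batch_size 16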