allenai / open-instruct


OOM if I set --vllm #163

Closed. YSLIU627 closed this issue 2 days ago.

YSLIU627 commented 1 month ago

Hi, I want to use vLLM during evaluation, but when I set --vllm I get an OOM error. My GPU is an A6000 and the model being evaluated is 7B. I can evaluate the same model on MT-Bench with vLLM without problems. I would appreciate any help.

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 446.00 MiB (GPU 0; 47.53 GiB total capacity; 31.11 GiB already allocated; 3.00 MiB free; 31.19 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
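For context, the traceback itself suggests tuning the PyTorch allocator, and vLLM also exposes a knob for how much GPU memory it pre-allocates for the KV cache. A minimal sketch of both mitigations, assuming the evaluation loads the model through vLLM's `LLM` class (the model path below is a placeholder):

```python
import os

# 1. Follow the hint in the traceback: set the allocator config before any
#    CUDA allocation happens, to reduce fragmentation.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

from vllm import LLM

# 2. Lower the fraction of GPU memory vLLM pre-allocates (default is 0.9),
#    so the 7B weights plus KV cache fit on a ~48 GiB A6000 that other
#    processes may already be sharing.
llm = LLM(
    model="/path/to/your-7b-model",  # placeholder path
    gpu_memory_utilization=0.7,
)
```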

hamishivi commented 1 month ago

Hi, what command are you running? Sometimes when I hit an OOM with vLLM it's because other processes are taking up GPU memory. It's a bit hard to debug these things via a GitHub issue, but if you post an example command I can try to help.
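For reference, a quick way to check whether something else is already holding GPU memory before vLLM starts; this is a generic sketch, not specific to any open-instruct script:

```python
# Check free vs. total memory on GPU 0 before launching evaluation.
import torch

free_bytes, total_bytes = torch.cuda.mem_get_info(0)
print(f"GPU 0: {free_bytes / 1024**3:.1f} GiB free of {total_bytes / 1024**3:.1f} GiB")
# If much less than the full ~48 GiB is free, another process (or a previous
# crashed run) is likely still holding memory; `nvidia-smi` shows the PIDs.
```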