Closed by YSLIU627 2 days ago
Hi, what command are you running? Sometimes I find that when I hit OOM with vllm it's due to other processes taking up GPU memory. It's a bit hard to debug these things via a github issue, but if you post an example command I can try to help.
Hi, I want to use vLLM during evaluation, but when I pass --vllm it throws an OOM error. My GPU is an A6000 and the model under evaluation is 7B. I can evaluate my model on mt-benchmark with vLLM without issues. I would appreciate any help.
```
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 446.00 MiB (GPU 0; 47.53 GiB total capacity; 31.11 GiB already allocated; 3.00 MiB free; 31.19 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```
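Not a fix for the underlying memory pressure, but the traceback itself suggests one mitigation: capping the allocator's split size via `PYTORCH_CUDA_ALLOC_CONF` to reduce fragmentation. A minimal sketch (the eval entry point here is hypothetical; the env var must be set before CUDA is initialized):

```python
import os

# Must be set BEFORE torch initializes CUDA, so do it at the very top
# of the evaluation script, ahead of any `import torch`.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

# Hypothetical entry point -- replace with the actual eval command/script:
# import torch
# run_evaluation(model="my-7b-model", use_vllm=True)

print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])
```

Separately, since vLLM preallocates a fraction of GPU memory for its KV cache, lowering `gpu_memory_utilization` (e.g. from the default 0.9 to something smaller) when constructing the vLLM engine often resolves OOMs when other processes, or a second model, already occupy part of the card.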