Open harshakokel opened 2 months ago
Hi! What version of vLLM are you running with?
@baberabb has observed problems like this before with later versions of vllm (>v0.3.3, I believe).
I am on vllm 0.3.2.
Is this a vllm problem? Should I be raising an issue on that repo?
Hey. Have you tried caching the weights by running with DP=1 until they are downloaded? I found it prone to hang with DP otherwise.
Yes, the weights are cached. The process is hanging after llm.generate returns results.
hmm. It's working for me with 0.3.2. Have you tried running on a fresh virtual environment?
Just tried it on a separate server with a new env; still facing the same issue. What version of ray do you have? Mine is ray==2.10.0.
Probably the latest one. I installed it with pip install -e ".[vllm]" on runpod with 4 GPUs.
Hello,
I noticed that my process hangs at results = ray.get(object_refs) when I use data_parallel_size as well as tensor_parallel_size for vllm models. For example, this call would hang.
These would not.
Does anyone else face a similar problem?
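One way to narrow down a hang like this is to fetch results with a timeout instead of blocking indefinitely; with real Ray, ray.get accepts a timeout argument and raises a timeout exception when a worker never returns. The sketch below is a minimal stdlib analogue of the same gather pattern (the worker function and names are hypothetical; it assumes nothing about vLLM or Ray):

```python
import concurrent.futures as cf
import time

def worker(rank: int) -> str:
    # Hypothetical stand-in for one data-parallel worker's generate call.
    time.sleep(0.05)
    return f"result-from-rank-{rank}"

def gather_with_timeout(futures, timeout=5.0):
    # Analogue of results = ray.get(object_refs): collect every worker's
    # result, but raise instead of blocking forever if one never returns,
    # and report which worker index is stuck.
    results = []
    for i, fut in enumerate(futures):
        try:
            results.append(fut.result(timeout=timeout))
        except cf.TimeoutError:
            raise RuntimeError(f"worker {i} did not finish within {timeout}s")
    return results

with cf.ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(worker, rank) for rank in range(2)]
    results = gather_with_timeout(futures)
print(results)  # ['result-from-rank-0', 'result-from-rank-1']
```

Applying the same idea at the real hang site (a timeout on the blocking get) would at least turn a silent hang into an actionable error naming the stuck worker.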