Closed qy1026 closed 1 day ago
When tensor_parallel_size > 1, vLLM uses Ray to run the model; in your case, it takes 2 GPUs. My guess is that when you destroy llm1, the placement group in Ray is not removed, so the 2 GPUs are not released. llm2 then stays pending because Ray cannot get enough resources (2 GPUs) for its placement group.
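One way to check this guess is to ask Ray, from the same process, how many GPUs it still considers free and what state its placement groups are in after llm1 is deleted. The sketch below is only a diagnostic under assumptions: the helper name `ray_gpu_snapshot` is made up for illustration, and it assumes Ray was initialized in this process (which the "Calling ray.init() again" log line suggests).

```python
import importlib


def ray_gpu_snapshot():
    """Return Ray's view of GPUs and placement groups.

    Returns None if Ray is not installed or not initialized in this process.
    (Hypothetical helper name, not part of vLLM or Ray.)
    """
    try:
        ray = importlib.import_module("ray")
    except ImportError:
        return None
    if not ray.is_initialized():
        return None

    from ray.util import placement_group_table

    groups = placement_group_table()  # dict: placement group id -> info dict
    return {
        "total_gpus": ray.cluster_resources().get("GPU", 0),
        "available_gpus": ray.available_resources().get("GPU", 0),
        # A group that is still in the CREATED state keeps its GPUs reserved.
        "placement_groups": {
            pg_id: info.get("state") for pg_id, info in groups.items()
        },
    }


if __name__ == "__main__":
    print(ray_gpu_snapshot())
```

If the available GPU count stays at 0 after `del llm1` while a placement group is still listed as created, that would be consistent with the explanation above.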
Thank you very much for your reply. Do you have any ideas on how to solve this?
The simplest way is to stop the Ray cluster after destroying llm1, so that llm2 starts a fresh Ray cluster.
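A minimal sketch of that suggestion, assuming Ray was started inside the same Python process by vLLM. The helper name `stop_ray_between_engines` and the model placeholders are hypothetical, and nothing in this thread confirms that this releases the GPUs on every vLLM version.

```python
import gc
import importlib


def stop_ray_between_engines():
    """Stop the Ray runtime in this process so the next engine starts a fresh one.

    Returns True if ray.shutdown() was called, False if Ray is not
    installed or was never initialized. (Hypothetical helper name.)
    """
    # Collect unreachable objects first so the deleted engine is really gone.
    gc.collect()
    try:
        ray = importlib.import_module("ray")
    except ImportError:
        return False
    if not ray.is_initialized():
        return False
    ray.shutdown()
    return True


# Intended usage (commented out: needs vLLM, 2 GPUs and real model names):
# from vllm import LLM
# llm1 = LLM(model="<model-1>", tensor_parallel_size=2)
# ...
# del llm1
# stop_ray_between_engines()
# llm2 = LLM(model="<model-2>", tensor_parallel_size=2)
```

Shutting Ray down discards its placement groups along with the rest of the local cluster state, which is why the second engine should no longer wait on GPUs reserved for the first one.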
This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!
This issue has been automatically closed due to inactivity. Please feel free to reopen if you feel it is still relevant. Thank you!
My Python Script
How would you like to use vllm
When tensor_parallel_size=1, the program works well. But when tensor_parallel_size=2, it gets stuck after printing "model deleted !", and the last log line is:
2024-05-21 16:59:38,442 INFO worker.py:1582 -- Calling ray.init() again after it has already been called.