Closed yudataguy closed 1 month ago
Tried the methods from https://github.com/vllm-project/vllm/issues/1908 with no success.
The LLM engine internal to the LLM class should get destroyed when your LLM instance is garbage collected. You could try forcing that with `del llm`.
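A minimal sketch of that pattern. `FakeLLM` is a hypothetical stand-in for `vllm.LLM`, used here only so the reference-dropping behavior can be shown without a GPU; real vLLM allocates GPU memory when the class is constructed:

```python
import gc
import weakref

# Hypothetical stand-in for vllm.LLM; the real class would load model
# weights onto the GPU in __init__.
class FakeLLM:
    def __init__(self, model_name):
        self.model_name = model_name

llm = FakeLLM("model-a")
probe = weakref.ref(llm)  # lets us verify the instance was really freed

# Drop the only strong reference, then force a collection so the engine
# is torn down now rather than whenever the GC next runs.
del llm
gc.collect()
print(probe() is None)  # True: the old instance has been collected

# On a real CUDA setup you would additionally release cached blocks:
# import torch; torch.cuda.empty_cache()

llm = FakeLLM("model-b")  # now re-initialize with the next model
```

Note that `del llm` alone only removes one reference; if anything else (a results list, a closure, an interactive session's `_`) still holds the instance, it will not be collected and the GPU memory stays allocated.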
There is more detailed input on this in #3281.
Going to close this since it's a duplicate of #3281.
Your current environment
How would you like to use vllm
I'm running an eval framework that evaluates multiple models. vLLM doesn't seem to free the GPU memory after initializing the 2nd model (with the same variable name). How do I free GPU memory between vLLMEngine calls?
llm = LLM(new_model)