xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
https://inference.readthedocs.io
Apache License 2.0

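The model service shuts down on its own after users have been calling the model for a while, but the GPU memory it occupied is not released #2342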

Open Joker-sad opened 1 day ago

Joker-sad commented 1 day ago

System Info

Name: vllm
Version: 0.5.0.post1

Python: 3.9

Name: xinference
Version: 0.12.2.post1

Running Xinference with Docker?

Version info

Name: xinference
Version: 0.12.2.post1

The command used to start Xinference

q

Reproduction

Expected behavior

Once started, the model service should keep running indefinitely without manual intervention.

qinxuye commented 14 hours ago

This version is too old. Please upgrade to a newer release.