xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
https://inference.readthedocs.io
Apache License 2.0

A bug about the web ui and the llm model. #1711

Open idiotTest opened 2 months ago

idiotTest commented 2 months ago

Describe the bug

GPU memory usage shows that xinf and the LLM model are still running, but I can't see the model in the web UI. I also can't use the LLM model.

To Reproduce

At the beginning, I ran qwen1.5-14b-chat on a Linux machine with xinf and llvm. I then ran a stress test to measure some metrics. The first run seemed fine, but when I ran it again, I hit the bug. The custom model registered by the user was also lost.


Some guesses

Did GPU memory usage exceed the available memory and cause the error?
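
One way to check this guess is to compare what the GPU reports with what the server reports. The following is a minimal sketch, not something taken from the original report. It assumes `nvidia-smi` is on the PATH, that the server listens on `http://localhost:9997`, and that it answers an OpenAI-style `GET /v1/models` with a `{"data": [{"id": ...}]}` body. The helper names and the 1024 MiB threshold are hypothetical.

```python
import csv
import io
import json
import subprocess
import urllib.request

# Query used to list compute processes and their GPU memory (values in MiB).
NVIDIA_SMI_CMD = [
    "nvidia-smi",
    "--query-compute-apps=pid,process_name,used_memory",
    "--format=csv,noheader,nounits",
]


def parse_gpu_processes(csv_text):
    """Parse the CSV printed by NVIDIA_SMI_CMD into a list of dicts."""
    processes = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) != 3:
            continue  # skip blank or malformed lines
        pid, name, used_mib = (field.strip() for field in row)
        try:
            processes.append(
                {"pid": int(pid), "name": name, "used_mib": int(used_mib)}
            )
        except ValueError:
            continue  # e.g. used_memory reported as "[N/A]"
    return processes


def extract_model_ids(payload):
    """Pull model ids out of an OpenAI-style model list (assumed shape)."""
    data = payload.get("data", []) if isinstance(payload, dict) else []
    return [item["id"] for item in data if isinstance(item, dict) and "id" in item]


def looks_orphaned(gpu_processes, model_ids, min_used_mib=1024):
    """True when some process holds a lot of GPU memory but no model is listed."""
    heavy = [p for p in gpu_processes if p["used_mib"] >= min_used_mib]
    return bool(heavy) and not model_ids


def main(base_url="http://localhost:9997"):
    """Run the check against a live machine (base_url is an assumption)."""
    smi = subprocess.run(NVIDIA_SMI_CMD, capture_output=True, text=True, check=True)
    gpu_processes = parse_gpu_processes(smi.stdout)
    with urllib.request.urlopen(f"{base_url}/v1/models", timeout=10) as resp:
        model_ids = extract_model_ids(json.load(resp))
    print("GPU processes:", gpu_processes)
    print("Models listed by server:", model_ids)
    print("Looks orphaned:", looks_orphaned(gpu_processes, model_ids))
```

Calling `main()` on the affected machine prints both views side by side. If it reports an orphaned state, that would be consistent with a model worker still holding GPU memory after the server lost track of it, although it does not prove that memory exhaustion was the cause.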

github-actions[bot] commented 1 month ago

This issue is stale because it has been open for 7 days with no activity.