xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
https://inference.readthedocs.io
Apache License 2.0
5k stars 397 forks

FEAT: Preservation and auto-restoration of deployed models post-service restart #554

Closed onesuper closed 1 month ago

onesuper commented 11 months ago

The problem

Every time I upgrade or reinstall, I have to restart the xinference service. The pain point is that information about already deployed models is lost in the process. For instance, if I have four different versions of a model deployed, I currently have to manually restart each of them after the service comes back up.

What I am thinking

I propose that xinference integrate a feature that allows the service to remember the details of deployed models and restore them automatically after a restart.

Implementing this feature will significantly enhance the user experience, reduce manual overhead, and ensure seamless continuity in operations even after system upgrades or restarts.
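As a rough illustration of what "remembering" could mean, the service might persist the launch parameters of each deployed model to a small registry file and read it back on startup. The functions and the `deployed_models.json` file below are hypothetical; Xinference does not ship this persistence layer today, and the real implementation would live inside the supervisor:

```python
import json
from pathlib import Path

# Hypothetical registry file location; a real implementation would likely
# keep this under XINFERENCE_HOME rather than the working directory.
REGISTRY_PATH = Path("deployed_models.json")

def save_registry(specs, path=REGISTRY_PATH):
    """Persist the launch parameters of every deployed model.

    `specs` is assumed to be a list of dicts mirroring the arguments
    passed when each model was launched (name, size, format, ...).
    """
    path.write_text(json.dumps(specs, indent=2))

def load_registry(path=REGISTRY_PATH):
    """Load previously saved launch parameters; empty list if none saved."""
    if not path.exists():
        return []
    return json.loads(path.read_text())
```

On startup, the service could iterate over `load_registry()` and relaunch each entry, giving the seamless continuity described above.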

Another idea

If the auto-restart thing is tricky, how about giving me a command that lets me save all my model info into a script? Then, after a restart, I can just run that script and get all my models back online. It's not fully automatic, but it's way better than what I'm doing now.
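The export-to-script fallback could be sketched as a small helper that turns saved launch parameters into `xinference launch` CLI calls. This is a sketch, not part of Xinference: the flag names (`--model-name`, `--size-in-billions`, `--model-format`) follow the current CLI but should be checked against your installed version, and the spec dicts are an assumed shape:

```python
import shlex

def to_launch_command(spec):
    """Render one saved model spec as an `xinference launch` CLI call.

    `spec` is an assumed dict of launch parameters; only keys that are
    present are emitted as flags.
    """
    parts = ["xinference", "launch", "--model-name", spec["model_name"]]
    if "size_in_billions" in spec:
        parts += ["--size-in-billions", str(spec["size_in_billions"])]
    if "model_format" in spec:
        parts += ["--model-format", spec["model_format"]]
    return shlex.join(parts)  # shell-quotes any unusual values

def to_restore_script(specs):
    """Concatenate the launch commands into a runnable shell script body."""
    lines = ["#!/bin/sh"] + [to_launch_command(s) for s in specs]
    return "\n".join(lines) + "\n"
```

Running the generated script after a restart would bring every model back online, which matches the semi-automatic workflow suggested above.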

wencan commented 4 months ago

I think this is very important. In actual production environments, services may need to be restarted for various reasons. We must ensure that the service remains online at all times.

github-actions[bot] commented 1 month ago

This issue is stale because it has been open for 7 days with no activity.

github-actions[bot] commented 1 month ago

This issue was closed because it has been inactive for 5 days since being marked as stale.