xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
https://inference.readthedocs.io
Apache License 2.0
5.33k stars 431 forks source link

BUG Service Unresponsive with High Request Volume for Unavailable Models #1743

Open gahoo opened 4 months ago

gahoo commented 4 months ago

Describe the bug

A large number request of model that not in the model list will cause the service not responsive.

To Reproduce

When the model set in immersivetranslate is not running. Then translate a page with a lot of content.

Expected behavior

Throw errors and not freeze.

Additional context

Add any other context about the problem here.

github-actions[bot] commented 3 months ago

This issue is stale because it has been open for 7 days with no activity.