BerriAI / litellm

Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
https://docs.litellm.ai/docs/

[Feature]: No more routing to this model after health checking for problem servers #3202

Open wac81 opened 5 months ago

wac81 commented 5 months ago

The Feature

[Feature]: No more routing to this model after health checking for problem servers

Motivation, pitch

[Feature]: No more routing to this model after health checking for problem servers

Twitter / LinkedIn details

No response

krrishdholakia commented 5 months ago

Hey @wac81, what does that mean? We already support model cooldowns: https://docs.litellm.ai/docs/routing#cooldowns
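
For reference, a minimal sketch of the cooldown behavior described in the linked docs. The allowed_fails and cooldown_time parameters are taken from that page; exact names and defaults may differ by litellm version, and the keys/placeholders below are illustrative.

```python
from litellm import Router

# Two deployments behind one model group. After `allowed_fails` failures a
# deployment is put on cooldown for `cooldown_time` seconds and the router
# stops sending it traffic during that window.
model_list = [
    {
        "model_name": "gpt-3.5-turbo",
        "litellm_params": {"model": "openai/gpt-3.5-turbo", "api_key": "sk-..."},
    },
    {
        "model_name": "gpt-3.5-turbo",
        "litellm_params": {
            "model": "azure/gpt-35-turbo",
            "api_key": "...",
            "api_base": "https://...",
        },
    },
]

router = Router(
    model_list=model_list,
    allowed_fails=3,    # failures tolerated before cooldown kicks in
    cooldown_time=60,   # seconds a failing deployment is skipped
)

response = router.completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "hello"}],
)
```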

wac81 commented 4 months ago

@krrishdholakia What I mean is that when the health check finds a problem with a server, the router should stop sending requests to that server, i.e. the server is taken offline temporarily.

krrishdholakia commented 4 months ago

Oh that's interesting

Maybe what we can do is expose an update_deployment() function in the router and have the health check update the deployment status if it's failing
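
A rough sketch of what an exposed update_deployment() could look like. This method does not exist on litellm.Router today, and the field names ("model_info", "id", "healthy") are assumptions for illustration only, not the library's actual schema.

```python
# Hypothetical sketch: an illustrative subset of a router, not the real
# litellm.Router. Shows the shape of the proposed update_deployment() API.
class Router:
    def __init__(self, model_list):
        self.model_list = model_list

    def update_deployment(self, model_id: str, healthy: bool) -> None:
        """Flag a deployment so routing can skip it while it is unhealthy."""
        for deployment in self.model_list:
            if deployment.get("model_info", {}).get("id") == model_id:
                deployment.setdefault("model_info", {})["healthy"] = healthy
                return

    def get_available_deployments(self, model_name: str):
        """Return only deployments that are not currently flagged unhealthy."""
        return [
            d
            for d in self.model_list
            if d["model_name"] == model_name
            and d.get("model_info", {}).get("healthy", True)
        ]
```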

wac81 commented 4 months ago

Yes, I think this is very important. Could you expose update_deployment(), and also give a simple example of an update_deployment() implementation?
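
And a sketch of how a periodic health check might drive that call. Both update_deployment() and probe_deployment() are hypothetical here; the probe is a stub standing in for whatever check (e.g. a lightweight completion or ping) each deployment needs.

```python
# Hypothetical usage sketch: illustrates the requested flow only.
import asyncio


async def probe_deployment(deployment) -> bool:
    """Hypothetical probe: return True if the deployment's server responds."""
    raise NotImplementedError


async def health_check_loop(router, interval_seconds: int = 60):
    while True:
        for deployment in router.model_list:
            healthy = await probe_deployment(deployment)
            # Take failing servers out of rotation; restore them once healthy.
            router.update_deployment(
                model_id=deployment["model_info"]["id"],
                healthy=healthy,
            )
        await asyncio.sleep(interval_seconds)
```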