Open wac81 opened 5 months ago
Hey @wac81, what does that mean? We already support model cooldowns: https://docs.litellm.ai/docs/routing#cooldowns
@krrishdholakia What I mean is that once a health check finds a problem with a server, the router should stop sending requests to that server, i.e. treat it as temporarily offline.
Oh that's interesting
Maybe what we can do is expose an update_deployment() function in the router and have the health check update the deployment's status if it's failing.
Yes, I think this is very important. You could expose update_deployment() and then provide a simple example implementation of it.
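For illustration only, here is a minimal sketch of the idea discussed above: a health check marks failing deployments as unavailable, and the router skips them until a later check passes. Everything in it is an assumption made for the example. `SketchRouter`, `Deployment`, `run_health_check`, and the `probe` callback are hypothetical names, and this `update_deployment()` signature is a guess at what the proposed function might look like, not LiteLLM's actual API.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class Deployment:
    """One backend server the router can send requests to."""
    name: str
    healthy: bool = True


class SketchRouter:
    """Hypothetical stand-in for a router; NOT LiteLLM's Router class."""

    def __init__(self, names: Iterable[str]) -> None:
        self.deployments: Dict[str, Deployment] = {n: Deployment(n) for n in names}

    def update_deployment(self, name: str, healthy: bool) -> None:
        # Hypothetical signature for the function proposed in this thread.
        self.deployments[name].healthy = healthy

    def available(self) -> List[str]:
        # Only deployments that passed their most recent health check.
        return [d.name for d in self.deployments.values() if d.healthy]

    def pick(self) -> str:
        # Choose a healthy deployment at random; fail loudly if none are left.
        candidates = self.available()
        if not candidates:
            raise RuntimeError("no healthy deployments available")
        return random.choice(candidates)


def run_health_check(router: SketchRouter, probe: Callable[[str], bool]) -> None:
    """Probe every deployment and record the result on the router.

    `probe` is a caller-supplied function (an assumption of this sketch)
    that returns True when the named deployment responds correctly.
    A probe that raises is treated as a failed check.
    """
    for name in list(router.deployments):
        try:
            ok = bool(probe(name))
        except Exception:
            ok = False
        router.update_deployment(name, ok)
```

Usage would be something like calling `run_health_check(router, probe)` on a timer and `router.pick()` per request. Because every check overwrites the previous status, a server that recovers is routed to again after its next passing check, which matches the "temporarily offline" behaviour requested here.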
The Feature
[Feature]: No more routing to this model after health checking for problem servers
Motivation, pitch
[Feature]: No more routing to this model after health checking for problem servers