jamesbraza opened this issue 2 days ago
@jamesbraza this is a warning indicating models aren't being put in cooldown. Are you setting id for your models?
abc123
This doesn't look like a known model id from the model list; a repro would help track down the error.
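For reference, an explicit id can be pinned through the model_info field of a model_list entry. A minimal sketch, where "my-gpt4o-deployment" is just a placeholder value:

from litellm import Router

# Minimal sketch: pin an explicit deployment id via model_info so the
# Router uses it instead of deriving one. The id value here is an
# arbitrary placeholder, not a required format.
router = Router(
    model_list=[{
        "model_name": "gpt-4o-2024-08-06",
        "litellm_params": {"model": "gpt-4o-2024-08-06", "temperature": 0.0},
        "model_info": {"id": "my-gpt4o-deployment"},
    }]
)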
Oh sorry, I redacted the UUID and made it abc123; it was a huge UUID, not a specific model name. Should the deployment ID be a big UUID, or a specific model name?
My Router instantiation is like so; let me know if I am missing something:
from litellm import Router

router = Router(
    model_list=[{
        "model_name": "gpt-4o-2024-08-06",
        "litellm_params": {"model": "gpt-4o-2024-08-06", "temperature": 0.0},
    }]
)
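For completeness, requests then go through the router by model_name. An illustrative call, not my exact call site (placeholder message; assumes OPENAI_API_KEY is set in the environment):

import asyncio

# Illustrative invocation: route a request to the deployment
# registered under this model_name.
response = asyncio.run(
    router.acompletion(
        model="gpt-4o-2024-08-06",
        messages=[{"role": "user", "content": "hello"}],
    )
)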
Yes, this is correct; the deployment ID is a big UUID, not a model name. The error being raised was that the model with that UUID couldn't be found.
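A quick way to check is to compare the UUID in the warning against the ids the router actually registered, e.g.:

# Sketch: dump the deployment ids the router knows about, to compare
# against the uuid printed in the cooldown warning.
known_ids = [m["model_info"]["id"] for m in router.model_list]
print(known_ids)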
Okay, I hit this again just now:
litellm_router_instance.model_list was [{'model_name': 'gpt-4o-2024-08-06', 'litellm_params': {'model': 'gpt-4o-2024-08-06', 'temperature': 0.0}, 'model_info': {'id': 'b622c96ddd11ae0ab2d5badac10abf2bb7977f0b5de7d310a0dc617035bb4e25', 'db_model': False}}]
deployment_id was 4334f1af9ccd959d655dfa5645b1dfa72e376eee3b9ec388e23ed70814a08f6f
exception_status was 500
Directly above it in my logs was:
Traceback (most recent call last):
File "/path/to/.venv/lib/python3.12/site-packages/httpx/_transports/default.py", line 72, in map_httpcore_exceptions
yield
File "/path/to/.venv/lib/python3.12/site-packages/httpx/_transports/default.py", line 377, in handle_async_request
resp = await self._pool.handle_async_request(req)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/path/to/.venv/lib/python3.12/site-packages/httpcore/_async/connection_pool.py", line 216, in handle_async_request
raise exc from None
File "/path/to/.venv/lib/python3.12/site-packages/httpcore/_async/connection_pool.py", line 196, in handle_async_request
response = await connection.handle_async_request(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/path/to/.venv/lib/python3.12/site-packages/httpcore/_async/connection.py", line 101, in handle_async_request
return await self._connection.handle_async_request(request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/path/to/.venv/lib/python3.12/site-packages/httpcore/_async/http11.py", line 143, in handle_async_request
raise exc
File "/path/to/.venv/lib/python3.12/site-packages/httpcore/_async/http11.py", line 113, in handle_async_request
) = await self._receive_response_headers(**kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/path/to/.venv/lib/python3.12/site-packages/httpcore/_async/http11.py", line 186, in _receive_response_headers
event = await self._receive_event(timeout=timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/path/to/.venv/lib/python3.12/site-packages/httpcore/_async/http11.py", line 238, in _receive_event
raise RemoteProtocolError(msg)
httpcore.RemoteProtocolError: Server disconnected without sending a response.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/path/to/.venv/lib/python3.12/site-packages/openai/_base_client.py", line 1554, in _request
response = await self._client.send(
^^^^^^^^^^^^^^^^^^^^^^^^
File "/path/to/.venv/lib/python3.12/site-packages/httpx/_client.py", line 1674, in send
response = await self._send_handling_auth(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/path/to/.venv/lib/python3.12/site-packages/httpx/_client.py", line 1702, in _send_handling_auth
response = await self._send_handling_redirects(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/path/to/.venv/lib/python3.12/site-packages/httpx/_client.py", line 1739, in _send_handling_redirects
response = await self._send_single_request(request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/path/to/.venv/lib/python3.12/site-packages/httpx/_client.py", line 1776, in _send_single_request
response = await transport.handle_async_request(request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/path/to/.venv/lib/python3.12/site-packages/httpx/_transports/default.py", line 376, in handle_async_request
with map_httpcore_exceptions():
File "/path/to/.pyenv/versions/3.12.5/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/contextlib.py", line 158, in __exit__
self.gen.throw(value)
File "/path/to/.venv/lib/python3.12/site-packages/httpx/_transports/default.py", line 89, in map_httpcore_exceptions
raise mapped_exc(message) from exc
httpx.RemoteProtocolError: Server disconnected without sending a response.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/path/to/.venv/lib/python3.12/site-packages/litellm/llms/OpenAI/openai.py", line 944, in acompletion
headers, response = await self.make_openai_chat_completion_request(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/path/to/.venv/lib/python3.12/site-packages/litellm/llms/OpenAI/openai.py", line 639, in make_openai_chat_completion_request
raise e
File "/path/to/.venv/lib/python3.12/site-packages/litellm/llms/OpenAI/openai.py", line 627, in make_openai_chat_completion_request
await openai_aclient.chat.completions.with_raw_response.create(
File "/path/to/.venv/lib/python3.12/site-packages/openai/_legacy_response.py", line 370, in wrapped
return cast(LegacyAPIResponse[R], await func(*args, **kwargs))
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/path/to/.venv/lib/python3.12/site-packages/openai/resources/chat/completions.py", line 1412, in create
return await self._post(
^^^^^^^^^^^^^^^^^
File "/path/to/.venv/lib/python3.12/site-packages/openai/_base_client.py", line 1821, in post
return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/path/to/.venv/lib/python3.12/site-packages/openai/_base_client.py", line 1515, in request
return await self._request(
^^^^^^^^^^^^^^^^^^^^
File "/path/to/.venv/lib/python3.12/site-packages/openai/_base_client.py", line 1588, in _request
raise APIConnectionError(request=request) from err
openai.APIConnectionError: Connection error.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/path/to/.venv/lib/python3.12/site-packages/litellm/main.py", line 430, in acompletion
response = await init_response
^^^^^^^^^^^^^^^^^^^
File "/path/to/.venv/lib/python3.12/site-packages/litellm/llms/OpenAI/openai.py", line 995, in acompletion
raise OpenAIError(
litellm.llms.OpenAI.openai.OpenAIError: Connection error.
I don't understand how I am getting a different deployment ID; I am thinking there might be a control flow issue within LiteLLM's Router here.
Our IDs are a stable hash, so this would imply something in the model init / params is changing.
Are you calling set_model_list() in your code? The only time the id is set is here.
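Conceptually the scheme is something like the following (an illustrative sketch, not LiteLLM's actual code): the id is derived deterministically from the deployment's defining fields, so identical params always produce the same id and any change in params produces a different one.

import hashlib
import json

def stable_deployment_id(model_name: str, litellm_params: dict) -> str:
    # Illustrative only: serialize the deployment's defining fields in
    # canonical order and hash them, so the id is stable for identical
    # inputs and changes whenever the params change.
    payload = json.dumps(
        {"model_name": model_name, "litellm_params": litellm_params},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()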
No, I don't call set_model_list anywhere in my source.
I can also confirm that my litellm_router_instance.model_list above makes sense:
In [1]: import litellm
In [2]: router = litellm.Router(
...: model_list=[{
...: "model_name": "gpt-4o-2024-08-06",
...: "litellm_params": {"model": "gpt-4o-2024-08-06", "temperature": 0.0},
...: }]
...: )
In [3]: router.model_list
Out[3]:
[{'model_name': 'gpt-4o-2024-08-06',
'litellm_params': {'model': 'gpt-4o-2024-08-06', 'temperature': 0.0},
'model_info': {'id': 'b622c96ddd11ae0ab2d5badac10abf2bb7977f0b5de7d310a0dc617035bb4e25',
'db_model': False}}]
In [4]: router = litellm.Router(
...: model_list=[{
...: "model_name": "gpt-4-turbo-2024-04-09",
...: "litellm_params": {"model": "gpt-4-turbo-2024-04-09", "temperature": 0.0},
...: }]
...: )
In [5]: router.model_list
Out[5]:
[{'model_name': 'gpt-4-turbo-2024-04-09',
'litellm_params': {'model': 'gpt-4-turbo-2024-04-09', 'temperature': 0.0},
'model_info': {'id': '4334f1af9ccd959d655dfa5645b1dfa72e376eee3b9ec388e23ed70814a08f6f',
'db_model': False}}]
So gpt-4o-2024-08-06 is the inference model, and gpt-4-turbo-2024-04-09 is a grader model (it gets invoked after the inference model).
I think there is some race condition going on related to Router cooldown; I am not sure if it's a race condition in my own code or in LiteLLM's code.
_deployment = litellm_router_instance.get_deployment(model_id=deployment_id)
Regardless, I don't understand why the deployment didn't previously exist, because this error happened about 20 minutes into a run, so both models should have been invoked by then.
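Next time it fires I can run the same lookup the cooldown path performs against the live router. A diagnostic sketch (names match my log excerpt above; whether get_deployment returns None or raises for an unknown id may depend on the version):

# Diagnostic sketch: check whether the failing deployment id resolves
# on the live router at the moment the warning appears.
deployment = litellm_router_instance.get_deployment(model_id=deployment_id)
if deployment is None:
    print(f"unknown deployment id: {deployment_id}")
    print("known ids:", [m["model_info"]["id"] for m in litellm_router_instance.model_list])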
What happened?
Randomly with litellm==1.48.2, my entire terminal will get filled with LiteLLM warnings about router_cooldown_event_callback. I am not sure why it pops up, but I think it's undesirable as it will print pages and pages.
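As a possible mitigation while this is debugged, the Router exposes cooldown knobs; a sketch with arbitrary example values (not recommendations):

from litellm import Router

# Sketch: tune how aggressively deployments are cooled down. Values
# here are arbitrary examples.
router = Router(
    model_list=[{
        "model_name": "gpt-4o-2024-08-06",
        "litellm_params": {"model": "gpt-4o-2024-08-06", "temperature": 0.0},
    }],
    allowed_fails=3,    # failures tolerated per minute before cooldown
    cooldown_time=30,   # seconds a deployment stays in cooldown
)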
Relevant log output