BerriAI / litellm

Call all LLM APIs using the OpenAI format. Use Bedrock, Azure, OpenAI, Cohere, Anthropic, Ollama, Sagemaker, HuggingFace, Replicate (100+ LLMs)
https://docs.litellm.ai/docs/

[Bug]: Cannot handle the error due to "During handling of the above exception, another exception occurred:". #4233

Closed · tsujimic closed this 1 week ago

tsujimic commented 1 week ago

What happened?

Exception handling with the litellm Python SDK fails with "During handling of the above exception, another exception occurred:". This sometimes happens inside the SDK's own exception handling when Azure OpenAI Service returns a content filter error.

Relevant log output

Traceback (most recent call last):
    |   File "/Users/admin/.venv/lib/python3.11/site-packages/litellm/utils.py", line 8564, in chunk_creator
    |     response_obj = self.handle_openai_chat_completion_chunk(chunk)
    |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |   File "/Users/admin/.venv/lib/python3.11/site-packages/litellm/utils.py", line 7869, in handle_openai_chat_completion_chunk
    |     raise e
    |   File "/Users/admin/.venv/lib/python3.11/site-packages/litellm/utils.py", line 7839, in handle_openai_chat_completion_chunk
    |     raise litellm.AzureOpenAIError(
    | litellm.llms.azure.AzureOpenAIError: Azure Response={'id': 'chatcmpl-9b2F6qpAChf8V0ceoUCijGPsXBWir', 'choices': [Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=None), finish_reason='content_filter', index=0, logprobs=None, content_filter_results={'hate': {'filtered': False, 'severity': 'safe'}, 'self_harm': {'filtered': False, 'severity': 'safe'}, 'sexual': {'filtered': False, 'severity': 'safe'}, 'violence': {'filtered': True, 'severity': 'medium'}})], 'created': 1718613784, 'model': 'gpt-35-turbo-16k', 'object': 'chat.completion.chunk', 'system_fingerprint': None, 'usage': None}
    | 
    | During handling of the above exception, another exception occurred:
    | 
    | Traceback (most recent call last):
    |   File "/Users/admin/.venv/lib/python3.11/site-packages/litellm/utils.py", line 8904, in __next__
    |     response: Optional[ModelResponse] = self.chunk_creator(chunk=chunk)
    |                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |   File "/Users/admin/.venv/lib/python3.11/site-packages/litellm/utils.py", line 8834, in chunk_creator
    |     raise exception_type(
    |           ^^^^^^^^^^^^^^^
    |   File "/Users/admin/.venv/lib/python3.11/site-packages/litellm/utils.py", line 7000, in exception_type
    |     raise e
    |   File "/Users/admin/.venv/lib/python3.11/site-packages/litellm/utils.py", line 6869, in exception_type
    |     raise BadRequestError(
    | litellm.exceptions.BadRequestError: litellm.BadRequestError: AzureException - Azure Response={'id': 'chatcmpl-9b2F6qpAChf8V0ceoUCijGPsXBWir', 'choices': [Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=None), finish_reason='content_filter', index=0, logprobs=None, content_filter_results={'hate': {'filtered': False, 'severity': 'safe'}, 'self_harm': {'filtered': False, 'severity': 'safe'}, 'sexual': {'filtered': False, 'severity': 'safe'}, 'violence': {'filtered': True, 'severity': 'medium'}})], 'created': 1718613784, 'model': 'gpt-35-turbo-16k', 'object': 'chat.completion.chunk', 'system_fingerprint': None, 'usage': None}
    | 
    | During handling of the above exception, another exception occurred:
    | 
    | Traceback (most recent call last):
    |   File "/Users/admin/.venv/lib/python3.11/site-packages/starlette/responses.py", line 261, in wrap
    |     await func()
    |   File "/Users/admin/.venv/lib/python3.11/site-packages/starlette/responses.py", line 250, in stream_response
    |     async for chunk in self.body_iterator:
    |   File "/Users/admin/.venv/lib/python3.11/site-packages/starlette/concurrency.py", line 65, in iterate_in_threadpool
    |     yield await anyio.to_thread.run_sync(_next, as_iterator)
    |           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |   File "/Users/admin/.venv/lib/python3.11/site-packages/anyio/to_thread.py", line 56, in run_sync
    |     return await get_async_backend().run_sync_in_worker_thread(
    |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |   File "/Users/admin/.venv/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 2177, in run_sync_in_worker_thread
    |     return await future
    |            ^^^^^^^^^^^^
    |   File "/Users/admin/.venv/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 859, in run
    |     result = context.run(func, *args)
    |              ^^^^^^^^^^^^^^^^^^^^^^^^
    |   File "/Users/admin/.venv/lib/python3.11/site-packages/starlette/concurrency.py", line 54, in _next
    |     return next(iterator)
    |            ^^^^^^^^^^^^^^
    |   File "/Users/admin/python/litellm/fastapi/backend/api/ollama.py", line 185, in data_generator
    |     for chunk in response:
    |   File "/Users/admin/.venv/lib/python3.11/site-packages/litellm/utils.py", line 8959, in __next__
    |     raise exception_type(
    |           ^^^^^^^^^^^^^^^
    |   File "/Users/admin/.venv/lib/python3.11/site-packages/litellm/utils.py", line 7000, in exception_type
    |     raise e
    |   File "/Users/admin/.venv/lib/python3.11/site-packages/litellm/utils.py", line 6869, in exception_type
    |     raise BadRequestError(
    | litellm.exceptions.BadRequestError: litellm.BadRequestError: AzureException - litellm.BadRequestError: AzureException - Azure Response={'id': 'chatcmpl-9b2F6qpAChf8V0ceoUCijGPsXBWir', 'choices': [Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=None), finish_reason='content_filter', index=0, logprobs=None, content_filter_results={'hate': {'filtered': False, 'severity': 'safe'}, 'self_harm': {'filtered': False, 'severity': 'safe'}, 'sexual': {'filtered': False, 'severity': 'safe'}, 'violence': {'filtered': True, 'severity': 'medium'}})], 'created': 1718613784, 'model': 'gpt-35-turbo-16k', 'object': 'chat.completion.chunk', 'system_fingerprint': None, 'usage': None}


superpoussin22 commented 1 week ago

Your prompt is raising a content safety alert because the content safety filter considers that your prompt contains violence:

content_filter_results={'hate': {'filtered': False, 'severity': 'safe'}, 'self_harm': {'filtered': False, 'severity': 'safe'}, 'sexual': {'filtered': False, 'severity': 'safe'}, 'violence': {'filtered': True, 'severity': 'medium'}})]

So it produces a 400 error.
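
For the caller, a minimal sketch of catching this case (the deployment name and prompt below are placeholders, not taken from this issue):

import litellm
from litellm import completion

try:
    # "azure/my-deployment" and the message content are placeholder values
    response = completion(
        model="azure/my-deployment",
        messages=[{"role": "user", "content": "..."}],
        stream=True,
    )
    # with stream=True the error is raised during iteration,
    # so the loop must stay inside the try block
    for chunk in response:
        print(chunk)
except litellm.exceptions.BadRequestError as e:
    # Azure content filter violations surface as a 400 BadRequestError
    print("content filter / bad request:", e)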

tsujimic commented 1 week ago

I understand that my prompt is causing a content filter error and a 400 Bad Request. But when this error comes with "During handling of the above exception, another exception occurred:", my caller's Python code is not able to handle it:

    from fastapi.responses import JSONResponse, StreamingResponse
    from litellm import acompletion

    try:
        response = await acompletion(
            model=model,
            messages=messages,
            stream=stream,
        )

        async def data_generator():
            async for chunk in response:
                yield chunk.model_dump_json()

        return StreamingResponse(data_generator(), media_type="application/json-lines")

    except Exception as e:
        # When "During handling of the above exception, another exception occurred:",
        # the error cannot be handled.
        return JSONResponse(content=getattr(e, "message", "error occurred"))

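Note that the last traceback section above runs through starlette/responses.py stream_response: the error is raised while Starlette consumes data_generator(), after the route handler has already returned, so the handler's try/except never sees it. A minimal sketch of catching mid-stream errors inside the generator instead (the error payload shape here is illustrative):

    import json

    async def data_generator():
        try:
            async for chunk in response:
                yield chunk.model_dump_json()
        except Exception as e:
            # mid-stream errors surface here, not in the route's try/except
            yield json.dumps({"error": getattr(e, "message", str(e))})
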
krrishdholakia commented 1 week ago

@tsujimic what is the error raised by your python caller code?

the exception raised is the same - litellm.exceptions.BadRequestError

which should be of type Exception (it inherits from the openai BadRequestError)
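
For what it's worth, "During handling of the above exception, another exception occurred:" in the traceback is Python's implicit exception chaining, not a separate uncaught error; the final chained exception is still caught by a plain except. A minimal standalone sketch, independent of litellm:

def inner():
    raise ValueError("original error")

def outer():
    try:
        inner()
    except ValueError:
        # re-raising inside an except block produces the
        # "During handling of the above exception..." traceback
        raise RuntimeError("wrapped error")

try:
    outer()
except Exception as e:
    print("caught:", type(e).__name__, e)  # caught: RuntimeError wrapped error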

krrishdholakia commented 1 week ago

unable to repro - this works for me

from litellm import acompletion, APIError, completion
import asyncio, httpx, traceback
import litellm

async def try_acompletion_error():
    model = "azure/gpt-3.5-turbo"
    request = httpx.Request(
        method="POST",
        url="https://azure.com/",
    )
    exception_to_raise = APIError(
        status_code=400,
        message="invalid_request_error",
        llm_provider="azure",
        request=request,
        model="gpt-35-turbo",
    )
    setattr(exception_to_raise, "response", httpx.Response(status_code=400, request=request))
    try:
        response = completion(
            model=model,
            messages=[{"role": "user", "content": "Hey"}],
            api_version="2023-06-12",
            # stream=True,
            mock_response=exception_to_raise,
            num_retries=0,
        )

        chunks = []

        async def data_generator():
            async for chunk in response:
                chunks.append(chunk)

    except Exception as e:
        # When "During handling of the above exception, another exception occurred:",
        # the error cannot be handled.
        print("ERROR CAUGHT! - {}".format(str(e)))

asyncio.run(try_acompletion_error())

output:

(base) krrishdholakia@Krrishs-MacBook-Air temp_py_folder % python3 linting_tests.py

Give Feedback / Get Help: https://github.com/BerriAI/litellm/issues/new
LiteLLM.Info: If you need to debug this error, use `litellm.set_verbose=True'.

ERROR CAUGHT! - litellm.BadRequestError: AzureException BadRequestError - litellm.APIError: invalid_request_error
(base) krrishdholakia@Krrishs-MacBook-Air temp_py_folder %