timoklimmer / powerproxy-aoai

Monitors and processes traffic to and from Azure OpenAI endpoints.
MIT License

AOAI errors not returned on streaming responses #83

Closed: codylittle closed this issue 1 month ago

codylittle commented 1 month ago

PowerProxy returns a plain-text "Internal Server Error" instead of the AOAI error when the AOAI endpoint returns an error on a streaming request.

An easy way to replicate this is to request a deployment that doesn't exist.

POST: /openai/deployments/**notreal**/chat/completions?api-version=2023-07-01-preview

{
    "messages": [
        {
            "role": "system",
            "content": "You are an AI assistant"
        },
        {
            "role": "user",
            "content": "Hello!"
        }
    ],
    "stream": true
}
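
For reference, the same request can be reproduced with a short script along these lines (a sketch only; it assumes PowerProxy is reachable at http://localhost:80 and authenticates clients via an api-key header, so adjust host, port and key to your deployment):

# Reproduction sketch. Assumptions: PowerProxy listens on http://localhost:80
# and expects the client's key in the "api-key" header.
import httpx

response = httpx.post(
    "http://localhost:80/openai/deployments/notreal/chat/completions",
    params={"api-version": "2023-07-01-preview"},
    headers={"api-key": "<your-powerproxy-client-key>"},
    json={
        "messages": [
            {"role": "system", "content": "You are an AI assistant"},
            {"role": "user", "content": "Hello!"},
        ],
        "stream": True,
    },
)

# instead of the AOAI error JSON, the proxy answers with a plain-text 500
print(response.status_code, response.headers.get("content-type"))
print(response.text)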

Expected: Content-Type: application/json

{
    "error": {
        "code": "DeploymentNotFound",
        "message": "The API deployment for this resource does not exist. If you created the deployment within the last 5 minutes, please wait a moment and try again."
    }
}

Returned: Content-Type: text/plain; charset=utf-8

Internal Server Error
Stack trace
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/uvicorn/protocols/http/httptools_impl.py", line 399, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/usr/local/lib/python3.11/site-packages/starlette/applications.py", line 123, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 186, in __call__
    raise exc
  File "/usr/local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "/usr/local/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 65, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 756, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 776, in app
    await route.handle(scope, receive, send)
  File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 297, in handle
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 77, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 72, in app
    response = await func(request)
               ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 278, in app
    raw_response = await run_endpoint_function(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
    return await dependant.call(**values)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/powerproxy.py", line 309, in handle_request
    f"Text: {aoai_response.text} "
             ^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/httpx/_models.py", line 576, in text
    content = self.content
              ^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/httpx/_models.py", line 570, in content
    raise ResponseNotRead()
httpx.ResponseNotRead: Attempted to access streaming response content, without having called `read()`.
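
The last frame is the key one: httpx raises ResponseNotRead whenever .text or .content is accessed on a response that was opened with stream=True before its body has been read. A standalone sketch of that behaviour (placeholder URL; not PowerProxy's code):

import asyncio
import httpx

async def main():
    async with httpx.AsyncClient() as client:
        # open the response as a stream, like the proxy does for streaming requests
        request = client.build_request("GET", "https://example.com")  # placeholder URL
        response = await client.send(request, stream=True)
        try:
            _ = response.text  # raises httpx.ResponseNotRead: body not read yet
        except httpx.ResponseNotRead:
            await response.aread()  # buffer the body first ...
            print(response.text)    # ... now .text is accessible
        finally:
            await response.aclose()

asyncio.run(main())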
timoklimmer commented 1 month ago

Hey @codylittle -- thanks for sharing. That's indeed a bug. I have fixed it and released a new version. Enjoy!

codylittle commented 1 month ago

Hey @timoklimmer, thanks for looking at it so quickly. Unfortunately, this hasn't resolved the issue; I should've been a bit more specific in my issue. I wasn't referring specifically to non-existent deployments, but to any error returned by AOAI.

The error itself occurs on L329, where the proxy tries to access the contents of the response before the streamed body has been read.

        # got http code other than 200
        if aoai_response.status_code != 200:
+           # read the response body in case it's a stream
+           await aoai_response.aread()
+           
            # print infos to console
            print(

This is how we've resolved it in our private fork; our testing shows that it successfully resolves the issue. Apologies for not submitting a PR; I'm unsure whether you'd like a test to go along with it, and we haven't written one for this customization in our org yet.

timoklimmer commented 1 month ago

Ah ok, I see. Seems like this needs another fix. How about non-streaming requests? Would your suggestion work for those, too?

codylittle commented 1 month ago

From our testing, no exceptions are thrown, and there is no noticeable degradation in performance.

codylittle commented 4 weeks ago

We've decided to change it to the below, so that the read isn't run on non-streaming requests.

        # got http code other than 200
        if aoai_response.status_code != 200:
+           # read the response body in case it's a stream
+           if not routing_slip["is_non_streaming_response_requested"]:
+               await aoai_response.aread()
+           
            # print infos to console
            print(
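
For context, here's an illustrative sketch of where such a check could live in a FastAPI/httpx-style handler (handle_aoai_error is an assumed name, not PowerProxy's actual code): buffer the streamed body on error, log it, and relay the AOAI error to the caller instead of a generic 500.

from fastapi import Response

async def handle_aoai_error(aoai_response, routing_slip) -> Response:
    # read the streamed body first so .text/.content are accessible
    if not routing_slip["is_non_streaming_response_requested"]:
        await aoai_response.aread()

    # print infos to console
    print(f"Status: {aoai_response.status_code} Text: {aoai_response.text}")

    # pass the upstream error (e.g. DeploymentNotFound) through unchanged
    return Response(
        content=aoai_response.content,
        status_code=aoai_response.status_code,
        media_type=aoai_response.headers.get("content-type", "application/json"),
    )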
timoklimmer commented 2 weeks ago

@codylittle Thank you for your suggestion -- I have adopted it and will have it in the next release.

timoklimmer commented 2 weeks ago

The next release went out yesterday, and it includes this fix 🎉