abersheeran / a2wsgi

Convert WSGI app to ASGI app or ASGI app to WSGI app.
Apache License 2.0
226 stars 20 forks source link

Unhandled error in ASGI middleware below wrapped WSGI app causes the app to hang. #48

Closed nickvalin1 closed 7 months ago

nickvalin1 commented 7 months ago

Possibly related to https://github.com/abersheeran/a2wsgi/issues/45 or the attempted fixes for it.

I have an ASGI stack that looks like (connexion framework):

Uvicorn -> Starlette (exception handling layer) -> request validation middleware -> WSGIMiddleware(FlaskApp) -> response validation middleware

Starting with version 1.10.1, if there is an exception in the response validation middleware layer, which is below the wrapped WSGI app, the app will hang and not return a response or be able to accept other incoming requests.

I can tell the exception never propagates back from the child ASGI middleware to the wrapped WSGI app, because if I add an exception handler the the WSGI Flask App, the exception never makes it there.

abersheeran commented 7 months ago

Can you provide a minimal reproducible example? Also, 1.10.2 has been released, please try the new version.

rslinckx commented 7 months ago

Here is I think something similar:

uvicornfail.py:

from a2wsgi import WSGIMiddleware
from flask import Flask
from starlette.applications import Starlette
from starlette.routing import Route

flask_app = Flask("test")

@flask_app.route("/healthz")
def healthz():
    return "OK"

app = Starlette(
    routes=[Route("/{path:path}", endpoint=WSGIMiddleware(flask_app.wsgi_app, workers=1))],
)

Using:

Run with: uvicorn uvicornfail:app --limit-concurrency 3

In another shell do something like: while true; do curl localhost:8000/healthz ; done

Notice after a few requests, the uvicorn starts responding with "Service Unavailable". On the uvicorn logs there is nothing wrong yet. Now when you try to ctrl-c uvicorn, it starts outputting a series of tracebacks (presumably one for each hung request):

ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/path/.venv/lib/python3.12/site-packages/uvicorn/protocols/http/httptools_impl.py", line 412, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/.venv/lib/python3.12/site-packages/uvicorn/middleware/proxy_headers.py", line 69, in __call__
    return await self.app(scope, receive, send)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/.venv/lib/python3.12/site-packages/starlette/applications.py", line 123, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/path/.venv/lib/python3.12/site-packages/starlette/middleware/errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "/path/.venv/lib/python3.12/site-packages/starlette/middleware/exceptions.py", line 65, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/path/.venv/lib/python3.12/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/path/.venv/lib/python3.12/site-packages/starlette/routing.py", line 756, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/path/.venv/lib/python3.12/site-packages/starlette/routing.py", line 776, in app
    await route.handle(scope, receive, send)
  File "/path/.venv/lib/python3.12/site-packages/starlette/routing.py", line 297, in handle
    await self.app(scope, receive, send)
  File "/path/.venv/lib/python3.12/site-packages/a2wsgi/wsgi.py", line 166, in __call__
    return await responder(scope, receive, send)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/.venv/lib/python3.12/site-packages/a2wsgi/wsgi.py", line 205, in __call__
    await self.send_queue.join()
  File "/Users/r.slinckx/.pyenv/versions/3.12.1/lib/python3.12/asyncio/queues.py", line 215, in join
    await self._finished.wait()
  File "/Users/r.slinckx/.pyenv/versions/3.12.1/lib/python3.12/asyncio/locks.py", line 212, in wait
    await fut
asyncio.exceptions.CancelledError
Task exception was never retrieved
future: <Task finished name='Task-59' coro=<WSGIResponder.sender() done, defined at /path/.venv/lib/python3.12/site-packages/a2wsgi/wsgi.py:222> exception=ClientDisconnected()>
Traceback (most recent call last):
  File "/path/.venv/lib/python3.12/site-packages/a2wsgi/wsgi.py", line 228, in sender
    await send(message)
  File "/path/.venv/lib/python3.12/site-packages/starlette/_exception_handler.py", line 50, in sender
    await send(message)
  File "/path/.venv/lib/python3.12/site-packages/starlette/middleware/errors.py", line 161, in _send
    await send(message)
  File "/path/.venv/lib/python3.12/site-packages/uvicorn/protocols/http/httptools_impl.py", line 465, in send
    raise ClientDisconnected
uvicorn.protocols.utils.ClientDisconnected

The exception is something new in uvicorn 0.28, but it seems there's something wrong when handling it in a2wsgi

abersheeran commented 7 months ago

I haven't reproduced this problem in Uvicorn 0.28. Judging from the traceback, this is caused by the sender sending data to the client too slowly.

rslinckx commented 7 months ago

(I edited, There was a typo in my previous message (route /healhtz -> /healthz), but this shouldn't change the result)

The provided example does nothing particularly slow, and the while true does curl sequentially, so I would expect curl to wait until the response before starting a new connection, there's no concurrency in this example. It appears as if something in the handling of the response gets stuck somewhere. Just to be sure here are some more explicit command line arguments for uvicorn (although i get the same with --http h11 and --loop uvloop): .venv/bin/uvicorn uvicornfail:app --loop asyncio --http httptools --limit-concurrency 2

Also this is with flask==3.0.2

I tried with modifying a2wsgi.wsgi.WSGIResponder.sender as follow:

    async def sender(self, send: Send) -> None:
        while True:
            message = await self.send_queue.get()
            self.send_queue.task_done()
            print(f'sender got message: {message!r}')
            if message is None:
                return
            from uvicorn.protocols.utils import ClientDisconnected
            try:
                await send(message)
            except ClientDisconnected:
                import traceback; traceback.print_exc()
                continue

Which gives the following logs:

Working request:
sender got message: {'type': 'http.response.body', 'body': b'OK', 'more_body': True}
sender got message: {'type': 'http.response.body', 'body': b''}
sender got message: None
sender got message: {'type': 'http.response.start', 'status': 200, 'headers': [(b'content-type', b'text/html; charset=utf-8'), (b'content-length', b'2')]}
INFO:     127.0.0.1:58246 - "GET /healthz HTTP/1.1" 200 OK

Weird request:
sender got message: {'type': 'http.response.body', 'body': b'OK', 'more_body': True}
sender got message: {'type': 'http.response.body', 'body': b''}
Traceback (most recent call last):
  File "/.venv/lib/python3.12/site-packages/a2wsgi/wsgi.py", line 231, in sender
    await send(message)
  File "/.venv/lib/python3.12/site-packages/starlette/_exception_handler.py", line 50, in sender
    await send(message)
  File "/.venv/lib/python3.12/site-packages/starlette/middleware/errors.py", line 161, in _send
    await send(message)
  File "/.venv/lib/python3.12/site-packages/uvicorn/protocols/http/httptools_impl.py", line 469, in send
    raise ClientDisconnected
uvicorn.protocols.utils.ClientDisconnected
sender got message: None
sender got message: {'type': 'http.response.start', 'status': 200, 'headers': [(b'content-type', b'text/html; charset=utf-8'), (b'content-length', b'2')]}

Due to adding an except: continue, the server keeps going now, but I can't explain why the client disconnects at that point

abersheeran commented 7 months ago

@rslinckx Thank you for the DEBUG information. It is true that a2wsgi should not trust send to be successful. I have released version 1.10.4, please try it again when you have time. (I can't reproduce it locally)

https://github.com/abersheeran/a2wsgi/blob/8169f0ac6c9c6a672d5d06c530224eb59fee9e02/a2wsgi/wsgi.py#L192-L213

rslinckx commented 7 months ago

With the new release I can't reproduce the problem, and all requests are handled properly in the above code sample. I'm not entirely sure why the ClientDisconnected got triggered though, but that may be something else. In any case I don't see any visible problems. I'll let you know if it doesn't work properly in a more real environment when it is deployed.

Thanks a lot!

nickvalin1 commented 7 months ago

@rslinckx thanks for creating that example, I was struggling to create one that was minimal. @abersheeran I can confirm that my issue is fixed in version 1.10.4 as well.