emmett-framework / granian

A Rust HTTP server for Python applications
BSD 3-Clause "New" or "Revised" License
2.86k stars 83 forks source link

`1.2.0` doesn't shut down properly #252

Closed tim-x-y-z closed 6 months ago

tim-x-y-z commented 8 months ago

When upgrading to v1.2.0 it seems the shutdown for asgi doesn't work properly - it seems to be hanging.

gi0baro commented 8 months ago

@tim-habitat can you please provide more details on this (eg: OS/container, Python version, libraries used, granian cmd used) and possibly logs/MRE validating your statement?

tim-x-y-z commented 7 months ago

sorry yes, it was on an arm on mac os using fastapi, python 3.11. both granian reload and not were affected.

i'll try to get a reproducible example here in the next few days!

note: i did have a lifespan to create database tables

zarekr commented 7 months ago

similar issue here with Django running granian as asgi server:

 [ERROR] Application callable raised an exception
 Traceback (most recent call last):
   File "/home/python/.local/lib/python3.12/site-packages/granian/_futures.py", line 4, in future_watcher
     await inner(watcher.scope, watcher.proto)
   File "/home/python/.local/lib/python3.12/site-packages/channels/routing.py", line 62, in __call__
     return await application(scope, receive, send)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   File "/home/python/.local/lib/python3.12/site-packages/channels/sessions.py", line 47, in __call__
     return await self.inner(dict(scope, cookies=cookies), receive, send)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   File "/home/python/.local/lib/python3.12/site-packages/channels/sessions.py", line 263, in __call__
     return await self.inner(wrapper.scope, receive, wrapper.send)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   File "/home/python/.local/lib/python3.12/site-packages/channels/auth.py", line 185, in __call__
     return await super().__call__(scope, receive, send)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   File "/home/python/.local/lib/python3.12/site-packages/channels/middleware.py", line 24, in __call__
     return await self.inner(scope, receive, send)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   File "/home/python/.local/lib/python3.12/site-packages/channels/routing.py", line 116, in __call__
     return await application(
            ^^^^^^^^^^^^^^^^^^
   File "/home/python/.local/lib/python3.12/site-packages/django/core/handlers/asgi.py", line 170, in __call__
     await self.handle(scope, receive, send)
   File "/home/python/.local/lib/python3.12/site-packages/django/core/handlers/asgi.py", line 227, in handle
     await task
   File "/home/python/.local/lib/python3.12/site-packages/django/core/handlers/asgi.py", line 245, in listen_for_disconnect
     raise RequestAborted()
 RequestAborted: 

running in docker on mac m2

gi0baro commented 7 months ago

@zarekr no, it's not similar. Seems more like #186. This issue refers to shutdown.

zarekr commented 7 months ago

@zarekr no, it's not similar. Seems more like #186. This issue refers to shutdown.

Oops. Had the wrong issue open 🥱

RafaelWO commented 7 months ago

I'm experiencing similar issues when trying to shutdown granian. It is stuck on stopping the worker:

[2024-04-18 11:29:02.088 | INFO  ] PUT "/items" HTTP 100 --- middleware
^C[INFO] Shutting down granian
[INFO] Stopping worker-1
[INFO] Stopping worker-1 runtime-1

To stop it I have to kill Granian via SIGKILL.


System:

Granian: 1.2.3
Python:  3.12.1  
Linux:   6.5.0-27-generic #28~22.04.1-Ubuntu x86_64 GNU/Linux
gi0baro commented 7 months ago

@RafaelWO which OS? Linux?

RafaelWO commented 7 months ago

@gi0baro Yes! I updated my comment.

I downgraded to granian==1.1.2 and had no issues - so far :slightly_smiling_face:

RafaelWO commented 7 months ago

It is a weird issue and hard to reproduce...

What I've found so far is that if I disable my custom HTTP middleware I was not able to run into the shutdown bug. (I'm using FastAPI and adding a function via app.add_middleware() of type starlette.middleware.base.BaseHTTPMiddleware.)

gi0baro commented 7 months ago

It is a weird issue and hard to reproduce...

What I've found so far is that if I disable my custom HTTP middleware I was not able to run into the shutdown bug. (I'm using FastAPI and adding a function via app.add_middleware() of type starlette.middleware.base.BaseHTTPMiddleware.)

That sounds quite strange, I agree. Is that still only on 1.2 or also 1.1?

RafaelWO commented 7 months ago

I might have found a somewhat minimal reproducible example:

# file: requirements.txt
fastapi
granian>=1.2
# file: main.py
from fastapi import FastAPI, Request

app = FastAPI()

# If I comment/remove this middleware, everything works fine
@app.middleware("http")
async def log_req(request: Request, call_next):
    response = await call_next(request)
    print("[MIDDLEWARE]", request.method, request.url, response.status_code)
    return response

data = [{"id": 1, "type": "foo"}, {"id": 2, "type": "bar"}]

@app.get("/items")
def read_items():
    print("Hello!")
    return data

To reproduce:

  1. Create a virtual environment using Python 3.12
  2. Install the requirements
  3. Run the app via granian --interface asgi --host "0.0.0.0" --port 8000 main:app
  4. Open the swagger UI at http://localhost:8000/docs
  5. Do at least 4 GET requests to /items via Try it out and Execute or curl (yes, it is weird)
  6. Try to shutdown your app via CTRL+C in your shell where you started granian
  7. You should get stuck (if not try again)

I can reproduce this issue at least 90% of the time :laughing:

gi0baro commented 7 months ago
  1. Do at least 4 GET requests to /items via Try it out and Execute or curl (yes, it is weird)
  2. You should get stuck (if not try again)

I can reproduce this issue at least 90% of the time 😆

LoL. Super weird. Can you do a quick test for me and change the route to be an async function? I suspect this might be related to asyncio executors.

RafaelWO commented 7 months ago

Can you do a quick test for me and change the route to be an async function? I suspect this might be related to asyncio executors.

You are right! If the route is an async def it works fine. Any CTA on this?

P.S. Could you also reproduce this issue with my code above?

gi0baro commented 7 months ago

You are right! If the route is an async def it works fine. Any CTA on this?

I'm not sure, it might be related to the way workers handle signals, but further investigations are needed.

P.S. Could you also reproduce this issue with my code above?

Not yet, I might have some time during the weekend to run tests.

sciyoshi commented 6 months ago

Bisecting shows that this issue is due to #232, specifically the changes to src/callbacks.rs and src/runtime.rs. Something is causing certain requests to have hanging tasks, which prevents threads from shutting down. Incidentally this may also be the cause of https://github.com/emmett-framework/granian/issues/276, because those hanging tasks are preventing some requests from being cleaned up.

gi0baro commented 6 months ago

Bisecting shows that this issue is due to #232, specifically the changes to src/callbacks.rs and src/runtime.rs. Something is causing certain requests to have hanging tasks, which prevents threads from shutting down. Incidentally this may also be the cause of #276, because those hanging tasks are preventing some requests from being cleaned up.

Can you elaborate? When referring to tasks you mean asyncio tasks still running on the Python event loop? Or are you referring to the Rust runtime? Also, which thread gets stuck? You mean the event loop in the Python main thread?

sciyoshi commented 6 months ago

Yes, asyncio tasks on the event loop, and the main thread running the Python event loop in the worker. The issue appears to be related to cancellations - https://github.com/emmett-framework/granian/pull/279/ fixes it in some cases for me, although there seems to be something else that is still broken.

sciyoshi commented 6 months ago

FWIW, the shutdown issue I see when running on our larger codebase also happens on 1.1.2.

gi0baro commented 6 months ago

FWIW, the shutdown issue I see when running on our larger codebase also happens on 1.1.2.

Then it is not related to changes made in #232: 1.1.x still uses the standard PyO3 pyclass implementation. Maybe running some of the code involved in the issue with asyncio debug enabled can give us more info.

sciyoshi commented 6 months ago

The changes from #232 do appear related, because the repro provided fails consistently, but it seems very tricky to figure out 😢

gi0baro commented 6 months ago

@sciyoshi @RafaelWO anybody able to test with current master (https://github.com/emmett-framework/granian/commit/312a264239154ede4410b6621d1bdc0b140e9db4) and see if cancellation theory is correct?

sciyoshi commented 6 months ago

Fixes it for me at least, thanks @gi0baro!

gi0baro commented 6 months ago

Closing this due to changes in 1.3. Feel free to open a new issue in case you experience the same with 1.3.x.

RafaelWO commented 6 months ago

I also tested it again with granian 1.3 and it's fixed! Thx @gi0baro 🙂

Tragio commented 5 months ago

Hi @gi0baro 👋

I have a project with Django Tenants and Django Channels. I tried granian, and made a very dummy frontend with WebSocket. Django Channels on connection sent this to frontend:

      await self.send(
            text_data=json.dumps(
                {"type": "connection_established", "message": "You are now connected!"}
            )
        )

It worked, but after some browser refreshes, I stopped seeing WebSocket information on the granian, and after a couple of minutes it suddenly picked a bunch of those previous requests but never picking the current one. Closing the server and opening again was not solving, only by restarting the computer. 😅

Not sure if it is not closing properly and then starts to throttle, no idea.

Let me know what can I do to help you 🙏

gi0baro commented 5 months ago

@Tragio the issue you reported seems to be unrelated to the OP. Please open up a new issue with a MRE and details about your environment (OS, Python version, Granian version, involved packages, etc).