#3227: Wait for the keepalive period to end before stopping the acceptance of new requests

benoitc / gunicorn

gunicorn 'Green Unicorn' is a WSGI HTTP Server for UNIX, fast clients and sleepy applications.

http://www.gunicorn.org

Other

9.88k stars 1.76k forks source link

#3227: Wait for the keepalive period to end before stopping the acceptance of new requests #3236

Open michdan opened 4 months ago

michdan commented 4 months ago

This change will prevent potential disruptions in request handling. Currently, when a worker is restarted due to the max_request limit and the keepalive period is greater than zero, there is a gap between stopping the acceptance of new requests and starting a new worker. This happens because the process must wait for the lesser of the grace period and the keepalive period, even if there are no pending requests. It waits due to open connections that remain active for the duration of the keepalive period. See also: https://github.com/benoitc/gunicorn/issues/3227 cc: @benoitc @pajod

michdan commented 4 months ago

To test such functionality, consider using an approach inspired by Apache httpd: https://github.com/apache/httpd/blob/trunk/test/modules/core/test_002_restarts.py. This approach involves making assertions based on log output. While it would require implementing some additional tools, this method seems to simplify testing complex scenarios such as multiprocessing and greenlets.

pajod commented 4 months ago

I am not yet confident I understand all possibly implications.. but whether further changed or not.. this definitely needs documentation (It already needed documentation before, but after the change the behaviour can be even more unexpected). Maybe a time table that lists the state transitions from hitting max_requests + jitter, then awaiting keepalive, then awaiting graceful_timeout, then app initialization - and what sort of responsiveness to a) existing and b) to new clients expect in each of the worker states? Those expectation could then be translated into (likely more verbose, less readable) test cases.

michdan commented 1 month ago

@benoitc thanks for review and sorry for delay. I have removed the code you suggested to remove. But I still think that this is not final fix for that issue but at least is a step foreward. Agree that idea to wait for all keepalive sessions is not good and would lead to other problems