benoitc / gunicorn

gunicorn 'Green Unicorn' is a WSGI HTTP Server for UNIX, fast clients and sleepy applications.
http://www.gunicorn.org

Backlog size monitoring #3049

Open gmldasilva opened 1 year ago

gmldasilva commented 1 year ago

Hello,

I hope you're doing well.

In my application, I have noticed that gunicorn is accumulating a large backlog of connections.

This delays the processing of client requests, because they have to wait in the backlog queue before actual processing begins. As a result, the overall client response time is adversely affected, which is concerning.

To tackle this issue and gain better control over client response time, I am interested in exploring the following approaches:

1) Monitoring the size of the gunicorn backlog: By keeping a close eye on the backlog size, we can identify periods of high demand and potentially adjust the server configuration accordingly to handle increased traffic more efficiently.

2) Monitoring the backlog request waiting time: Understanding the backlog request waiting time will help us pinpoint how long clients are waiting before their requests are taken up for processing. This information can be crucial in identifying potential bottlenecks and optimizing the application's performance.

Is either of these approaches possible? Is there another way to solve this issue?

Thank you for your assistance.

Best regards,

vladimir-avinkin commented 1 year ago

This is not currently possible in either gunicorn or Python itself.

Your best bet would be monitoring the TCP listen socket backlog, but that can be misleading with async workers such as uvicorn and gevent, which might block in their event loop implementations rather than when picking up sockets from the TCP backlog. The threading worker might also block after accepting a TCP socket, but I'm not sure.

And as far as I know, there is no way to monitor event loop utilization for either uvloop or gevent. For threads it's even weirder, since you basically have to monitor GIL utilization.

I think exposing TCP backlog metrics is a start, but I don't think you can comprehensively monitor Python HTTP servers at all, due to those limitations.
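
To make the TCP backlog idea concrete, here is a minimal sketch of reading the listen queue depth from outside gunicorn. It assumes Linux and an IPv4 listener, and relies on the kernel reporting the current accept-queue length in the `rx_queue` column of `/proc/net/tcp` for sockets in the LISTEN state. The function names and the port are hypothetical, and this is not a gunicorn API.

```python
from pathlib import Path

# Socket state code used in /proc/net/tcp for listening sockets.
TCP_LISTEN = "0A"


def listen_queue_depths(proc_net_tcp_text, port):
    """Return the accept-queue depth of each LISTEN socket bound to `port`.

    `proc_net_tcp_text` is the raw content of /proc/net/tcp. For LISTEN
    sockets, the rx_queue column is assumed to hold the number of
    connections waiting to be accepted (Linux-specific behaviour).
    """
    depths = []
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 5:
            continue
        local_addr, state, queues = fields[1], fields[3], fields[4]
        # The local address is "HEXIP:HEXPORT"; the port is hexadecimal.
        local_port = int(local_addr.rsplit(":", 1)[1], 16)
        if state != TCP_LISTEN or local_port != port:
            continue
        _tx_queue, rx_queue = queues.split(":")
        depths.append(int(rx_queue, 16))
    return depths


def read_backlog(port, path="/proc/net/tcp"):
    """Read the live table and return queue depths for `port` (Linux only)."""
    return listen_queue_depths(Path(path).read_text(), port)
```

Calling `read_backlog(8000)` periodically from a sidecar script and exporting the values would give a rough backlog-size metric for a server bound to port 8000. IPv6 listeners would need `/proc/net/tcp6` instead, and the caveats above still apply: the number only covers connections not yet accepted, not requests stalled inside a worker.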

snuderl commented 1 year ago

There is an example of a backlog monitor here: https://github.com/benoitc/gunicorn/pull/2407#issuecomment-1337903934