benoitc / gunicorn

gunicorn 'Green Unicorn' is a WSGI HTTP Server for UNIX, fast clients and sleepy applications.
http://www.gunicorn.org

Clarify Gunicorn's advice to use Nginx proxy #3240

Open vanschelven opened 3 weeks ago

vanschelven commented 3 weeks ago

Gunicorn currently opens its deployment advice with a blanket recommendation for a proxy server, particularly Nginx:

We strongly recommend using Gunicorn behind a proxy server. [..] we strongly advise that you use Nginx.

The main reason provided is:

you need to make sure that it buffers slow clients when you use default Gunicorn workers. Without this buffering Gunicorn will be easily susceptible to denial-of-service attacks.

However:

If you want to be able to handle streaming request/responses or other fancy features like Comet, Long polling, or Web sockets, you need to turn off the proxy buffering. When you do this you must run with one of the async worker classes.

It would be nice if the blanket recommendation for Nginx could be broken up into parts, so that the particular trade-offs and risks become clearer.

In particular:

  1. The only reason I could find in the docs to use Nginx is the denial-of-service attack through slow clients. Are there other reasons that Nginx is desirable? If so, it would be nice to list them, such that individuals can decide whether those reasons apply to them.*

  2. Turning off request buffering is an (obvious) requirement when handling requests in a streaming manner (or long polling, etc.). Given that doing so puts us right back at the DoS that the first line of the docs warns about, can we be sure that "run[ning] with one of the async worker classes" is enough of a mitigation, and if so, why†? Also: if an async worker is sufficient on its own, wouldn't it be better to say so in the opening line of the document ("run with an async worker, or run with the sync worker behind a proxy")?

  3. More generally, isn't the need for a proxy almost entirely determined by the worker class? It's the worker class that does all of the reading/writing to sockets, which is the weak point with respect to DoS, right? E.g. the uvicorn docs state that "Using Nginx as a proxy in front of your Uvicorn processes may not be necessary".

  4. The docs mention Hey as a mechanism to test susceptibility to slow clients, but I cannot find a command-line argument that makes Hey specifically act as a slow client. Wouldn't it be better to point to e.g. slowloris.py? (A minimal sketch of what I mean by "slow client" follows this list.)
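For concreteness, here is a crude sketch of the kind of slow client I have in mind (not slowloris.py itself; the host and port are just Gunicorn's defaults for a local test):

```python
# slow_client.py -- trickle a request out one byte per second so the
# connection stays open for minutes, tying up a sync worker the whole time.
# Assumes Gunicorn is listening on its default 127.0.0.1:8000.
import socket
import time

request = (
    b"GET / HTTP/1.1\r\n"
    b"Host: 127.0.0.1\r\n"
    b"User-Agent: slow-client-test\r\n"
    b"\r\n"
)

sock = socket.create_connection(("127.0.0.1", 8000))
for byte in request:
    sock.sendall(bytes([byte]))  # one byte at a time...
    time.sleep(1)                # ...with a pause in between
print(sock.recv(4096))           # whatever response eventually arrives
sock.close()
```

Run a handful of these in parallel and the default sync workers are quickly tied up while ordinary requests queue up behind them; that is the scenario the docs could actually demonstrate.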

=== *A partial answer to this question that I can come up with myself is:

a. virtual hosts
b. serving static files
c. SSL configuration (Gunicorn provides SSL, but perhaps not as configurable?)
d. "general hardening" (parsing of HTTP?)
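On (c), for comparison: as far as I can tell the TLS-related settings Gunicorn itself exposes are roughly the following (a sketch; the certificate paths are placeholders), which is a much smaller surface than Nginx's ssl_* directives:

```python
# gunicorn.conf.py -- the TLS-related settings Gunicorn exposes directly
# (certificate paths are placeholders).
import ssl

certfile = "/etc/ssl/example/fullchain.pem"
keyfile = "/etc/ssl/example/privkey.pem"
ciphers = "ECDHE+AESGCM"                       # OpenSSL cipher string
ca_certs = "/etc/ssl/example/clients-ca.pem"   # only relevant when verifying client certificates
cert_reqs = ssl.CERT_REQUIRED                  # request and verify client certificates
```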

†My guess, based on my reading of the code, is: because they do not block on read, and do not spawn a thread/process per request. But this raises the question: is there anything that could or should be configured, such as the number of connections to accept per worker, or the request timeout? All of this should be explained in the docs.
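To make that concrete: I would expect the docs to spell out whether a configuration along these lines (a sketch with hypothetical values; assumes gevent is installed, and "myproject.wsgi:application" is a placeholder) counts as an adequate mitigation without a buffering proxy in front, and which of these knobs actually matter:

```python
# gunicorn.conf.py -- sketch of the settings I suspect are relevant when
# relying on an async worker instead of a buffering proxy (values are
# hypothetical, the application path is a placeholder).
wsgi_app = "myproject.wsgi:application"
worker_class = "gevent"    # async worker: a slow read/write no longer ties up a whole worker
workers = 4
worker_connections = 1000  # max simultaneous clients per gevent/eventlet worker;
                           # slow clients count against this limit
timeout = 30               # workers silent for longer than this are killed and restarted
keepalive = 5              # seconds an idle keep-alive connection is held open
```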

vanschelven commented 3 weeks ago

[5] How does the (accidental or adversarial) DoS through slow clients compare to other DoS mechanisms against Nginx, or against the async workers? (I.e. is this an especially easy, or the only realistic, attack?)

pajod commented 3 weeks ago

Plus, now that we briefly had some fantasies about HTTP/2 speeding up the web and may soon see practical, feature-complete implementations of HTTP/3 in widespread browsers, there is a new argument in favour of proxying that did not exist back when this was originally written: a proxy can terminate HTTP/2/HTTP/3 and speak plain HTTP/1.1 to Gunicorn, which does not implement those protocols itself.

vanschelven commented 1 week ago

e. run your web server on a privileged port (80, 443) as root, while dropping to a less privileged user to run the actual application code
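Though, as with SSL, Gunicorn seems to have an answer of its own here: started as root, it can bind the privileged port itself and then drop the worker processes to an unprivileged user via its user/group settings. A sketch, with placeholder names and paths:

```python
# gunicorn.conf.py -- bind a privileged port as root, then have the worker
# processes drop to an unprivileged user (user/group and paths are placeholders).
bind = "0.0.0.0:443"
user = "www-data"     # workers setuid/setgid to this account after the socket is bound
group = "www-data"
certfile = "/etc/ssl/example/fullchain.pem"
keyfile = "/etc/ssl/example/privkey.pem"
# Note: the master process itself keeps running as root.
```

So the question remains how much of (e) is really an argument for a proxy, rather than for careful Gunicorn configuration.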