benoitc / gunicorn

gunicorn 'Green Unicorn' is a WSGI HTTP Server for UNIX, fast clients and sleepy applications.
http://www.gunicorn.org

Adding support for the uwsgi protocol #2806

Open satchamo opened 2 years ago

satchamo commented 2 years ago

The uwsgi protocol basically packs the request variables into a list of key/value pairs, each prefixed with its size, to make parsing "fast". Nginx supports it via the ngx_http_uwsgi_module.
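To make the framing concrete, here is a minimal sketch of packing and parsing such a packet. It assumes the layout used by the test script later in this thread: a 4-byte header (modifier1, little-endian 16-bit body size, modifier2) followed by size-prefixed keys and values. The function names `encode_vars` and `decode_vars` are made up for illustration and are not part of Gunicorn or uWSGI, and the parser does no validation of malformed input.

```python
import struct


def encode_vars(fields):
    """Hypothetical helper: pack str -> str pairs into a uwsgi-style packet."""
    body = b""
    for key, value in fields.items():
        k, v = key.encode("utf8"), value.encode("utf8")
        # Each item is: size (unsigned 16-bit, little-endian), then the bytes
        body += struct.pack("<H", len(k)) + k + struct.pack("<H", len(v)) + v
    # Header: modifier1 (1 byte), body size (2 bytes), modifier2 (1 byte)
    return struct.pack("<BHB", 0, len(body), 0) + body


def decode_vars(packet):
    """Hypothetical helper: parse a packet from encode_vars back into a dict."""
    _modifier1, size, _modifier2 = struct.unpack_from("<BHB", packet, 0)
    body = packet[4:4 + size]
    if len(body) != size:
        raise ValueError("truncated packet")
    fields, pos = {}, 0
    while pos < size:
        pair = []
        for _ in range(2):  # read the key, then the value
            (length,) = struct.unpack_from("<H", body, pos)
            pos += 2
            pair.append(body[pos:pos + length].decode("utf8"))
            pos += length
        fields[pair[0]] = pair[1]
    return fields


if __name__ == "__main__":
    packet = encode_vars({"REQUEST_METHOD": "GET", "PATH_INFO": "/"})
    print(decode_vars(packet))
```

Because every key and value carries its own length, the parser never has to scan for delimiters, which is where the "fast" claim comes from.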

I hacked up the Gunicorn source, and can now successfully send GET and POST requests using the uwsgi protocol in Nginx. The hack job was ~50 lines, and probably has at least a dozen problems.

Before putting more effort into it, I wanted to see if the Gunicorn developers were open to supporting the protocol.

Uwsgi sells itself as a faster protocol than HTTP, but I doubt it will make much of a difference. The primary benefit for me is that REMOTE_ADDR can be set without mucking up the application layer (which, admittedly, doesn't sound like a very compelling reason to muck up the Gunicorn source).
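For contrast, when proxying over plain HTTP to a unix socket, the application usually has to recover the client address from a header the proxy sets. A rough sketch of that kind of shim, assuming Nginx is configured with something like `proxy_set_header X-Real-IP $remote_addr;` (the class name and the choice of header are illustrative, not part of Gunicorn):

```python
class RealIPMiddleware:
    """Hypothetical WSGI wrapper: copy a proxy-supplied header into REMOTE_ADDR."""

    def __init__(self, app, header="HTTP_X_REAL_IP"):
        self.app = app
        self.header = header  # WSGI environ key for the X-Real-IP header

    def __call__(self, environ, start_response):
        real_ip = environ.get(self.header)
        if real_ip:
            # Only safe if the header can only come from a trusted proxy
            environ["REMOTE_ADDR"] = real_ip
        return self.app(environ, start_response)
```

With the uwsgi protocol, Nginx sends REMOTE_ADDR as a request variable, so this extra layer in the application would not be needed.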

Here's a proof of concept: https://github.com/benoitc/gunicorn/compare/master...satchamo:master

What say you?

benoitc commented 2 years ago

Looks interesting. Do you have any example of usage of this protocol?

satchamo commented 2 years ago

Sure. You can set up a dummy app in Nginx with a server directive like this:

    server {
        listen 7000;

        location / {
            # this is usually replaced with 'include uwsgi_params;'
            # I'm just manually including each uwsgi_param for illustrative purposes
            uwsgi_param  QUERY_STRING       $query_string;
            uwsgi_param  REQUEST_METHOD     $request_method;
            uwsgi_param  CONTENT_TYPE       $content_type;
            uwsgi_param  CONTENT_LENGTH     $content_length;

            uwsgi_param  REQUEST_URI        $request_uri;
            uwsgi_param  PATH_INFO          $document_uri;
            uwsgi_param  DOCUMENT_ROOT      $document_root;
            uwsgi_param  SERVER_PROTOCOL    $server_protocol;
            uwsgi_param  REQUEST_SCHEME     $scheme;
            uwsgi_param  HTTPS              $https if_not_empty;

            uwsgi_param  REMOTE_ADDR        $remote_addr;
            uwsgi_param  REMOTE_PORT        $remote_port;
            uwsgi_param  SERVER_PORT        $server_port;
            uwsgi_param  SERVER_NAME        $server_name;

            uwsgi_pass   unix:/tmp/uwsgi.socket;
        }
    }

Then write a standard WSGI app (my_app.py):

def application(environ, start_response):
    status = '200 OK'
    response = "\n".join(f"{key}: {value}" for key, value in environ.items()).encode("utf8")
    headers = [
        ('Content-type', 'text/plain; charset=utf-8'),
        ("Content-Length", str(len(response)))
    ]

    start_response(status, headers)

    return [response]

Then run gunicorn from the proof-of-concept branch (the --uwsgi flag comes from that branch): python3 -m gunicorn --log-level debug --uwsgi --bind unix:/tmp/uwsgi.socket my_app:application

And curl it: curl http://127.0.0.1:7000/

If you really want to see the protocol in action at a lower level, you can use this (uwsgi_test.py):

import struct
import socket
import argparse

def query(socket_path, host):
    fields = {
        'QUERY_STRING': '',
        'REQUEST_METHOD': 'GET',
        'CONTENT_TYPE': '',
        'CONTENT_LENGTH': '',
        'REQUEST_URI': '/',
        'PATH_INFO': '/',
        'DOCUMENT_ROOT': '/usr/local/openresty/nginx/html',
        'SERVER_PROTOCOL': 'HTTP/1.1',
        'REQUEST_SCHEME': 'https',
        'HTTPS': 'on',
        'REMOTE_ADDR': '10.0.0.138',
        'REMOTE_PORT': '53204',
        'SERVER_PORT': '443',
        'SERVER_NAME': host,
        'HTTP_HOST' : host
    }

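    # Pack each variable as: key size (u16 LE), key, value size (u16 LE), value
    data = b""
    for k, v in fields.items():
        k = k.encode("utf8")
        v = v.encode("utf8")
        data += struct.pack(f"<H{len(k)}sH{len(v)}s", len(k), k, len(v), v)

    # 4-byte header: modifier1, payload size (u16 LE), modifier2
    data = b"\x00" + struct.pack("<H", len(data)) + b"\x00" + data

    # The context manager closes the socket even if connect/recv fails
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(2)
        sock.connect(socket_path)
        sock.sendall(data)
        # Collect raw bytes and decode once at the end, so a multi-byte
        # character split across two recv() calls cannot break decoding
        chunks = []
        while True:
            d = sock.recv(8192)
            if not d:
                break
            chunks.append(d)

    print(b"".join(chunks).decode("utf8", errors="replace"))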

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Hit a uwsgi socket')
    parser.add_argument('--host', required=True, help='Hostname')
    # --phrase is accepted but not used by this script yet
    parser.add_argument('--phrase', help='The phrase to look for in the response')
    parser.add_argument('--path', required=True, help='The path to the socket')
    args = parser.parse_args()
    query(args.path, args.host)

And then run: python3 uwsgi_test.py --host example.com --path /tmp/uwsgi.socket

SchoolGuy commented 2 years ago

I found this feature request because I assumed that Gunicorn could act as a drop-in replacement for mod_wsgi. Since the app I am using currently depends on REQUEST_URI (Gunicorn provides RAW_URI instead), full support for this would be a very nice feature.
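In the meantime, one possible workaround might be a small WSGI wrapper that copies RAW_URI into REQUEST_URI. This is only a sketch: it assumes the app reads nothing but REQUEST_URI from the environ, and the class name is made up for illustration.

```python
class RequestURIMiddleware:
    """Hypothetical WSGI wrapper: expose Gunicorn's RAW_URI as REQUEST_URI."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # Leave REQUEST_URI alone if a front end already provided it
        if "REQUEST_URI" not in environ and "RAW_URI" in environ:
            environ["REQUEST_URI"] = environ["RAW_URI"]
        return self.app(environ, start_response)
```

The wrapped object can then be passed to Gunicorn in place of the original application callable.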