unbit / uwsgi

uWSGI application server container
http://projects.unbit.it/uwsgi
Other
3.46k stars 691 forks source link

Closing connection prematurely after POST response #2133

Open gward opened 4 years ago

gward commented 4 years ago

I have a tiny WSGI app that supports two requests, "POST /update" and "GET /data". It works fine at the command line, where every request uses a separate connection:

$ curl -i -X POST -d "blah blah" http://localhost:7020/update      
HTTP/1.1 201 Created
Content-Type: text/plain
Content-Length: 8

9 bytes

$ curl -i http://localhost:7020/data
HTTP/1.1 200 OK
Content-Type: text/plain

blah blah

But if I send those same two requests from a Python script that reuses the underlying TCP connection for both requests, it crashes. uwsgi closes the connection after sending the response to the POST, and then Python's http.client.HTTPConnection raises an exception:

Traceback (most recent call last):
  File "send2-httplib.py", line 32, in <module>
    resp = conn.getresponse()
  File "/usr/lib/python3.6/http/client.py", line 1346, in getresponse
    response.begin()
  File "/usr/lib/python3.6/http/client.py", line 307, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python3.6/http/client.py", line 276, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response

To try it yourself and see all my example code, the easiest way is to use a github repo I just created:

$ git clone https://github.com/gward/uwsgi-close.git
$ cd uwsgi-close
$ uwsgi uwsgi.ini

Leave that running, then in another terminal run

python3 send2-httplib.py

Expected outcome: send2-httplib sends the POST, then the GET, and both requests succeed.

Actual outcome: the POST succeeds, but the GET fails with the RemoteDisconnected error shown above.

gward commented 4 years ago

I can reproduce this with:

For the client, I have used http.client under Python 3.6 and 3.7.

We also saw similar behaviour with python-requests, but used http.client for the example code because 1) it behaves more consistently than requests and 2) no dependencies other than stdlib.

gward commented 4 years ago

uwsgi-close-packets

Screenshot of packets captured by wireshark. In words:

I don't see anything in the request or response headers that indicates that uwsgi should close the connection. The request headers are:

POST /update HTTP/1.1
Host: localhost:7020
Accept-Encoding: identity
Content-Length: 6
content-type: text/plain

and the reponse is

HTTP/1.1 201 Created
Content-Type: text/plain
Content-Length: 8

Request body is "hello\n", i.e. 6 bytes: correct Content-Length in request.

Response body is "6 bytes\n", i.e. 8 bytes: correct Content-Length in response.

awelzel commented 4 years ago

Hey @gward ,

I believe when using http-socket this is the expected outcome. Changing your config to the following should work as intended:

[uwsgi]
plugin = python3,http
http = :7020
http-keepalive = 1
wsgi-file = tinyapp-wsgi.py
manage-script-name = true

If you want to expose uwsgi via HTTP without a webserver in front, above configuration is recommended. Possibly check the docs for clarification: https://uwsgi-docs.readthedocs.io/en/latest/HTTP.html

Mind closing?

gward commented 4 years ago

I believe when using http-socket this is the expected outcome.

I've been reading RFC 7230 section 6 backwards and forwards. I can't point at one sentence that lets me claim "a-HA! uwsgi is wrong!", but I believe uwsgi should send "Connection: close" with the response in this case. From https://tools.ietf.org/html/rfc7230#section-6.3:

   HTTP/1.1 defaults to the use of "persistent connections", allowing
   multiple requests and responses to be carried over a single
   connection.  The "close" connection option is used to signal that a
   connection will not persist after the current request/response.  HTTP
   implementations SHOULD support persistent connections.

   A recipient determines whether a connection is persistent or not
   based on the most recently received message's protocol version and
   Connection header field (if any):

   o  If the "close" connection option is present, the connection will
      not persist after the current response; else,

   o  If the received protocol is HTTP/1.1 (or later), the connection
      will persist after the current response; else,

[...]

Since uwsgi sends a "HTTP/1.1 201 Created" response with no "Connection" header, we meet the second criterion here: the connection will persist. A little bit later in the RFC:

   A client MAY send additional requests on a persistent connection
   until it sends or receives a "close" connection option or receives an
   HTTP/1.0 response without a "keep-alive" connection option.

That makes it sound like my client is correct in re-using the existing connection.

Also: I will try your suggested config change -- thank you!

awelzel commented 4 years ago

I believe when using http-socket this is the expected outcome.

I've been reading RFC 7230 section 6 backwards and forwards. I can't point at one sentence that > lets me claim "a-HA! uwsgi is wrong!", but I believe uwsgi should send "Connection: close" with the > response in this case. From https://tools.ietf.org/html/rfc7230#section-6.3:

HTTP/1.1 defaults to the use of "persistent connections", allowing multiple requests and responses to be carried over a single [...]

I don't think you can expect http-socket to behave HTTP/1.1 compliant - compare with the discussion in #1097. There seems to be http11-socket as well - I tested it shortly and it behaves differently (and maybe better for your case), but I still think http + http-keepalive is what you want - I verified your test setup works with that.

Further, my understanding is that http-socket shouldn't be directly exposed to clients. Either use http, or nginx/haproxy in front.

gward commented 4 years ago

Changing your config to the following should work as intended

Confirmed: switching from http-socket to http with http-keepalive seems to work. I need to test in our real-life scenario, but at least it works with my tiny test case.

Further, my understanding is that http-socket shouldn't be directly exposed to clients

It's not. We have only seen this problem running the integration test for our app, which runs client and server in the same Docker container. In real life, our app is deployed in a container behind an AWS load balancer, and I don't think we have seen this problem there.

gward commented 4 years ago

I don't think you can expect http-socket to behave HTTP/1.1 compliant

Then it shouldn't claim to be HTTP/1.1 compliant. Take another look at what my test client logs:

send: b'POST /update HTTP/1.1\r\nHost: localhost:7020\r\nAccept-Encoding: identity\r\nContent-Length: 6\r\ncontent-type: text/plain\r\n\r\n'
reply: 'HTTP/1.1 201 Created\r\n'

Because uwsgi starts its reply with "HTTP/1.1", Python's http.client assumes that it implements HTTP/1.1 semantics: i.e. the connection can be reused as long as the server does not send "Connection: close". It doesn't, so http.client tries to reuse the connection and fails.

Note that I've seen similar behaviour with python-requests, so I don't think this is a bug in http.client.

I think there are two possible ways to change uwsgi to handle this case better:

awelzel commented 4 years ago

Because uwsgi starts its reply with "HTTP/1.1", Python's http.client assumes that it implements HTTP/1.1 semantics: i.e. the connection can be reused as long as the server does not send "Connection: close". It doesn't, so http.client tries to reuse the connection and fails.

Hmm, so my thinking is that a persistent connection may be closed at anytime, whether the server sent Connection: close or not. After a bit of googling - HTTP/1.1 section 8.1.4 (https://tools.ietf.org/html/rfc2616#section-8.1.4) seems relevant.

   [...]. Clients and servers SHOULD both
   constantly watch for the other side of the transport close, and
   respond to it as appropriate. [...]
   This means that clients, servers, and proxies MUST be able to recover
   from asynchronous close events. Client software SHOULD reopen [...]

uwsgi http-socket is just a special degenerated case of predictably closing the connection always. Now, I do agree adding Connection: close would be the right thing. It was not added when #1097 came up - not sure why, maybe save some bytes? Maybe because --http is good enough? :man_shrugging: Want to open a PR?

There was a Python issue suggesting to implement this reopen/retry "automatically", but it seems that it was decided that's to be done on a higher level and the exception therefore propagated (https://bugs.python.org/issue3566#msg234749).

gward commented 4 years ago

Now, I do agree adding Connection: close would be the right thing. [...] Want to open a PR?

Sure! I'm totally unfamiliar with uwsgi source code, though. I looked around a bit the other day and was not sure where to start. So can you give me a hint, and I'll give it a shot?

rdeioris commented 4 years ago

as @awelzel said, you must (not should ;) not use http-socket for an application directly exposed to the public network. there are dozens of attack vectors you can use to bring down the process.

The status code you see is taken blindly from the SERVER_PROTOCOL cgi variable, that (for frontend servers) used to be to set to HTTP/1.0. (nowadays there are various forms of supported persistent connections in proxies, so eventually you can use http11-socket, but the benefits are quite questionable).

By the way, you can add the Connection: close headers from your app (start_response callback in the WSGI case), or directly via uWSGI with the option:

add-header = Connection: close

awelzel commented 4 years ago

add-header = Connection: close

Maybe this should be mentioned as recommended setting whenever --http-socket is used? I'd assume even a loadbalancer/reverse proxy may "appreciate" it being told that the connection is going away.