pwsm / httplib2

Automatically exported from code.google.com/p/httplib2

Slow performance with multiple requests over keep-alive HTTP connection #91

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
(As issue 28 is closed as invalid, I'm opening a new issue. I'll try
to demonstrate why it's not invalid.)

When performing multiple requests over a single HTTP connection (using
HTTP/1.1 keepalive), the performance is very bad.

I figured this out while sending many documents to a CouchDB server in
a single connection using couchdb-python [1]. It uses httplib2 for
HTTP communication.

I was able to reproduce the problem with this minimal setup:

testserver.py (attached) sets up a TCP socket, accepts a connection
and reads requests and writes responses. It doesn't really understand
HTTP, it just reads the socket until it gets 'ping'. Then it responds
with dummy HTTP headers and 'pong' in the body.
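
The attachment itself is not reproduced in this thread. As a rough sketch of what such a server could look like (the names `RESPONSE`, `serve_one_connection` and `start_server` are hypothetical, and the real testserver.py may differ):

```python
import socket
import threading

# Dummy HTTP response with 'pong' as the body.
RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/plain\r\n"
    b"Content-Length: 4\r\n"
    b"\r\n"
    b"pong"
)


def serve_one_connection(listener):
    """Accept one connection and answer every 'ping' with RESPONSE."""
    conn, _addr = listener.accept()
    with conn:
        buf = b""
        while True:
            chunk = conn.recv(4096)
            if not chunk:  # client closed the connection
                break
            buf += chunk
            # No real HTTP parsing: a request counts as complete
            # as soon as 'ping' shows up in the stream.
            while b"ping" in buf:
                buf = buf.split(b"ping", 1)[1]
                conn.sendall(RESPONSE)


def start_server(host="127.0.0.1", port=0):
    """Run the server in a background thread; return the bound (host, port)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(1)
    threading.Thread(
        target=serve_one_connection, args=(listener,), daemon=True
    ).start()
    return listener.getsockname()
```

Passing `port=0` lets the OS pick a free port, which is an assumption made here for convenience and not something the original description states.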

testclient.py (attached) sets up an httplib2.Http object and sends five
hundred POST requests with body='ping' to the testserver. Because of
HTTP keep-alive, all the requests use the same connection.

Test results with httplib2 trunk:

    $ time python testclient.py

    real    0m20.082s
    user    0m0.040s
    sys     0m0.016s

Test results with the attached patch:

    $ time python testclient.py

    real    0m0.215s
    user    0m0.160s
    sys     0m0.048s

That is from about 25 requests/sec to roughly 2300 requests/sec (500
requests in 20.082 s versus 0.215 s). Quite a performance boost :)

The slow performance seems to be caused by the bad interaction
between the TCP Nagle algorithm and delayed ACKs. Both are features
of TCP. There's an explanation by John Nagle [2].

I'll try to explain it here myself, too. The problem shows up when a
client does two writes followed by one read over a TCP connection.
httplib acts just like this: it first sends the headers, then the
body, and then reads the response.

On the client side:

1. Client sends the HTTP headers -> Packet leaves immediately

2. Client sends the body -> Due to the Nagle algorithm, the packet
   doesn't leave before the server has ACKed the first packet.

   After 50-500ms, the ACK is received and the packet leaves.

3. Client receives the response.

On the server side:

1. Server receives the HTTP headers

2. Server starts waiting for the body. There's no write here, so the
   ACK to the packet received in step 1 is delayed 50-500ms. After the
   ACK has left, the client sends the body.

3. Server receives the body

4. Server sends the response
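
The write, write, read pattern from the two lists can be sketched with plain sockets. This is only an illustration under assumptions: `timed_requests` and `_pong_server` are hypothetical names, the tiny in-process server stands in for testserver.py, and whether the delay actually appears depends on the operating system's delayed-ACK behaviour.

```python
import socket
import threading
import time

HEADERS = b"POST / HTTP/1.1\r\nHost: localhost\r\nContent-Length: 4\r\n\r\n"
BODY = b"ping"
REPLY = b"HTTP/1.1 200 OK\r\nContent-Length: 4\r\n\r\npong"


def _pong_server(listener):
    # Hypothetical stand-in for testserver.py: reply once per 'ping' seen.
    conn, _ = listener.accept()
    with conn:
        buf = b""
        while True:
            chunk = conn.recv(4096)
            if not chunk:
                return
            buf += chunk
            while BODY in buf:
                buf = buf.split(BODY, 1)[1]
                conn.sendall(REPLY)


def timed_requests(count, nodelay):
    """Send `count` requests as write, write, read; return elapsed seconds."""
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    threading.Thread(target=_pong_server, args=(listener,), daemon=True).start()

    client = socket.create_connection(listener.getsockname())
    if nodelay:
        client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    start = time.perf_counter()
    for _ in range(count):
        client.sendall(HEADERS)  # write 1: headers
        client.sendall(BODY)     # write 2: body, may be held back by Nagle
        received = b""
        while len(received) < len(REPLY):  # read: wait for the full reply
            data = client.recv(4096)
            if not data:
                raise ConnectionError("server closed the connection")
            received += data
    elapsed = time.perf_counter() - start
    client.close()
    listener.close()
    return elapsed
```

Comparing `timed_requests(500, nodelay=False)` with `timed_requests(500, nodelay=True)` is one way to check whether a given machine shows the stall described here.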

This can be fixed on the client side by disabling the Nagle algorithm,
and thus not waiting for the ACK for the packet that contains the HTTP
headers. It's just a matter of setting one socket option:

    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
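
As a small self-contained sketch of that option in context (the helper name `make_nodelay_socket` is hypothetical and is not part of httplib2), the setting can be applied and then read back with `getsockopt`:

```python
import socket


def make_nodelay_socket():
    """Create a TCP socket with the Nagle algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = make_nodelay_socket()
# A non-zero value means TCP_NODELAY is set on this socket.
enabled = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
sock.close()
```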

For some reason (that's not apparent to me), this is only a problem
when sending many requests over the same TCP connection. There may be
some optimizations in the kernel or in Python's httplib that
circumvent the problem when each connection carries only one request.

The httplib2 tests don't run faster after fixing this because, as far
as I can see, they don't make multiple POST or PUT requests over a
single TCP connection.

[1] http://code.google.com/p/couchdb-python/
[2] http://developers.slashdot.org/comments.pl?sid=174457&threshold=1&commentsort=0&mode=thread&cid=14515105

Original issue reported on code.google.com by akhern on 3 Feb 2010 at 12:36


GoogleCodeExporter commented 8 years ago
Fix committed in 141:f900367d947c. 

Original comment by joe.gregorio@gmail.com on 3 Feb 2010 at 1:42

GoogleCodeExporter commented 8 years ago
Nagle vs. delayed ACK issues have been around for a while. The correct way to
fix this is NOT to set TCP_NODELAY but to fix httplib2 to issue the header and
body as a combined write. This way small commands are issued in a single packet
as the gods of TCP intended. Large commands are sent in multiple packets right
away as Nagle doesn't apply in that case.

The TCP_NODELAY fix is a workaround, but it results in inefficient bandwidth use.
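
A minimal sketch of the combined-write idea, assuming the headers and body are already available as bytes (`send_request` is a hypothetical helper, not httplib2's actual code path):

```python
import socket


def send_request(sock, headers, body):
    """Send headers and body with a single write instead of two."""
    # One sendall() call hands the whole request to the kernel at once,
    # so a small request is not split into a header write and a body write.
    sock.sendall(headers + body)


# Demonstrate on a connected socket pair.
a, b = socket.socketpair()
send_request(a, b"POST / HTTP/1.1\r\nContent-Length: 4\r\n\r\n", b"ping")
a.close()

received = b""
while True:
    chunk = b.recv(4096)
    if not chunk:  # peer closed: everything has been read
        break
    received += chunk
b.close()
```

The trade-off is one extra copy when concatenating the two byte strings, which should be negligible for small requests.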

Original comment by jan%data...@gtempaccount.com on 12 Jul 2010 at 11:52