httprb / http

HTTP (The Gem! a.k.a. http.rb) - a fast Ruby HTTP client with a chainable API, streaming support, and timeouts
MIT License

Connection closes even though it's supposed to be a persistent connection. #372

Open ccoenen opened 8 years ago

ccoenen commented 8 years ago

I have recorded this in Wireshark. I am talking to an API (Rails, served via NGINX). For some reason, the persistent connection is not persistent:

GET /endpoint/updated_at?param1=a HTTP/1.1
Accept: application/json
Connection: Keep-Alive
Host: something.internal
User-Agent: http.rb/2.0.3
Content-Length: 0

HTTP/1.1 200 OK
Server: nginx
Date: Wed, 31 Aug 2016 22:26:59 GMT
Content-Type: application/json; charset=utf-8
Transfer-Encoding: chunked
Connection: keep-alive
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 1; mode=block
X-Content-Type-Options: nosniff
ETag: W/"5eef64b6ae16433786f47e7c554d6045"
Cache-Control: max-age=0, private, must-revalidate
X-Request-Id: 12180fc6-1c12-4b4f-998c-df42569ce6eb
X-Runtime: 0.006487

19
{"updated_at":1470393437}
0

It is a chunked response that I can parse correctly. But right after the response, my http.rb client sends a TCP FIN packet, closing the connection. This happens in frame 47386 of the following screenshot.

[Screenshot: Wireshark capture, "http rb stream"]
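For reference, the chunk framing in the capture above can be checked by hand: the size line `19` is hexadecimal for 25, which matches the 25-byte JSON body, and the `0` chunk ends the response. The following is a minimal sketch of that decoding, not http.rb's parser. The method name `decode_chunked` is made up for illustration, and the sketch assumes the whole body is already in memory and ignores chunk extensions and trailers.

```ruby
# Decode a complete HTTP/1.1 chunked body held in a single string.
# Hypothetical helper for illustration; ignores chunk extensions and trailers.
def decode_chunked(raw)
  raw = raw.b          # work on raw bytes so offsets and sizes agree
  body = "".b
  pos = 0
  loop do
    line_end = raw.index("\r\n", pos)
    raise ArgumentError, "missing chunk size line" unless line_end

    size = Integer(raw[pos...line_end], 16) # chunk sizes are hexadecimal
    break if size.zero?                     # a zero-size chunk ends the body

    start = line_end + 2
    body << raw[start, size]
    pos = start + size + 2                  # skip the data and its trailing CRLF
  end
  body
end

raw = "19\r\n{\"updated_at\":1470393437}\r\n0\r\n\r\n"
puts decode_chunked(raw)
```

Nothing in this framing asks the client to close the connection, which is consistent with the `Connection: keep-alive` header in the response.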

I am not (to my knowledge) closing the connection manually. I am using the HTTP.persistent API. I write a line to a logfile whenever I create such a persistent object, and the logfile contains the expected number of lines. However, Wireshark reports lots and lots of HTTP connections that are never reused.

Here's the code involved. It's wrapped in a https://github.com/mperham/connection_pool, which is used from a Celluloid application.

        def query_api(url, query = {})
          response = @http
            .headers(accept: "application/json")
            .get(url, params: query)

          unless (200..207).cover? response.code
            raise HTTPStatusCodeError.new("Request to #{url} failed (#{response.code}): #{response.body}")
          end

          response.to_s
        rescue *TOLERATED_CONNECTION_ERRORS => e
          response.flush if response
          raise APIError.new(e.to_s)
        end

        def reinitialize_http_connection
          self.logger.info "(re-)initializing scs http connection for #{Thread.current}"
          @http = HTTP.persistent @api_url
        end

I would appreciate pointers where to start debugging.

tarcieri commented 8 years ago

These are the main triggers for closing a persistent connection:

https://github.com/httprb/http/blob/master/lib/http/client.rb#L95

Does this happen every time, or is it spurious? If it's spurious, it's probably http.rb trying to handle and gracefully recover from closed connection errors

ccoenen commented 8 years ago

As far as I can tell from Wireshark, this connection close happens every time. I'll look into the triggers a little later; I am on a different project this week.

tarcieri commented 8 years ago

The codepaths I linked are the only ones I'm aware of that would trigger a connection close. Perhaps you're trying to reuse a connection that's in a dirty state? (although looking at your code, I'm not sure where that would happen)

britishtea commented 7 years ago

Ran into the same issue today. Here's what I believe is happening:

  1. When the headers are set, HTTP::Chainable#headers is invoked.
  2. HTTP::Chainable#headers invokes HTTP::Chainable#branch, which creates a fresh HTTP::Client and a fresh socket.
  3. The request is done using the new client.
  4. Since this new client is not referenced anywhere, the garbage collector is free to collect it and close the socket.

Essentially, a new "persistent" connection is created (and closed) for each request when using one of the HTTP::Chainable methods (save for #request, which is implemented directly on HTTP::Client). To get the behaviour you want, you'll have to pass the headers to #get directly: response = @http.get(url, params: query, headers: { accept: "application/json" }).

It would be good to document this very clearly in the wiki page on persistent connections, since it's not at all obvious this is what's going on.
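The behaviour described in this comment can be illustrated with a toy model. This is a sketch only, not http.rb's actual code: `ToyClient` and `ToyConnection` are hypothetical classes that mimic just the one property that matters here, namely that a chainable call returns a new client, and each client lazily opens its own connection.

```ruby
# Toy model, NOT http.rb's real implementation. It only mimics the
# "chainable call returns a fresh client" behaviour described above.
class ToyConnection
  @opened = 0
  class << self
    attr_accessor :opened # counts how many connections were ever opened
  end

  def initialize
    self.class.opened += 1
  end
end

class ToyClient
  def initialize(default_headers = {})
    @default_headers = default_headers
    @connection = nil
  end

  # Chainable-style helper: returns a NEW client with merged defaults.
  def headers(extra)
    ToyClient.new(@default_headers.merge(extra))
  end

  # Opens a connection lazily, then reuses it for later requests on this client.
  def get(_path, headers: {})
    @connection ||= ToyConnection.new
    @default_headers.merge(headers) # stands in for "the request that was sent"
  end
end

client = ToyClient.new

# Chaining inside the loop: every iteration builds a new client and connection.
3.times { client.headers(accept: "application/json").get("/endpoint") }
chained = ToyConnection.opened

# Per-request headers on the same client: one connection, reused.
ToyConnection.opened = 0
3.times { client.get("/endpoint", headers: { accept: "application/json" }) }
direct = ToyConnection.opened

puts "chained: #{chained}, direct: #{direct}"
```

In this model the chained form opens one connection per request, while passing the headers per request keeps everything on a single client and a single connection.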

justin-lavelle commented 6 years ago

@britishtea Thank you for your comment describing the issue. I just started using this lib and was getting quite frustrated trying to figure out why persistent connections were not working.

eflukx commented 5 years ago

This opens multiple connections (in my app, so many that it exhausted the maximum number of file descriptors!):

conn = HTTP.persistent('http://example.com')

(1..20).map do |i|
  conn.basic_auth(:user => username, :pass => password)
      .post('/post_path/', :body => "Number #{i}")
end

whilst this seems to work as expected:

conn = HTTP.persistent('http://example.com')
           .basic_auth(:user => username, :pass => password)

(1..20).map { |i| conn.post('/post_path/', :body => "Number #{i}") }
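Applied to the code from the original post, the same idea might look like the sketch below: configure the persistent client once, keep a reference to it, and make no chainable calls per request. `ApiClient`, `DEFAULT_FACTORY` and the `factory:` parameter are hypothetical names introduced for illustration. The default factory assumes the http gem is installed and that, as observed in this thread, a client configured once via `HTTP.persistent(...).headers(...)` keeps reusing its connection.

```ruby
# Sketch only: build the configured persistent client once and reuse it.
class ApiClient
  # Default factory (assumption: the http gem is available). The gem is
  # only loaded when the factory is actually called.
  DEFAULT_FACTORY = lambda do |api_url|
    require "http"
    HTTP.persistent(api_url).headers(accept: "application/json")
  end

  def initialize(api_url, factory: DEFAULT_FACTORY)
    @api_url = api_url
    @factory = factory
  end

  def query_api(path, query = {})
    # No chainable calls here, so the same client object is used every time.
    response = http.get(path, params: query)
    # Reading the full body before returning leaves the connection ready
    # for the next request.
    response.to_s
  end

  private

  # Memoised client: the factory runs once per ApiClient instance.
  def http
    @http ||= @factory.call(@api_url)
  end
end

# Usage (needs the http gem and a reachable server):
#   api = ApiClient.new("http://example.com")
#   api.query_api("/endpoint/updated_at", param1: "a")
```

Injecting the factory is a design choice for this sketch: it keeps the class loadable without the gem and makes it easy to check that the client is built only once.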