igrigorik / em-http-request

Asynchronous HTTP Client (EventMachine + Ruby)
1.22k stars 220 forks source link

empty body on GET responses that Upgrade #333

Closed doriantaylor closed 5 years ago

doriantaylor commented 5 years ago

Currently using em-http 1.1.5 to write something to rip through and harvest metadata from a bunch of links and it silently failed on one of them (hi @aristus), so I broke it out into a copy of the getting started example to see it would replicate:

require 'eventmachine'
require 'em-http'

EventMachine.run {
  http = EventMachine::HttpRequest.new('http://carlos.bueno.org/about.html').get

  http.errback { p 'Uh oh'; EM.stop }
  http.callback {
    p http.response_header.status
    p http.response

    EventMachine.stop
  }
}

Sure enough it does; it returns:

200
""

Wireshark says the data goes over the connection as expected. Response headers are

HTTP/1.1 200 OK
Date: Sat, 15 Jun 2019 00:39:56 GMT
Server: Apache
Upgrade: h2
Connection: Upgrade, close
Last-Modified: Fri, 13 May 2016 00:27:58 GMT
ETag: "10d4-532ae58bc7f80"
Accept-Ranges: bytes
Content-Length: 4308
Vary: Accept-Encoding
Content-Type: text/html

The only thing slightly unusual is the Upgrade header and Connection: Upgrade, close, which seems to replicate an empty body when I enable mod_http2 on my own Apache. Not sure where to look for a diagnosis, but again this is 1.1.5.

doriantaylor commented 5 years ago

okay it appears the bug is somewhere in between Http::Parser and the C code, unless there is some configuration that can be twiddled. It seems like on_body is never called if there's an Upgrade: header.

doriantaylor commented 5 years ago

ah yes looks like this squarely belongs to http_parser.rb. neeeeeeever mind!

igrigorik commented 5 years ago

👍

On Fri, Jun 14, 2019, 11:38 PM Dorian Taylor notifications@github.com wrote:

ah yes looks like this squarely belongs to http_parser.rb https://github.com/tmm1/http_parser.rb. neeeeeeever mind!

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/igrigorik/em-http-request/issues/333?email_source=notifications&email_token=AAACTHBENMUP6ZNETL3I3QLP2SE7DA5CNFSM4HYODPZKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODXYRVJI#issuecomment-502340261, or mute the thread https://github.com/notifications/unsubscribe-auth/AAACTHET33ZP56JDCFMODQTP2SE7DANCNFSM4HYODPZA .

doriantaylor commented 5 years ago

Heads up, I compiled Http::Parser against the latest Joyent parser and it fixed my issue but yours will probably need some massaging around keepalive connections and protocol upgrading, as I'm noticing EPROTO errors on some SSL connections but not others when I have keepalive turned on. Good to keep an eye out when they release a fix.