ferd / dlhttpc

dispcount-based lhttpc fork for massive amounts of requests to limited endpoints

Fix port leak #8

Closed · loucash closed this 9 years ago

loucash commented 9 years ago

@ferd Hey. If the send_retry option is set, then in dlhttpc_client:send_request/1 (line 194), when we get {error, closed} we retry: we close the current socket and create a new one.

Later we try to check this new socket back in with the dispcount handler, so first we change its controlling process to the dispcount_watcher, and then... the socket is not accepted (because it is a new one). And that's a leak.
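
Roughly, the retry path looks like this (a simplified sketch, not the actual dlhttpc_client code; the function name, host, and connect options are illustrative):

```erlang
%% Simplified sketch of the retry path described above. The real logic
%% lives in dlhttpc_client:send_request/1; the function name, host and
%% connect options here are illustrative only.
send_with_retry(Socket, Request, Ssl) ->
    case dlhttpc_sock:send(Socket, Request, Ssl) of
        ok ->
            {ok, Socket};
        {error, closed} ->
            %% send_retry is set: close the dead socket and retry on a
            %% brand new connection.
            dlhttpc_sock:close(Socket, Ssl),
            {ok, Fresh} = gen_tcp:connect("www.google.com", 80,
                                          [binary, {packet, http},
                                           {active, false}]),
            ok = dlhttpc_sock:send(Fresh, Request, Ssl),
            %% Fresh is later handed back to the dispcount handler,
            %% which only recognises the original Socket, answers
            %% {ignore, State}, and the new port is never closed.
            {ok, Fresh}
    end.
```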

I made a test in the Erlang shell. I mocked dlhttpc_sock:send with meck to return {error, closed} the first time it is called, then listed all ports before and after the request. After the request there are 3 new ports: one belongs to inet_gethost 4, and the other two are linked to the dispcount_watcher process, while there should only be one.

1> application:ensure_all_started(dlhttpc).
{ok,[crypto,asn1,public_key,ssl,dispcount,dlhttpc]}
2> application:ensure_all_started(meck).
{ok,[meck]}
3> meck:new(dlhttpc_sock, [passthrough]).
ok
4> meck:new(dlhttpc_handler, [passthrough]).
ok
5> meck:expect(dlhttpc_handler, checkin, fun(A, B) -> Res = meck:passthrough([A,B]), io:fwrite(standard_io, "checkin: ~p~n", [Res]), Res end).
ok
6> meck:expect(dlhttpc_sock, send, fun(_,_,_) -> meck:expect(dlhttpc_sock, send, fun(A,B,C) -> meck:passthrough([A,B,C]) end), {error, closed} end).
ok
7> erlang:ports().
[#Port<0.0>,#Port<0.167>,#Port<0.508>,#Port<0.525>,
 #Port<0.685>]
8> dlhttpc:request("http://www.google.com", "GET", [], [], 5000, [{max_connections,1}]).
checkin: {ignore,{state,{ok,#Port<0.3933>},
                        true,
                        {"www.google.com",80,false,infinity,
                         [binary,{packet,http},{active,false}]},
                        false}}
{ok,{{302,"Found"},
     [{"Alternate-Protocol","80:quic,p=0.02"},
      {"Server","GFE/2.0"},
      {"Date","Wed, 26 Nov 2014 01:23:37 GMT"},
      {"Content-Length","261"},
      {"Location",
       "http://www.google.co.uk/?gfe_rd=cr&ei=mSt1VJyPDeTH8ge8ooLQBg"},
      {"Content-Type","text/html; charset=UTF-8"},
      {"Cache-Control","private"}],
     <<"<HTML><HEAD><meta http-equiv=\"content-type\" content=\"text/html;charset=utf-8\">\n<TITLE>302 Moved<"...>>}}
9> erlang:ports().
[#Port<0.0>,#Port<0.167>,#Port<0.508>,#Port<0.525>,
 #Port<0.685>,#Port<0.3932>,#Port<0.3934>,#Port<0.3935>]
10> recon:port_info("#Port<0.3932>").
[{meta,[{id,31456},{name,"inet_gethost 4 "},{os_pid,46531}]},
 {signals,[{connected,<0.85.0>},
           {links,[<0.85.0>]},
           {monitors,[]}]},
 {io,[{input,144},{output,63}]},
 {memory_used,[{memory,40},{queue_size,0}]},
 {type,[]}]
11> recon:port_info("#Port<0.3934>").
[{meta,[{id,31472},{name,"tcp_inet"},{os_pid,undefined}]},
 {signals,[{connected,<0.83.0>},
           {links,[<0.83.0>]},
           {monitors,[]}]},
 {io,[{input,0},{output,40}]},
 {memory_used,[{memory,40},{queue_size,0}]},
 {type,[{statistics,[{recv_oct,530},
                     {recv_cnt,10},
                     {recv_max,261},
                     {recv_avg,53},
                     {recv_dvi,20},
                     {send_oct,40},
                     {send_cnt,1},
                     {send_max,40},
                     {send_avg,40},
                     {send_pend,0}]},
        {peername,{{74,125,230,112},80}},
        {sockname,{{192,168,0,2},56827}},
        {options,[{active,false},
                  {broadcast,false},
                  {buffer,1460},
                  {delay_send,false},
                  {dontroute,false},
                  {exit_on_close,true},
                  {header,0},
                  {high_watermark,8192},
                  {keepalive,false},
                  {linger,{false,0}},
                  {low_watermark,4096},
                  {mode,binary},
                  {nodelay,false},
                  {packet,0},
                  {packet_size,...},
                  {...}|...]}]}]
12> recon:port_info("#Port<0.3935>").
[{meta,[{id,31480},{name,"tcp_inet"},{os_pid,undefined}]},
 {signals,[{connected,<0.83.0>},
           {links,[<0.83.0>]},
           {monitors,[]}]},
 {io,[{input,0},{output,0}]},
 {memory_used,[{memory,40},{queue_size,0}]},
 {type,[{statistics,[{recv_oct,0},
                     {recv_cnt,0},
                     {recv_max,0},
                     {recv_avg,0},
                     {recv_dvi,0},
                     {send_oct,0},
                     {send_cnt,0},
                     {send_max,0},
                     {send_avg,0},
                     {send_pend,0}]},
        {peername,{{74,125,230,112},80}},
        {sockname,{{192,168,0,2},56828}},
        {options,[{active,false},
                  {broadcast,false},
                  {buffer,1460},
                  {delay_send,false},
                  {dontroute,false},
                  {exit_on_close,true},
                  {header,0},
                  {high_watermark,8192},
                  {keepalive,false},
                  {linger,{false,0}},
                  {low_watermark,4096},
                  {mode,binary},
                  {nodelay,false},
                  {packet,http},
                  {packet_size,...},
                  {...}|...]}]}]
ferd commented 9 years ago

We could probably try to close the old socket with dlhttpc_sock:close(Sock, SSL) or something in case it is indeed an error. If the connection was gone, it won't be a problem, though we'd risk making it even harder to detect.
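
Something in this direction, maybe (a sketch; the wrapping function and variable names are made up, only dlhttpc_sock:close/2 comes from the library):

```erlang
%% Hedged sketch of the defensive close suggested above; the wrapping
%% function and its name are illustrative, not existing dlhttpc code.
close_old_socket(OldSocket, Ssl) ->
    %% Closing an already-dead socket is harmless, so close it
    %% unconditionally before retrying on a fresh connection.
    catch dlhttpc_sock:close(OldSocket, Ssl),
    ok.
```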

Looking at it, it seems really freaking weird for the handler to set it to {active, once} if we won't handle the messages or look for termination in handle_info. I guess it's attempted more or less implicitly through trying to change the controlling process and reconnecting, but I can't remember why I wrote things that way initially. In any case, if it doesn't cause problems, that's fine.
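
For what it's worth, handling the closure explicitly would look something like the following (a hedged sketch: the #state{} record is hypothetical, and the {ok, State} return is an assumption that dispcount's handle_info/2 callback follows the same shape as its other callbacks; this is not the actual dlhttpc_handler code):

```erlang
%% Hypothetical handle_info clause for a handler that parks its socket
%% in {active, once} mode. The #state{} record and its fields are
%% assumptions, not dlhttpc_handler's real state.
-record(state, {socket}).

handle_info({tcp_closed, Socket}, #state{socket = Socket} = State) ->
    %% The idle connection was closed by the peer; forget it so the
    %% next checkout reconnects instead of handing out a dead socket.
    {ok, State#state{socket = undefined}};
handle_info(_Msg, State) ->
    {ok, State}.
```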

In any case, good catch. Let's fail hard if it goes wrong. I'll merge this -- have you tried it on your own system, or are you waiting for the merge for that?

aerosol commented 9 years ago

:+1: for trying to close the old socket regardless

aerosol commented 9 years ago

@ferd we'll be using our own fork for now, no rush here

loucash commented 9 years ago

@ferd Hey, I think the console logs might not be very readable, so let's go step by step:

  1. A new request comes in and a new process is spawned.
  2. The resource is checked out: https://github.com/ferd/dlhttpc/blob/master/src/dlhttpc.erl#L356
  3. send_request gets an error, {error, closed}: https://github.com/ferd/dlhttpc/blob/master/src/dlhttpc_client.erl#L194 - we close the old socket and create a new one.
  4. We try to check the new socket back in (assuming the request succeeded with the new socket):
    1. The controlling process is changed successfully.
    2. The handler ignores the new socket, sadness.
  5. The process spawned in step 1 finishes.
  6. dispcount_watcher receives {'DOWN',_,process,_,normal}.
  7. dispcount_watcher thinks someone lost its resource and runs the dead callback.
  8. The dead callback creates a new socket.
  9. The ignored socket has dispcount_watcher as its owner, but the reference to it is lost.

The old socket is already closed, so there is no need to close it again. My fix is simply to not ignore the new resource.
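
In code, the fix is roughly this shape (a hedged sketch; the #state{} record and its field names are guesses based on the {state, ...} tuple printed in the shell session above, not dlhttpc_handler's real definition):

```erlang
%% Hypothetical shape of the checkin change. The record and field
%% names are assumptions inferred from the {state, ...} tuple printed
%% earlier, not the actual dlhttpc_handler definition.
-record(state, {socket, given = false, conn_info, ssl = false}).

%% Previously only the exact socket that was checked out was accepted,
%% so a socket recreated during a send_retry got {ignore, State} back
%% and leaked. Accepting whatever socket the client returns avoids the
%% leak, since the client has already closed the socket it gave up on.
checkin(Socket, #state{given = true} = State) ->
    {ok, State#state{socket = Socket, given = false}};
checkin(_Socket, State) ->
    %% Nothing was given out: refuse the checkin.
    {ignore, State}.
```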

ferd commented 9 years ago

Yeah, I get that bit. I'm probably gonna be fine merging as is, but I wanted a bit of time to figure out why I left things in {active,once}. I guess it all hangs on the fact that it assumes the client did a good job of reading everything on the line, that the server didn't dump trailing content, and that the {active, once} flag will let the worker detect if a connection has closed by polling, which makes sense, even if I didn't exactly know it back then.

I'm also fine with not closing the socket as a preemptive measure. Let's crash violently on systems that don't use the code properly and see how that goes (that's what's happening right now!)