cmullaparthi / ibrowse

Erlang HTTP client
Other
516 stars 190 forks source link

infinity timeout broken (on sel_conn_closed) #149

Closed alexvazquezfente closed 8 years ago

alexvazquezfente commented 8 years ago

I hit this bug when downloading files using ibrowse v4.2 (although the bug is also present in master). To reproduce it, you just have to specifiy an infinity timeout to ibrowse:send_req/6 and (this is the difficult part to reproduce it without tight control on the server) the server must close the connection.

I was using these parameters:

ibrowse:send_req("https://samples.ffmpeg.org/V-codecs/h264/indiana_jones_4-tlr3_h640w.mov", [], get, [],
                 [{http_vsn, {1, 1}},
                  {connect_timeout, 5000},
                  {max_sessions, 2},
                  {inactivity_timeout, 120000},
                  {max_pipeline_size, 2},
                  {save_response_to_file, {append, "/tmp/file"}}],
                 infinity).

It can fail with an error badarith if the server closes the connection (at least it is my guess looking at the ibrowse code). This is the stacktrace:

[{ibrowse,try_routing_request,14,[{file,"ibrowse.erl"},{line,371}]},
 {hls_http_downloader,save_to_file,4,
                      [{file,"hls_http_downloader.erl"},{line,157}]},
 {srv_ds_cb,'-download_simultaneous/1-fun-0-',1,
            [{file,"src/srv_ds_cb.erl"},{line,386}]},
 {lists,foreach,2,[{file,"lists.erl"},{line,1336}]},
 {srv_ds_cb,download_simultaneous,1,[{file,"src/srv_ds_cb.erl"},{line,382}]},
 {srv_ds_cb,download,1,[{file,"src/srv_ds_cb.erl"},{line,376}]},
 {srv_ds_cb,exec_cap,2,[{file,"src/srv_ds_cb.erl"},{line,187}]},
 {srv_ds_cb,call,3,[{file,"src/srv_ds_cb.erl"},{line,60}]}]

The problems are in the ibrowse:try_routing_request/14 when do_send_req returns {error, sel_conn_closed}. Several invalid calculations calculations are done between the lines 387 and 390 with the Ori_timeout variable which will fail if it has the special value infinity.

I don't provide a patch because i don't exactly know what is the expected behaviour for some of these logic in the infinity timeout case.

By the way, i also don't understand the need to call trunc after round but that is another issue ;)

cmullaparthi commented 8 years ago

Hi, thanks for the bug report. The following patch to HEAD of master should fix the problem. The idea behind the try_routing_request function is that ibrowse will retry a request if the connection on which it has chosen to send the request is closed before it can be sent. It will retry if there is enough time left (Timeout value specified), or if the Timeout value is infinity it will be retried 3 times (by default).

$ git diff
diff --git a/src/ibrowse.erl b/src/ibrowse.erl
index b243f36..2876434 100644
--- a/src/ibrowse.erl
+++ b/src/ibrowse.erl
@@ -381,6 +381,13 @@ try_routing_request(Lb_pid, Parsed_url,
         {ok, {_Pid_cur_spec_size, _, Conn_Pid}} ->
             case do_send_req(Conn_Pid, Parsed_url, Headers,
                              Method, Body, Options_1, Timeout) of
+                {error, sel_conn_closed} when Timeout == infinity ->
+                    try_routing_request(Lb_pid, Parsed_url,
+                                        Max_sessions, 
+                                        Max_pipeline_size,
+                                        {SSLOptions, IsSSL}, 
+                                        Headers, Method, Body, Options_1,
+                                        Ori_timeout, Ori_timeout, Req_start_time, Max_attempts, Try_count + 1);
                 {error, sel_conn_closed} ->
                     Time_now = os:timestamp(),
                     Time_taken_so_far = trunc(round(timer:now_diff(Time_now, Req_start_time)/1000)),

By the way, i also don't understand the need to call trunc after round but that is another issue ;)

I'm not sure either, it has been there a long time :-)

alexvazquezfente commented 8 years ago

Looks good to me.

I don't know how to reproduce the problem in a repeatable way, I think I was very unlucky (actually lucky) to hit the issue. So, I have tested with the help of a mocking library and the fix does solve the issue as expected (by returning {error,retry_later}).

Thank you.