cmullaparthi / ibrowse

Erlang HTTP client

[bug] Return {error,{conn_failed,{error,econnrefused}}} on master #162

Open define-null opened 6 years ago

define-null commented 6 years ago

On the latest master, with a web server running on the local machine, a request to localhost fails:

>ibrowse:send_req("http://localhost:8091/", [], get, []).
{error,{conn_failed,{error,econnrefused}}}

Calling it with 127.0.0.1 as the host instead works fine.

This same behaviour is not reproducible on v4.4.0.

lucas-nelson commented 6 years ago

see #159

We've just hit this after upgrading to 4.4.1. We have our devs set up with http://localhost:port/.... for our services to talk to each other. They are Elixir / Phoenix services.

After the change in 555f707, ibrowse now prefers the IPv6 address ::1, and our services can no longer talk to each other.

The problem goes away if we use 127.0.0.1.

The problem goes away if you remove the ::1 localhost line from /etc/hosts.

lucas-nelson commented 6 years ago

It's probably important to note that I am seeing this behaviour on macOS 10.13.6.

arpunk commented 6 years ago

I've been able to hit this bug after updating from 4.4.0 to 4.4.1. Changing to 127.0.0.1 fixes it, as @lucas-nelson suggested. I'm using Erlang/OTP 20.x/21.x and Elixir 1.6/1.7, and I'm getting this error on GNU/Linux.

markan commented 6 years ago

I'm also seeing something similar. After updating to 4.4.1 we started seeing connection failures of the form {error,{conn_failed,{error,eaddrnotavail}}} when using a hostname to connect from an IPv4-only system.

I think PR #155 is incomplete, in that it uses the availability of an IPv6 address to decide what to do. A hostname resolving to an IPv6 address doesn't guarantee that the address is routable from the client. So that change breaks things for people with IPv4-only connectivity; connections fail and AFAIK there's not a great workaround for it.

Reverting the PR isn't appealing; I don't want to break IPv6-only environments either. So we need some kind of fix. I'm willing to take a shot at it, and a couple of options occur to me.

1) We could fall back to IPv4 after trying IPv6. Somewhat messy to plumb in, but the result 'just works'. Many browsers do this, but not everyone is a fan of that strategy. Among other things it can be slow, depending on how long it takes IPv6 to fail.
2) Provide a connection option to coerce to IPv4.

Thoughts?
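
To make option 1 concrete, here is a rough sketch of what a fallback could look like. It is only an illustration, not ibrowse code: the module name `connect_fallback` and the function `connect/4` are hypothetical, and it assumes the caller's socket options do not already contain `inet` or `inet6`.

```erlang
-module(connect_fallback).
-export([connect/4]).

%% Hypothetical helper (not part of ibrowse): try an IPv6 connection first
%% and fall back to IPv4 if the IPv6 attempt fails for any reason.
%% Assumes Opts carries no address family option of its own.
connect(Host, Port, Opts, Timeout) ->
    case gen_tcp:connect(Host, Port, [inet6 | Opts], Timeout) of
        {ok, Sock} ->
            {ok, Sock};
        {error, _V6Reason} ->
            %% The IPv6 attempt failed (e.g. eaddrnotavail); retry over IPv4.
            gen_tcp:connect(Host, Port, [inet | Opts], Timeout)
    end.
```

Note that the worst case waits for the IPv6 attempt to fail before IPv4 is tried, which is the slowness concern mentioned in option 1.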

Here's the debugging to support my theory of what's going on. First, a little bit of poking with redbug reveals that we're using the inet6 option for sockets:

3> redbug:start("ibrowse_http_client:get_sock_options->return", []).
{58,1}
4> ibrowse:send_req("http://jigsaw.w3.org/", [], get, [], []).            
{error,{conn_failed,{error,eaddrnotavail}}}
5>    

% 18:20:54 <0.605.0>({ibrowse_http_client,init,1})
% ibrowse_http_client:get_sock_options("jigsaw.w3.org", [], [])

% 18:20:54 <0.605.0>({ibrowse_http_client,init,1})
% ibrowse_http_client:get_sock_options/3 -> [{nodelay,true},
                                             binary,
                                             {active,false},
                                             inet6]

7> redbug:start("gen_tcp:connect->return", []).
{58,2}
8> ibrowse:send_req("http://jigsaw.w3.org/", [], get, [], []).    

% 18:22:41 <0.624.0>({ibrowse_http_client,init,1})
% gen_tcp:connect("jigsaw.w3.org", 80, [{nodelay,true},binary,{active,false},inet6], 30000)
{error,{conn_failed,{error,eaddrnotavail}}}
% 18:22:41 <0.624.0>({ibrowse_http_client,init,1})
% gen_tcp:connect/4 -> {error,eaddrnotavail}

Removing the inet6 option makes the call succeed:

10> gen_tcp:connect("jigsaw.w3.org", 80, [{nodelay,true},binary,{active,false},inet6], 30000).
{error,eaddrnotavail}
11> gen_tcp:connect("jigsaw.w3.org", 80, [{nodelay,true},binary,{active,false}], 30000).      
{ok,#Port<0.58094>}

As you might expect, this option is set because the gethostbyname call returns an IPv6 address:

18> redbug:start("inet:gethostbyname->return", []).       
{58,3}
19> ibrowse:send_req("http://jigsaw.w3.org/", [], get, [], []).   

% 18:48:20 <0.669.0>({ibrowse_http_client,init,1})
% inet:gethostbyname("jigsaw.w3.org", inet6)

% 18:48:20 <0.669.0>({ibrowse_http_client,init,1})
% inet:gethostbyname/2 -> {ok,{hostent,"jigsaw.w3.org",[],inet6,16,
                                  [{9731,16394,65535,2052,32798,52,0,21}]}}

So it looks like we can connect to jigsaw via v4, but not v6. To confirm:

mark@souschef{310}% dig jigsaw.w3.org -t aaaa

; <<>> DiG 9.10.3-P4-Ubuntu <<>> jigsaw.w3.org -t aaaa
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 11283
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;jigsaw.w3.org.         IN  AAAA

;; ANSWER SECTION:
jigsaw.w3.org.      1797    IN  AAAA    2603:400a:ffff:804:801e:34:0:15

;; Query time: 4 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Fri Sep 14 18:51:36 PDT 2018
;; MSG SIZE  rcvd: 70

mark@souschef{311}% ping6 2603:400a:ffff:804:801e:34:0:15
connect: Cannot assign requested address
mark@souschef{312}% ping6 jigsaw.w3.org                  
connect: Cannot assign requested address
mark@souschef{313}% ping jigsaw.w3.org 
PING jigsaw.w3.org (128.30.52.21) 56(84) bytes of data.
64 bytes from sinope.w3.org (128.30.52.21): icmp_seq=1 ttl=46 time=76.0 ms
^C
--- jigsaw.w3.org ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 76.072/76.072/76.072/0.000 ms

jfayad commented 6 years ago

Same issue here, reverting back to 4.4.0 solves the issue for me:

Any request I try to make always returns ehostunreach on my machine (macOS 10.13.6) and eaddrnotavail in my test environment (a Docker container run by a GitLab CI runner using the image bitwalker/alpine-elixir-phoenix:1.7.3).

cmullaparthi commented 6 years ago

Thanks for all the bug reports. I think option (2) suggested by @markan is the best way forward. I'll try and come up with an emergency patch.

cmullaparthi commented 6 years ago

I think to preserve backwards compatibility, I will add an option 'prefer_ipv6'. If this is not supplied, it will default to using ipv4.
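
As a rough illustration of the idea (not the committed code), the option could drive the socket's address family along these lines. The module name `family_opts` and the function `sock_options/2` are hypothetical; only `prefer_ipv6` and the base socket options come from this thread.

```erlang
-module(family_opts).
-export([sock_options/2]).

%% Hypothetical sketch: only add inet6 when the caller asks for it via
%% {prefer_ipv6, true} AND the host resolves to an IPv6 address.
%% Without the option, the default (IPv4) address family is used.
sock_options(Host, Options) ->
    Base = [{nodelay, true}, binary, {active, false}],
    case proplists:get_value(prefer_ipv6, Options, false) of
        true ->
            case inet:gethostbyname(Host, inet6) of
                {ok, _HostEnt} -> Base ++ [inet6];
                {error, _Reason} -> Base
            end;
        _ ->
            Base
    end.
```

With no `prefer_ipv6` in the options, the returned list contains no `inet6` entry, so existing callers keep connecting over IPv4.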

jfayad commented 6 years ago

default to ipv4 would be great @cmullaparthi thanks!

cmullaparthi commented 6 years ago

I've pushed a commit to master. I meant to push it to a branch for everyone to try but fat fingers...

Default behaviour, without any options specified, when connecting to an IPv4 host:

11> ibrowse:send_req("http://ipv4.test-ipv6.noroutetohost.net/", [], get, [], []).                   
(<0.221.0>) call gen_tcp:connect("ipv4.test-ipv6.noroutetohost.net",80,[{nodelay,true},binary,{active,false}],30000) ({ibrowse_http_client,
                                                                                                                       send_req_1,
                                                                                                                       8})
(<0.221.0>) returned from gen_tcp:connect/4 -> {ok,#Port<0.46206>}
{ok,"200",
    [{"Date","Fri, 21 Sep 2018 06:14:34 GMT"},
     {"Server","Apache/2.4.25"},
     {"Content-Location","index.html.en_US"},
     {"Vary","negotiate,accept-language,accept-encoding"},
     {"TCN","choice"},
     {"Last-Modified","Fri, 21 Sep 2018 03:36:59 GMT"},
     {"ETag","\"576595a1f3bd3\""},
     {"Accept-Ranges","bytes"},
     {"Content-Length","30935"},
     {"Content-Type","text/html; charset=utf-8"},
     {"Content-Language","en-us"}],
...

Default behaviour, without any options specified, when connecting to an IPv6 host:

12> ibrowse:send_req("http://ipv6.test-ipv6.noroutetohost.net/", [], get, [], []).                   
{error,{conn_failed,{error,nxdomain}}}

Behaviour with the new option to prefer IPv6 specified, when connecting to an IPv6 host:

13> ibrowse:send_req("http://ipv6.test-ipv6.noroutetohost.net/", [], get, [], [{prefer_ipv6, true}]).
(<0.218.0>) call gen_tcp:connect("ipv6.test-ipv6.noroutetohost.net",80,[{nodelay,true},binary,{active,false},inet6],30000) ({ibrowse_http_client,
                                                                                                                             send_req_1,
                                                                                                                             8})
(<0.218.0>) returned from gen_tcp:connect/4 -> {ok,#Port<0.46205>}
{ok,"200",
    [{"Date","Fri, 21 Sep 2018 06:12:11 GMT"},
     {"Server","Apache/2.4.25"},
     {"Content-Location","index.html.en_US"},
     {"Vary","negotiate,accept-language,accept-encoding"},
     {"TCN","choice"},
     {"Last-Modified","Fri, 21 Sep 2018 03:36:59 GMT"},
     {"ETag","\"576595a1f3bd3\""},
     {"Accept-Ranges","bytes"},
     {"Content-Length","30935"},
     {"Content-Type","text/html; charset=utf-8"},
     {"Content-Language","en-us"}],
...

Neustradamus commented 4 years ago

Any news on this ticket?

arpunk commented 4 years ago

> Any news on this ticket?

Not that I'm aware of. We moved on to a different HTTP client.