dual-stack resolution and connection strategy

awetzel commented 9 years ago

Hi Benoît, the destination IP address selection in hackney can be problematic in some dual-stack environments. Since the behavior in these environments is not configurable in hackney and is sometimes incompatible with other libs, I will describe all of this below.

The address selection algorithm between ipv6 and ipv4 is currently the combination of https://github.com/benoitc/hackney/blob/4584f8798277ec7568e6b84e1d68ed9a8b44696d/src/hackney_connect/hackney_connect.erl#L217-L228 and https://github.com/benoitc/hackney/blob/4584f8798277ec7568e6b84e1d68ed9a8b44696d/src/hackney_client/hackney_util.erl#L72

So:

if an explicit address (v6 or v4) is given, use it
else if an explicit protocol (inet6 or inet) is given, use it to look up the corresponding AAAA or A record
else if an A record exists, connect to the corresponding ipv4 address
else if an AAAA record exists, connect to the ipv6 address

So v4 addresses are chosen over v6 ones when both are available.
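The selection order above can be sketched as follows. This is a minimal Python illustration of the described logic, not hackney's actual Erlang code; `pick_destination` and the `resolve(host, family)` lookup hook are hypothetical names introduced only for this example.

```python
import ipaddress


def pick_destination(host, resolve, family=None):
    """Mimic the selection order described above.

    `resolve(host, family)` is a hypothetical lookup hook that returns a
    list of address strings for "inet" (A) or "inet6" (AAAA).
    """
    # 1. An explicit IP literal (v4 or v6) is used as-is.
    try:
        return str(ipaddress.ip_address(host))
    except ValueError:
        pass  # not a literal, so it has to be resolved

    # 2. An explicit family restricts the lookup to that record type.
    # 3. Otherwise A is tried first and AAAA only if no A record exists.
    families = [family] if family else ["inet", "inet6"]
    for fam in families:
        addrs = resolve(host, fam)
        if addrs:
            return addrs[0]
    return None
```

With a dual-stack host and no explicit family, this sketch always returns the ipv4 address, which is the behavior the issue is about.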

Let's briefly sum up the specs. From https://tools.ietf.org/html/rfc4213:

the resolver library MAY order the results returned to the application in order to influence the version of IP packets used to communicate with that specific node -- IPv6 first, or IPv4 first. The applications SHOULD be able to specify whether they want IPv4, IPv6, or both records

The actual ordering mechanisms are out of scope of this memo. Address selection is described at more length in [RFC3484].

RFC3484 has since been obsoleted by http://www.rfc-editor.org/rfc/rfc6724.txt. This ordering is mostly used for ipv6 address selection only, but the RFC says it can also be applied to ipv4 by using ipv4-mapped addresses.

Prefix         Precedence  Label
::1/128        50          0
::/0           40          1
::ffff:0:0/96  35          4
...

This is the default ordering rule in the RFC: ipv6 (::/0) is preferred to ipv4 (::ffff:0:0/96) in the order of the DNS resolution results, so through this ordering the resolver suggests that the application use ipv6 first in a dual-stack context.
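To make the precedence rule concrete, here is a small Python sketch that looks up an address in the three policy rows quoted above (longest matching prefix wins) and orders candidates by precedence. It covers only this precedence comparison, not the full RFC 6724 destination selection algorithm, and the function names are made up for the example.

```python
import ipaddress

# The three policy rows quoted above: (prefix, precedence).
POLICY = [
    (ipaddress.ip_network("::1/128"), 50),
    (ipaddress.ip_network("::/0"), 40),
    (ipaddress.ip_network("::ffff:0:0/96"), 35),
]


def precedence(addr):
    ip = ipaddress.ip_address(addr)
    if ip.version == 4:
        # IPv4 addresses are looked up as IPv4-mapped IPv6 addresses.
        ip = ipaddress.IPv6Address("::ffff:" + str(ip))
    # Longest matching prefix wins.
    matches = [(net.prefixlen, prec) for net, prec in POLICY if ip in net]
    return max(matches)[1]


def order_by_precedence(addrs):
    # Higher precedence first; sorted() keeps the input order for ties.
    return sorted(addrs, key=precedence, reverse=True)
```

For a host with both an A and an AAAA record, `order_by_precedence` puts the ipv6 address first, the opposite of hackney's current choice.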

But as written in rfc4213, the resolver only proposes IPs and preferences; the choice is still made by the application (although the resolver RFC still suggests that ipv6 should be preferred to ipv4 in a dual-stack environment). In addition to this choice, network libs often implement fallback strategies for partial service outages (one stack down).

The "ipv4 first" strategy (hackney's current one) is used by older libraries for backward compatibility. For instance, see the Java JDK: http://download.java.net/jdk7/archive/b123/docs/api/java/net/doc-files/net-properties.html#Ipv4IPv6

the default behavior is to prefer using IPv4 addresses over IPv6 ones. This is to ensure backward compatibility

For service outages, most libraries offer a "fallback" from v6 to v4. For instance, the unloved httpc has this option (http://erlang.org/doc/man/httpc.html): IpFamily = inet | inet6 | inet6fb4

By default inet. When it is set to inet6fb4 you can use both ipv4 and ipv6. It first tries inet6 and if that does not works falls back to inet. The option is here to provide a workaround for buggy ipv6 stacks to ensure that ipv4 will always work.
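The sequential "v6, then fall back to v4" idea quoted above could look roughly like this. It is a Python sketch in the spirit of inet6fb4, not httpc's actual implementation; `connect_inet6fb4` and its `timeout` default are assumptions made for the illustration.

```python
import socket


def connect_inet6fb4(host, port, timeout=5.0):
    """Try every IPv6 address first, then fall back to IPv4."""
    last_error = None
    for family in (socket.AF_INET6, socket.AF_INET):
        try:
            infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
        except socket.gaierror as exc:
            last_error = exc  # no record for this family, try the next one
            continue
        for fam, socktype, proto, _canon, sockaddr in infos:
            try:
                sock = socket.socket(fam, socktype, proto)
            except OSError as exc:
                last_error = exc  # family not supported on this machine
                continue
            sock.settimeout(timeout)
            try:
                sock.connect(sockaddr)
                return sock
            except OSError as exc:
                last_error = exc
                sock.close()
    raise last_error or OSError("no address found")
```

Because the attempts are sequential, a broken ipv6 path costs up to one full timeout per ipv6 address before ipv4 is even tried, which is why this kind of fallback works but is slow.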

But to handle this choice and fallback effectively, an algorithm has been designed and published as an RFC: https://tools.ietf.org/html/rfc6555 (Happy Eyeballs). This algorithm lets the fastest stack be used first.
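The core idea of Happy Eyeballs, stripped down, is to give ipv6 a short head start and then race it against ipv4. The Python sketch below shows only that race; `connect_v6` and `connect_v4` are hypothetical coroutine functions that return a connection or raise, the `head_start` default is an assumed value, and the RFC covers more (address sorting, caching of outcomes) than is shown here.

```python
import asyncio


async def happy_eyeballs(connect_v6, connect_v4, head_start=0.25):
    """Return the result of whichever attempt succeeds first."""
    v6 = asyncio.ensure_future(connect_v6())
    # Give IPv6 a head start; IPv4 is only started if v6 is slow or failed.
    done, _ = await asyncio.wait({v6}, timeout=head_start)
    if done and v6.exception() is None:
        return v6.result()

    v4 = asyncio.ensure_future(connect_v4())
    pending = {v6, v4}
    errors = []
    try:
        while pending:
            done, pending = await asyncio.wait(
                pending, return_when=asyncio.FIRST_COMPLETED
            )
            for task in done:
                if task.exception() is None:
                    return task.result()
                errors.append(task.exception())
        raise OSError(f"all attempts failed: {errors}")
    finally:
        for task in pending:
            task.cancel()  # drop the attempt that lost the race
```

Unlike a sequential fallback, a dead stack here costs at most the head start rather than a full connect timeout. Python's own asyncio exposes a similar idea through the `happy_eyeballs_delay` parameter of `loop.create_connection`.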

The libs I personally use alongside hackney implement Happy Eyeballs (mainly curl). The problem a user can observe is a partial outage for the part of a stack using curl and a total outage for another part that uses hackney. (httpc's behaviour with inet6fb4 is slow but less problematic.)

Also, if the ipv6 stack is faster than the ipv4 one (for instance in an ipv6-only network with NAT64), then hackney will be slower and needs an explicit inet6 configuration to force ipv6 and avoid that. But what if the service becomes available only over ipv4? Then a static inet6 will break connections. The same goes for an ipv6-only network, where an extra ipv4 resolve query will always be needed if inet6 is not hard-coded.

So my point is that with hackney's current choice, the only acceptable solution seems to be hard-coding inet6 when you use ipv6 to avoid any overhead, which can be very cumbersome and cause many issues if the IP stack of the targeted service can change. (This is the point of Happy Eyeballs.)

So finally:

it would be nice to have a Happy Eyeballs strategy, but since I cannot help in the coming months by proposing a PR, it is not a good proposal :)
the current behavior is only OK for backward compatibility, which is understandable, but it would be nice if the opposite behavior (v6 with fallback to v4) were configurable for DNS resolution (instead of only a hard-coded inet or inet6).
even better would be if a "v6 fallback v4" configuration also applied this strategy to connections (fallback when the connection fails) to handle a single-stack service outage.

Sorry for the long issue, do with it what you want :) and have a good day.

Arnaud

PS: People who say that the switch to ipv6 is easy do not use ipv6 :) Software features that are not massively used rarely work well and never handle edge cases, which is the case for ipv6 features most of the time. Even a network-intensive app like haproxy handles ipv6 only partially. Erlang, which is also network-centric, handles v6 only partially too ( https://github.com/erlang/otp/pull/602 ).

benoitc commented 9 years ago

@awetzel Thanks for the ticket and the description of the issue, it helps :)

I had a look at the simple Happy Eyeballs strategy and it seems pretty easy to implement. So let's add it for the next release. I will also propose a way to force ipv6 or ipv4 when needed.

awetzel commented 9 years ago

Thanks a lot for taking it into consideration. Your work is very helpful, so I'm happy to help a little. I missed it earlier, but there is an Erlang implementation referenced in the RFC: http://www.viagenie.ca/news/index.html#happy_eyeballs_erlang

telmich commented 2 years ago

What is needed to move forward on this one? This is a blocker for using https://pleroma.social/ in ipv6-only environments.

benoitc commented 2 years ago

@telmich there is a work in progress that should land soon with the new pool.

benoitc commented 2 months ago

work in progress: https://github.com/benoitc/hackney/pull/737 cc @ruslandoga

dual-stack resolution and connection strategy #206