Open jcurtis789 opened 1 year ago
As per https://github.com/docker/cli/issues/4296#issuecomment-1553442163, I would say this is by design; any sort of fallback behavior would have to be implemented as part of the daemon itself (or in distribution/distribution), and given that the distribution spec is silent on what to do here, I don't think introducing a new behavior is the right move.
That's fair, I guess perhaps I'm looking for guidance on my current setup then.
Using Docker Hub as an example, my current understanding is that if one of the following three records becomes unavailable, image pulls for a large number of folks would start failing.
> nslookup registry.hub.docker.com
Non-authoritative answer:
Name: registry.hub.docker.com
Address: 52.1.184.176
Name: registry.hub.docker.com
Address: 18.215.138.58
Name: registry.hub.docker.com
Address: 34.194.164.123
Is this assumption accurate? How would you recommend mitigating this possible failure scenario? I'm leaning towards putting both of my servers behind a load balancer that is capable of performing health checks, but that wouldn't solve the issue stated above.
I think in general, if an IP 'must' respond without fail, the solution is anycast routing. There are other options like low TTLs and DNS trickery as well; the main objective for a highly available registry is that the DNS name must resolve to an IP where a currently functional registry can be located.
the main objective for a highly available registry is that the DNS name must resolve to an IP where a currently functional registry can be located
So I think this is the crux of my issue with how the Docker CLI currently behaves. Any production-ready registry, including those available over the internet, will employ a round-robin DNS configuration for its domain. Using my example above, registry.hub.docker.com resolves to three distinct IP addresses.
nslookup registry.hub.docker.com
Non-authoritative answer:
Name: registry.hub.docker.com
Address: 52.1.184.176
Name: registry.hub.docker.com
Address: 18.215.138.58
Name: registry.hub.docker.com
Address: 34.194.164.123
In its current state, one-third of 'docker pull/push' commands would fail if the facility that contains 34.194.164.123 were to catch on fire. These failures would continue to occur until either 1) the DNS record for registry.hub.docker.com is updated to include only the two "good" IP addresses and clients receive the DNS update, or 2) the fire is put out and service is restored.
Apache HttpComponents 4+ (Java) resolved this issue by implementing retry logic around connection timeout parameters. Perhaps my inquiry goes deeper, into the underlying Go libraries being used. I'd love to hear your thoughts on my hypothetical scenario.
That is by design -- all IP addresses are treated as equal by most software. Going out of your way to try and retry another response from the DNS packet is pretty involved, and I would generally go so far as to call it an anti-feature.
This is how libc does DNS. resolv.conf has the same semantics (all nameservers are treated equally; you can't "fall through" to a second nameserver because the first one returned an error; musl libc goes so far as to do lookups in parallel and pick the first to return).
Hmm, I might have to take that back -- apparently the thinking in this area has moved on from gethostbyname() -- specifically I found https://www.rfc-editor.org/rfc/rfc6724#section-2, which states:
As a consequence, we intend that implementations of APIs such as getaddrinfo() will use the destination address selection algorithm specified here to sort the list of IPv6 and IPv4 addresses that they return. Separately, the IPv6 network layer will use the source address selection algorithm when an application or upper layer has not specified a source address. Application of this specification to source address selection in an IPv4 network layer might be possible, but this is not explored further here.
Well-behaved applications SHOULD NOT simply use the first address returned from an API such as getaddrinfo() and then give up if it fails. For many applications, it is appropriate to iterate through the list of addresses returned from getaddrinfo() until a working address is found. For other applications, it might be appropriate to try multiple addresses in parallel (e.g., with some small delay in between) and use the first one to succeed.
net.Dial also claims:
When using TCP, and the host resolves to multiple IP addresses, Dial will try each IP address in order until one succeeds.
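For illustration, here is a minimal hand-rolled sketch of that iteration, roughly what RFC 6724 recommends and what the net.Dial documentation claims to do internally when dialing a name (the host and port below are just placeholders):

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// dialAny resolves host and tries each returned address in order,
// moving on to the next one whenever a connection cannot be established.
func dialAny(host, port string, perAttempt time.Duration) (net.Conn, error) {
	addrs, err := net.LookupHost(host) // all A/AAAA records for the name
	if err != nil {
		return nil, err
	}
	var lastErr error
	for _, addr := range addrs {
		conn, err := net.DialTimeout("tcp", net.JoinHostPort(addr, port), perAttempt)
		if err == nil {
			return conn, nil // first address that accepts the connection wins
		}
		lastErr = err // remember the failure and try the next record
	}
	return nil, fmt.Errorf("all %d addresses failed, last error: %w", len(addrs), lastErr)
}

func main() {
	conn, err := dialAny("registry.hub.docker.com", "443", 2*time.Second)
	if err != nil {
		fmt.Println("dial failed:", err)
		return
	}
	fmt.Println("connected to", conn.RemoteAddr())
	conn.Close()
}
```

net.Dial("tcp", "registry.hub.docker.com:443") would perform the same walk internally, so the explicit loop here is only to make the semantics visible.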
There might be a little more to this than my very systems-C biased experience/recollection indicates.
I'm curious as to what @corhere thinks, and I'll need to spend some time figuring out what the Go stdlib actually intends to do (if it's not trying to do a simple gethostbyname()).
Thanks :-)
That is by design -- all IP addresses are treated as equal by most software. Going out of your way to try and retry another response from the DNS packet is pretty involved, and I would generally go so far as to call it an anti-feature.
This is how libc does DNS. resolv.conf has the same semantics (all nameservers are treated equally; you can't "fall through" to a second nameserver because the first one returned an error; musl libc goes so far as to do lookups in parallel and pick the first to return).
I just wanted to clarify one point which I don't think I correctly articulated in my original post:
I 100% agree that a retry should not be attempted if a server returns an error response (4xx, etc.). It is when a connection to the attempted server cannot even be established that I would expect another address to be attempted.
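For concreteness, here is a rough sketch of the policy I have in mind (nothing from the Docker codebase, just an illustration, and the endpoint URL is hypothetical): retry only when the connection itself cannot be established, never when the server answers with an HTTP error status.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/http"
)

// getWithConnectRetry retries a GET a few times, but only for dial-level
// failures; any HTTP response, even a 4xx or 5xx, is returned as-is.
func getWithConnectRetry(client *http.Client, url string, attempts int) (*http.Response, error) {
	var lastErr error
	for i := 0; i < attempts; i++ {
		resp, err := client.Get(url)
		if err == nil {
			return resp, nil // the server answered; its status code is its answer
		}
		var opErr *net.OpError
		if !errors.As(err, &opErr) || opErr.Op != "dial" {
			return nil, err // not a connection-establishment problem, don't retry
		}
		lastErr = err
	}
	return nil, fmt.Errorf("could not establish a connection after %d attempts: %w", attempts, lastErr)
}

func main() {
	resp, err := getWithConnectRetry(http.DefaultClient, "https://www.my-private-repo.com/v2/", 3)
	if err != nil {
		fmt.Println(err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```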
The documentation for net.Dial does appear to be correct insofar as it does try all resolved addresses, so we already have DNS failover at the transport layer for distribution requests. I have strong doubts that it would be appropriate to fail over to the next resolved address on HTTP 502 or for any other reason after the transport-layer connection has been successfully established. As far as I can tell, the HTTP RFCs are silent on the matter of transport establishment, so RFC 6724 would seem to apply. Given that RFC 6724 talks about "success" and "failure" in the context of connect()/sendto()/bind(), I believe that "success" means that a connection is established.
Failing over to the next address because the server at one address is responsive but otherwise unable to complete the HTTP request to the client's satisfaction—whatever the reason—could be bad for server operators. In the event that the server is responding with failures because it is overloaded, failing over to the next DNS record would merely compound the problem by overloading the fallback server with a thundering herd of retried requests.
This entire time I've been reasoning in terms of the transport layer, and the successful establishment of a TCP connection. I very much agree that the application layer should have no implications for the connection semantics.
However, if our HTTP client is already retrying failed TCP handshakes with the next IP returned by DNS, it sounds like there might only be a docs issue here in the end.
This issue is easily reproduced on my end by setting up a round-robin DNS entry configured with one "good" server and one "bad" server. The docker pull/push commands time out after 15 seconds until the "bad" server is removed from the DNS entry, after which the commands work without issue.
nslookup www.my-private-repo.com
Address 1.2.3.4
Address 5.6.7.8

nslookup 1.2.3.4
name = my-fake-server

nslookup 5.6.7.8
name = my-real-server

docker pull www.my-private-repo.com/my-image:1.0
If this sort of configuration is working for both of you then perhaps it's the fact that I have an old-ish Docker binary (version 20.10.14 / go1.16.15)?
@jcurtis789 what's bad about the "bad" server in your tests?
In this particular scenario, the server is powered down, so nothing is listening on port 80. I can also replicate it by simply shutting down the HAProxy instance in front of Artifactory, which accomplishes the same thing.
Restoring power/starting up HAProxy fixes the problem.
https://github.com/moby/moby/blob/f5106148e333be4ad92fc6c9b9a30b0ff1e96f8d/registry/registry.go#L167-L170
In the case of the registry domain name resolving to two addresses, the dialer will fail over to the second address after fifteen seconds.
https://github.com/moby/moby/blob/f5106148e333be4ad92fc6c9b9a30b0ff1e96f8d/registry/auth.go#L121-L124
After fifteen seconds, the http.Client timeout expires and the request fails, defeating the net.Dialer failover. Looks like a bug!
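To make the interaction concrete, here is a sketch with illustrative values (not the exact ones in the linked code). net.Dialer.Timeout is the budget for the whole Dial call and, per its documentation, may be divided between the resolved addresses, while http.Client.Timeout bounds the entire request including the dial; keeping the per-dial budget well below the request timeout leaves room to fail over to the second DNS record:

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"time"
)

// newRegistryClient builds an HTTP client whose dial budget is small enough
// that a dead first address still leaves time to try the second one before
// the overall request timeout cancels everything.
func newRegistryClient() *http.Client {
	dialer := &net.Dialer{
		// With two resolved addresses, roughly half of this budget may be
		// spent on each attempt before Dial gives up and moves on.
		Timeout:   5 * time.Second,
		KeepAlive: 30 * time.Second,
	}
	return &http.Client{
		Transport: &http.Transport{
			DialContext:         dialer.DialContext,
			TLSHandshakeTimeout: 10 * time.Second,
		},
		Timeout: 15 * time.Second, // bounds the entire request, dial included
	}
}

func main() {
	client := newRegistryClient()
	// Hypothetical round-robin registry endpoint from the reproduction above.
	resp, err := client.Get("https://www.my-private-repo.com/v2/")
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```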
Nice find - thanks! From my perspective, 15 seconds is an order of magnitude higher than I would expect to wait for a connection to get established. Just my two cents, but I would expect that value to be 1-2 seconds at most to further prevent a degradation of service.
Thanks again both of you for your time in investigating :-)
Possible duplicates:
Description
Hello,
I have an FQDN (www.my-private-repo.com) configured for an on-premises registry such that two servers answer to it. These servers run a JFrog Artifactory repository, although I believe that to be insignificant for this issue.
When both servers are healthy, performing a docker push/docker pull works flawlessly. However, I noticed that if one server goes down (maintenance or otherwise), these operations fail.
Reproduce
To reproduce, I configured a custom FQDN where one DNS entry points to one of the real servers hosting my image repository and the other is a valid record that resolves to a fake server.
nslookup www.my-private-repo.com
...
Address 1.2.3.4
...
Address 5.6.7.8

nslookup 1.2.3.4
...
name = my-fake-server

nslookup 5.6.7.8
...
name = my-real-server
A 'docker pull' times out after 15 seconds. Throwing --debug doesn't provide any more information. When my-fake-server is removed from www.my-private-repo.com, pulls begin working again.
Expected behavior
I would expect the Docker CLI to realize one of the DNS entries is faulty (i.e. my-fake-server) via connection timeout, 502, or otherwise, and attempt the request against another entry (i.e. my-real-server). Perhaps I am missing a configuration option to do so.
Docker Hub's registry.hub.docker.com resolves to three separate IPs. If one of these were to become unavailable, I would expect similar behavior, but with more widespread issues across the community, unless I'm perhaps missing something.
docker version
docker info
Additional Info
Thank you!