Docker Swarm dnsrr mode dns resolve error

Livenux commented 3 years ago

On a 4-node Docker Swarm cluster, I deployed a two-replica service using dnsrr endpoint mode. One of the DNS records returned for the service is wrong: when I ping the service name several times, it sometimes resolves to a container that belongs to another service.

Steps to reproduce the issue:

Deploy a service with 2 replicas and endpoint_mode: dnsrr.
Ping the service name.

Describe the results you received:

ping example-service
64 bytes from example-service.1.xxxxx ....
ping example-service
64 bytes from another-service.2.xxxx

Describe the results you expected:

ping example-service
64 bytes from example-service.1.xxxxx ....
ping example-service
64 bytes from example-service.2.xxxx

Additional information you deem important (e.g. issue happens only occasionally):

Output of docker version:

Client: Docker Engine - Community
 Version:           19.03.9
 API version:       1.40
 Go version:        go1.13.10
 Git commit:        9d988398e7
 Built:             Fri May 15 00:25:27 2020
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          19.03.9
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.13.10
  Git commit:       9d988398e7
  Built:            Fri May 15 00:24:05 2020
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.2.6
  GitCommit:        894b81a4b802e4eb2a91d1ce216b8817763c29fb
 runc:
  Version:          1.0.0-rc8
  GitCommit:        425e105d5a03fabd737a126ad93d62a9eeede87f
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683

Output of docker info:

Client:
 Debug Mode: false

Server:
 Containers: 27
  Running: 8
  Paused: 0
  Stopped: 19
 Images: 43
 Server Version: 19.03.9
 Storage Driver: overlay2
  Backing Filesystem: xfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: active
  NodeID: w76ek2fk07o15jn61pqdpter7
  Is Manager: true
  ClusterID: kvs9ffq7ndnxavaz8ydbypdb9
  Managers: 4
  Nodes: 4
  Default Address Pool: 172.29.0.0/16  
  SubnetSize: 24
  Data Path Port: 4789
  Orchestration:
   Task History Retention Limit: 5
  Raft:
   Snapshot Interval: 10000
   Number of Old Snapshots to Retain: 0
   Heartbeat Tick: 1
   Election Tick: 10
  Dispatcher:
   Heartbeat Period: 5 seconds
  CA Configuration:
   Expiry Duration: 10 years
   Force Rotate: 0
  Autolock Managers: false
  Root Rotation In Progress: false
  Node Address: 10.200.117.9
  Manager Addresses:
   10.200.117.10:2377
   10.200.117.11:2377
   10.200.117.8:2377
   10.200.117.9:2377
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 894b81a4b802e4eb2a91d1ce216b8817763c29fb
 runc version: 425e105d5a03fabd737a126ad93d62a9eeede87f
 init version: fec3683
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 3.10.0-1127.8.2.el7.x86_64
 Operating System: CentOS Linux 7 (Core)
 OSType: linux
 Architecture: x86_64
 CPUs: 16
 Total Memory: 31.5GiB
 Name: wlapp-2.novalocal
 ID: 3ALL:WBWE:UFCX:4DW3:Q2HJ:3BO4:F445:VR6O:HSUE:RDO2:Q4C4:LXWD
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

user@fffaeb1e7c29:~$ ping example-service
PING example-service (172.29.4.4) 56(84) bytes of data.
64 bytes from prod_example-service.1.9b707n20gzwnxdayr1j8csls5.prod_default (172.29.4.4): icmp_seq=1 ttl=64 time=0.605 ms
64 bytes from prod_example-service.1.9b707n20gzwnxdayr1j8csls5.prod_default (172.29.4.4): icmp_seq=2 ttl=64 time=0.617 ms
^C
--- example-service ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1ms
rtt min/avg/max/mdev = 0.605/0.611/0.617/0.006 ms
user@fffaeb1e7c29:~$ ping example-service
PING example-service (172.29.4.77) 56(84) bytes of data.
64 bytes from prod_gateway.2.xi951q7xw10r8lbddud4rtxia.prod_default (172.29.4.77): icmp_seq=1 ttl=64 time=0.109 ms
64 bytes from prod_gateway.2.xi951q7xw10r8lbddud4rtxia.prod_default (172.29.4.77): icmp_seq=2 ttl=64 time=0.081 ms
^C
--- example-service ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1ms
rtt min/avg/max/mdev = 0.081/0.095/0.109/0.014 ms
user@fffaeb1e7c29:~$ cat /etc/resolv.conf 
search openstacklocal novalocal
nameserver 127.0.0.11
options ndots:0
user@fffaeb1e7c29:~$ exit

Additional environment details (AWS, VirtualBox, physical, etc.):

laxmanpradhan commented 3 years ago

Can you try dig example-service to see the actual DNS entry in the Docker engine's DNS? (You may need to use something like an ubuntu container and then run apt install dnsutils to install dig.) This seems similar to the issue I am having, where the DNS entry on the Docker engine DNS server is incorrect: #41766

In the linked issue, the A record is always one lower than the real address, i.e., the actual container IP 10.0.4.8 is listed as 10.0.4.7 in the DNS record. Is the DNS record constantly changing for you?

Livenux commented 3 years ago

It may be that when the service had 3 instances, three DNS records were registered, and when I scaled it down to 2, the extra record was not deleted. That IP was later assigned to a container of another service, which leads to the wrong DNS resolution.

dig example-service

; <<>> DiG 9.11.5-P4-5.1+deb10u2-Debian <<>> example-service
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 49236
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;redis-proxy.           IN  A

;; ANSWER SECTION:
example-service.        600 IN  A   172.29.4.155
example-service.        600 IN  A   172.29.4.77
example-service.        600 IN  A   172.29.4.154

;; Query time: 0 msec
;; SERVER: 127.0.0.11#53(127.0.0.11)
;; WHEN: Tue Dec 15 17:10:19 CST 2020
;; MSG SIZE  rcvd: 110

Update: when the Docker Swarm service is in dnsrr mode, DNS returns n+1 records for n instances, and one of those records is wrong.
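The n+1 observation can be checked mechanically. The following is a minimal sketch, assuming Python is available in the debugging container. It resolves the service name through the container's resolver and reports every address that is not in a list of known task IPs. The function names are hypothetical, and the list of expected task IPs has to be gathered separately; which of the three addresses above belong to real tasks is an assumption made here for illustration only.

```python
import socket


def resolve_a_records(name):
    """Return the set of IPv4 addresses the local resolver gives for name."""
    infos = socket.getaddrinfo(
        name, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for AF_INET, sockaddr is (ip, port).
    return {info[4][0] for info in infos}


def find_stale_records(resolved, expected):
    """Return addresses present in DNS but not among the known task IPs."""
    return sorted(set(resolved) - set(expected))


# Addresses from the dig answer above.
DIG_ANSWER = {"172.29.4.155", "172.29.4.77", "172.29.4.154"}

# Assumption for illustration: these two are the real task IPs.
ASSUMED_TASK_IPS = {"172.29.4.155", "172.29.4.154"}
```

Inside a container on the affected network, find_stale_records(resolve_a_records("example-service"), ASSUMED_TASK_IPS) would list the surplus record, if the assumed task IPs are right.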

laxmanpradhan commented 3 years ago

@Livenux I discovered that the IP address of the service itself will be one less than the container IP. You can see the service virtual IP by running docker network inspect -v on the network. The -v flag is required for verbose mode, which shows the service VIP. Does the DNS record from dig match up with the service IP?

Livenux commented 3 years ago

It is not a service VIP; it is a cached DNS record. (In ipvs mode there should be one DNS record, and in dnsrr mode the number of records should equal the number of service instances, right?) I removed the dnsrr service and created a new service with the same name in ipvs mode, and the wrong IP is still resolved for the newly created service.

rmillet-rs commented 1 month ago

Hello,

I have a similar issue with Docker 24.0.6 (this is not the latest version, but I haven't found anything related to this in recent changelogs).

I have many services (backend) with DNS RR resolution mode. The backend services have 2 replicas. Services are sometimes updated (so containers are created and destroyed).

We have connection issues from other services (reverse proxy) to some of these backend services: requests are delayed by a timeout while the proxy tries to connect to some container IPs before falling back to another IP.
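The timeout-then-fallback symptom suggests one way to tell which resolved addresses are dead. This is a minimal sketch, assuming Python is available in the reverse proxy container and that the backends listen on a known TCP port. The function names and the port are hypothetical. It is a diagnostic aid, not a fix: a stale address that has been reassigned to another service's container may still accept connections.

```python
import socket


def reachable(ip, port, timeout=1.0):
    """Return True if a TCP connection to (ip, port) succeeds within timeout."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


def split_by_reachability(ips, port, timeout=1.0):
    """Partition addresses into (reachable, unreachable) lists."""
    ok, dead = [], []
    for ip in ips:
        (ok if reachable(ip, port, timeout) else dead).append(ip)
    return ok, dead
```

For example, split_by_reachability(addresses, 8080) with the addresses the resolver returned would separate the ones that answer from the ones that time out; port 8080 is only a placeholder.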

While debugging (docker exec on the reverse proxy), we found that the internal resolver returns more IP addresses than there are containers (sometimes even 4 IP addresses are returned):

getent ahosts a-problematic-backend-service
172.31.158.180  STREAM a-problematic-backend-service
172.31.158.180  DGRAM  
172.31.158.180  RAW    
172.31.156.25   STREAM 
172.31.156.25   DGRAM  
172.31.156.25   RAW    
172.31.159.84   STREAM 
172.31.159.84   DGRAM  
172.31.159.84   RAW

Backend services without the issue return only 2 IP addresses.
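The comparison between returned addresses and replica count can be scripted. A minimal sketch, assuming the getent ahosts output has been captured as text; the function names are hypothetical and the sample is the output shown above.

```python
def unique_addresses(getent_output):
    """Collect distinct addresses from `getent ahosts` output, in first-seen order."""
    seen = []
    for line in getent_output.splitlines():
        fields = line.split()
        # The address is the first whitespace-separated field on each line.
        if fields and fields[0] not in seen:
            seen.append(fields[0])
    return seen


def surplus_count(getent_output, replicas):
    """Return how many more addresses DNS lists than the service has replicas."""
    return max(0, len(unique_addresses(getent_output)) - replicas)


# Output copied from the report above.
SAMPLE = """\
172.31.158.180  STREAM a-problematic-backend-service
172.31.158.180  DGRAM
172.31.158.180  RAW
172.31.156.25   STREAM
172.31.156.25   DGRAM
172.31.156.25   RAW
172.31.159.84   STREAM
172.31.159.84   DGRAM
172.31.159.84   RAW
"""
```

With the sample above and 2 replicas, surplus_count(SAMPLE, 2) reports one surplus address. The script cannot tell which of the three is the stale one.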

To resolve the issue, I tried (without success):

scaling up then down (or down to 0, then up)
removing stopped containers of these services

Is there a way to force Docker's internal DNS resolver to re-synchronize, or another workaround to make it forget the wrong IP addresses (without deleting and recreating the services)?

moby / moby

Docker Swarm dnsrr mode dns resolve error #41744