skynetservices / skydns

DNS service discovery for etcd
MIT License

NXDOMAIN Redirection causing intermittent issues #352

Open derekkl opened 4 years ago

derekkl commented 4 years ago

We are using OpenShift version 3.11 and I'm pretty certain we're using SkyDNS.

The problem is that our corporate DNS server forwards to an upstream DNS server that performs NXDOMAIN redirection: instead of returning an NXDOMAIN response, it returns the address of an advertising site. As a result, we see failures like:

failed to create volume: Post http://heketi-storage.glusterfs.svc:8080/volumes: dial tcp 92.242.140.68:8080: i/o timeout

The 92.242.140.68 IP is the advertising site.

Apparently SkyDNS depends on an NXDOMAIN response in order to append .cluster.local.

Examples:

[root@appnode2~]# nslookup heketi-storage.glusterfs.svc
Server: 10.x.x.x
Address: 10.x.x.x#53

Non-authoritative answer:
Name: heketi-storage.glusterfs.svc
Address: 92.242.140.68

[root@appnode2 ~]# cat /etc/resolv.conf
# nameserver updated by /etc/NetworkManager/dispatcher.d/99-origin-dns.sh
# Generated by NetworkManager
search cluster.local corp.company.com

So, because the server never gets an NXDOMAIN response, it doesn't append .cluster.local as specified by the search line in /etc/resolv.conf.

If we add .cluster.local to the request, it resolves correctly:

[root@appnode2~]$ nslookup heketi-storage.glusterfs.svc.cluster.local
Server: 10.x.x.x
Address: 10.x.x.x#53

Name: heketi-storage.glusterfs.svc.cluster.local
Address: 10.x.x.x < correct internal IP
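
To illustrate the behavior described above, here is a rough Python sketch of how a stub resolver walks the search list. It is a simplified model, not the actual glibc or SkyDNS code. It assumes the default ndots value of 1, so a name containing a dot is tried as-is first and the search domains are only tried after an NXDOMAIN. The function names, the RECORDS table, and the 10.0.0.1 address are made up for illustration; the two upstream functions stand in for a well-behaved DNS server and one that does NXDOMAIN redirection.

```python
from typing import Callable, List, Optional, Tuple

HIJACK_IP = "92.242.140.68"   # advertising IP seen in the error above
CLUSTER_IP = "10.0.0.1"       # hypothetical placeholder for the internal service IP

# Hypothetical zone data: only the fully qualified cluster name exists.
RECORDS = {"heketi-storage.glusterfs.svc.cluster.local": CLUSTER_IP}


def candidates(name: str, search: List[str], ndots: int = 1) -> List[str]:
    """Return the names a stub resolver would try, in order (simplified)."""
    if name.endswith("."):
        return [name]  # a trailing dot means "absolute", so no search list
    suffixed = [f"{name}.{domain}" for domain in search]
    if name.count(".") >= ndots:
        return [name] + suffixed  # enough dots: try the name as-is first
    return suffixed + [name]


def resolve(
    name: str,
    search: List[str],
    lookup: Callable[[str], Optional[str]],
    ndots: int = 1,
) -> Optional[Tuple[str, str]]:
    """Return (queried name, IP) for the first candidate that gets an answer."""
    for candidate in candidates(name, search, ndots):
        ip = lookup(candidate)
        if ip is not None:  # None models an NXDOMAIN response
            return candidate, ip
    return None


def honest_upstream(query: str) -> Optional[str]:
    """Upstream that answers NXDOMAIN (None) for unknown names."""
    return RECORDS.get(query)


def redirecting_upstream(query: str) -> Optional[str]:
    """Upstream that replaces NXDOMAIN with the advertising IP."""
    return RECORDS.get(query, HIJACK_IP)


if __name__ == "__main__":
    search = ["cluster.local", "corp.company.com"]
    print(resolve("heketi-storage.glusterfs.svc", search, honest_upstream))
    print(resolve("heketi-storage.glusterfs.svc", search, redirecting_upstream))
```

With the honest upstream, the first query fails and the cluster.local candidate is used. With the redirecting upstream, the very first query already gets an answer, so the search list is never consulted.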