openshift / openshift-ansible

Install and configure an OpenShift 3.x cluster
https://try.openshift.com
Apache License 2.0

DNS out-of-the-box configuration. #1314

Closed timothysc closed 7 years ago

timothysc commented 8 years ago

Post-deployment, our default configuration leaves DNS in a state where pods cannot communicate with the outside world.

When running upstream's networking conformance tests, the pods fail to resolve google.com. To rectify this, I had to set up dnsmasq on the master: http://developerblog.redhat.com/2015/11/19/dns-your-openshift-v3-cluster/

This is not spelled out as a separate post-installation step, but IMHO there is really no reason the installer cannot properly configure dnsmasq on the master.
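
For reference, here's a minimal sketch of the kind of dnsmasq configuration the linked article describes. The filename, the upstream resolver, and the assumption that SkyDNS has been rebound to port 8053 on the master are illustrative placeholders, not the article's exact values:

# /etc/dnsmasq.d/openshift-cluster.conf (hypothetical filename, on the master)
# Forward cluster queries to the master's SkyDNS; send everything else to the usual upstream resolver.
server=/cluster.local/127.0.0.1#8053
server=192.168.122.1
# Nodes then point their /etc/resolv.conf at the master's IP.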

/cc @rrati @jayunit100 @jeremyeder @mattf

detiber commented 8 years ago

Trello card associated with this issue: https://trello.com/c/uIvwy5Yj

sdodson commented 8 years ago

Post-deployment, our default configuration leaves DNS in a state where pods cannot communicate with the outside world.

The article you've linked to solves many problems but it shouldn't be directly related to this unless google.com overlaps with your default subdomain, overlaps with your host's search path, or your hosts themselves cannot resolve google.com. Can you post the contents of a pod's /etc/resolv.conf?

Below you can see that the pod has SkyDNS listed as its first resolver, followed by the host's resolvers. SkyDNS returns a SERVFAIL, so the resolver moves on to the host's resolvers, which succeed.

[root@hello-openshift1 /]# cat /etc/resolv.conf 
nameserver 172.30.0.1
nameserver 192.168.122.1
search default.svc.cluster.local svc.cluster.local cluster.local example.com
options ndots:5

[root@hello-openshift1 /]# dig +nofail google.com            
;; Got SERVFAIL reply from 172.30.0.1, trying next server

; <<>> DiG 9.9.4-RedHat-9.9.4-29.el7_2.2 <<>> +nofail google.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 21539
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 6, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;google.com.                    IN      A

;; ANSWER SECTION:
google.com.             27      IN      A       173.194.68.139
google.com.             27      IN      A       173.194.68.101
google.com.             27      IN      A       173.194.68.102
google.com.             27      IN      A       173.194.68.138
google.com.             27      IN      A       173.194.68.100
google.com.             27      IN      A       173.194.68.113

;; Query time: 0 msec
;; SERVER: 192.168.122.1#53(192.168.122.1)
;; WHEN: Mon Feb 01 15:50:49 EST 2016
;; MSG SIZE  rcvd: 124

[root@hello-openshift1 /]# curl https://www.google.com -v 2>&1 | grep -e subject -e OK
*       subject: CN=www.google.com,O=Google Inc,L=Mountain View,ST=California,C=US
< HTTP/1.1 200 OK
sdodson commented 8 years ago

Also, what is the dnsPolicy of the pod?
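
For reference, dnsPolicy sits in the pod spec; a minimal sketch with placeholder names, where ClusterFirst is the default and Default falls back to the node's resolv.conf:

apiVersion: v1
kind: Pod
metadata:
  name: hello-openshift1   # placeholder name
spec:
  dnsPolicy: ClusterFirst  # "Default" uses the node's resolv.conf instead
  containers:
  - name: hello-openshift
    image: openshift/hello-openshift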

timothysc commented 8 years ago

nameserver 172.24.0.1
nameserver 10.1.4.30

pod was run via: sudo ./e2e.test --provider="local" --ginkgo.v=true --ginkgo.focus="should provide Internet connection for containers" --kubeconfig="/etc/origin/master/admin.kubeconfig" --repo-root="/home/cloud-user/kubernetes"

It got a SERVFAIL reply from 172.24.0.1, but then bailed. Once dnsmasq was enabled, all was well.

sdodson commented 8 years ago

I suspect this was introduced in the recent rebase

Introduced via https://github.com/kubernetes/kubernetes/pull/18089

The issue is being discussed at https://github.com/kubernetes/kubernetes/issues/20090

timothysc commented 8 years ago

yup.

sdodson commented 8 years ago

@liggitt or @smarterclayton Do you guys know yet what we intend to do about the change to ClusterFirst dnsPolicy? Will this be critical for 3.2?

liggitt commented 8 years ago

I thought we were carrying a patch to preserve the old behavior... are you seeing a change in behavior in origin?

timothysc commented 8 years ago

I deployed origin/master:

commit 86a15eb95324991b0de57de04e71b29cec5ab63f
Date:   Thu Jan 28 09:55:15 2016 -0500

and I'm seeing the behavior outlined.

liggitt commented 8 years ago

Our tests routinely clone from github.com, etc, inside pods...

liggitt commented 8 years ago

https://github.com/openshift/origin/commit/7d76e2e6880c1c9dfddb37e5c270bc1930c059a6 is the carry, which preserves the pre-rebase behavior of including the clusterDNS in the list of nameservers

liggitt commented 8 years ago

@timothysc can you also show the output of

oc get service kubernetes -n default -o yaml
oc get endpoints kubernetes -n default -o yaml

Trying to figure out why you're seeing a behavior change

timothysc commented 8 years ago

@liggitt details requested are here: https://paste.fedoraproject.org/317540/42926714/

I'm guessing the fact that we are running on OpenStack may have something to do with it.

liggitt commented 8 years ago

Was this setup working previously (and if so, when)? Trying to nail down what changed.

sdodson commented 8 years ago

I think this is a problem with the busybox image in question not moving on to the next nameserver on SERVFAIL; this fails on my OSE 3.1.1.6 cluster as well.

liggitt commented 8 years ago

So that is likely the issue with inconsistent resolver behavior with multiple non-overlapping nameservers, which was the reason upstream removed the clusterDNS+hostDNS chaining.

sdodson commented 8 years ago

172.30.0.1 is my kubernetes svc

[root@ose3-master ~]# docker run -it gcr.io/google_containers/busybox
# cat /etc/resolv.conf
# Generated by NetworkManager
search example.com
nameserver 192.168.122.1
# wget -s google.com
Connecting to google.com (173.194.208.138:80)
Connecting to www.google.com (74.125.226.18:80)
# cat /etc/resolv.conf 
# Generated by NetworkManager
search example.com
nameserver 172.30.0.1
nameserver 192.168.122.1
# wget -s google.com
wget: bad address 'google.com'

liggitt commented 8 years ago

and that works correctly with another image, like one of our origin images?

sdodson commented 8 years ago

Yeah, it's fine with Fedora/RHEL-based images, which use glibc's resolver.

liggitt commented 8 years ago

So, @smarterclayton, ruling on this?

A. turn skydns into an open resolver, and undo our carry
B. break non-glibc resolvers
C. require additional setup of dnsmasq?

sdodson commented 8 years ago

Also works with the latest docker.io/busybox image.

timothysc commented 8 years ago

IMO the end-user experience is what matters most here; not a fan of (B).

detiber commented 8 years ago

@liggitt @smarterclayton

A. turn skydns into an open resolver, and undo our carry

Instead of turning skydns into an open resolver, could we just configure forwarders to a provided set of dns hosts through the master config?

B. break non-glibc resolvers

I don't like the idea of potentially inconsistent behavior for the user, especially if we are advertising running upstream docker containers.

C. require additional setup of dnsmasq?

I'm not sure we want to travel down that path just yet, unless there was no other choice.

I don't think it's a bad idea to do it for demo/POC use cases, but supporting it in a production use case is something else...

detiber commented 8 years ago

From this: https://github.com/skynetservices/skydns#configuration it looks like configuring the forwarders is possible and it defaults to the entries in /etc/resolv.conf (which should be a good fallback).
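
For illustration, this is roughly how the upstream etcd-backed SkyDNS takes those settings per that README; OpenShift embeds SkyDNS in the master, so this is not how it's configured here, and the values are placeholders:

etcdctl set /skydns/config \
  '{"domain":"cluster.local.","nameservers":["192.168.122.1:53"],"no_rec":false}'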

liggitt commented 8 years ago

Right... turning on no_rec was explicitly done in https://github.com/openshift/origin/commit/8ff419a71968caf36dde8501112e7a37c982bc81... hence needing a ruling

sdodson commented 8 years ago

If we switch to the upstream behavior I don't need to write the filtering that prevents duplicate entries in pod resolv.conf when we add the kubernetes svc IP to the host's resolv.conf, because ClusterFirst is really cluster-only upstream.

sdodson commented 8 years ago

@brenton This is the issue we discussed during standup, in particular we need to reach a decision on Jordan's 3 proposed options.

My opinion is that we have proof positive that at least some resolvers screw up and that we should go with option A, while attempting to secure skydns in a way that makes it accessible only to the cluster.

smarterclayton commented 8 years ago

We can make this an optional configuration for admins and tell them to block external DNS traffic into the cluster. If they turn on the open resolver, they need to close the resolver loop.

smarterclayton commented 8 years ago

I don't want C - I don't think it helps us in the short term, and in the long term this is a cluster problem.

detiber commented 8 years ago

@smarterclayton the issue with B is that it'll require further custom logic to avoid duplicating the DNS resolver in the pods.

The simple use case is easy, where the host DNS resolver is pointed at the service IP for skydns. The other use cases are more difficult (where they are pointed to skydns through an IP on a master host or a public IP assigned to a master).

For A, we can always default to locking down the skydns port from all but the service network.

sdodson commented 8 years ago

The service network is just local NAT on a node to the masters' node IPs. So wouldn't we have to ensure it's open to all subnets where nodes exist? This seems challenging without having a deny-all rule and having the API servers add allow rules for each known node IP.

detiber commented 8 years ago

That is true... you would need a deny-all rule for port 53 and add an entry for each node on the master (a rough sketch follows below).

The plus side is that you could limit the impact on the other rules by pushing port 53 into its own chain for processing.

As an alternative, since we already require that the masters be on the pod network, we could potentially expose it that way.
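
A rough sketch of that kind of rule set with iptables; the chain name and node IPs are placeholders:

# Dedicated chain so the per-node allow rules don't bloat the main INPUT processing
iptables -N OS_DNS
iptables -I INPUT -p udp --dport 53 -j OS_DNS
iptables -I INPUT -p tcp --dport 53 -j OS_DNS
# One allow rule per known node IP (placeholders), then deny everything else
iptables -A OS_DNS -s 192.168.122.10 -j ACCEPT
iptables -A OS_DNS -s 192.168.122.11 -j ACCEPT
iptables -A OS_DNS -j DROP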

smarterclayton commented 8 years ago

Ultimately nodes and ramp nodes will have to access DNS as well - both of those need access to the pod network too.

sdodson commented 8 years ago

While working on adding cluster DNS to hosts I discussed with the networking team how best to automate reconfiguring the hosts' resolv.conf, etc. They actually suggested using NetworkManager's dns=dnsmasq setting and providing a dnsmasq config to selectively resolve cluster DNS zones via the kubernetes service IP.

Perhaps Option C is the best way forward, given it's probably how we'll end up giving host processes access to cluster DNS? If so, should we set our dnsPolicy: Default so that it just uses the host's resolv.conf and we're effectively all-in on dnsmasq?
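
A minimal sketch of that approach, assuming the default cluster.local domain and the 172.30.0.1 kubernetes service IP; the drop-in filenames are hypothetical:

# /etc/NetworkManager/conf.d/dnsmasq.conf
[main]
dns=dnsmasq

# /etc/NetworkManager/dnsmasq.d/origin-dns.conf
server=/cluster.local/172.30.0.1
server=/30.172.in-addr.arpa/172.30.0.1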

sdodson commented 8 years ago

This would also mean we'd require NetworkManager, which is slightly controversial but if it's the right tool...

detiber commented 8 years ago

I'm not sure we're going to be able to sell our customer base on using NetworkManager.

detiber commented 8 years ago

I'm of the thinking that if all of the hosts that need to talk to DNS are on, or have access to, the pod network, then maybe that is the best option for securing access and having a centralized authoritative DNS solution.

sdodson commented 8 years ago

https://github.com/openshift/origin/pull/7598 adds dnsmasq via a NetworkManager dispatcher script and works out of the box assuming default cluster DNS values. My intention is to complete testing of that and deliver it via the origin/atomic-openshift node RPMs. We can then configure dnsIP = ansible_default_ipv4 to switch over to using the node-local dnsmasq once we're sure of how robust dnsmasq proves to be.
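
For reference, a sketch of what that switch would look like in the rendered node config; 192.168.122.10 stands in for the node's ansible_default_ipv4:

# node-config.yaml (per node)
dnsDomain: cluster.local
dnsIP: 192.168.122.10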

sferich888 commented 8 years ago

https://trello.com/c/a2vUr9KE/12-dns-integration seems to be related to this discussion, and I feel that if a "centralized authoritative DNS solution" is used, as @detiber points out, we have a much better story about how our DNS solution is "pluggable" and how you can swap out the "provided solution" for your own.

timothysc commented 8 years ago

So what's the resolution? As of today (v3.2.0.9) the end-user experience is still:

/usr/libexec/atomic-openshift/extended.test --ginkgo.v=true --ginkgo.focus="Conformance"

[Fail] ClusterDns [Feature:Example] [It] should create pod that uses dns [Conformance] 
/builddir/build/BUILD/atomic-openshift-git-0.b99af7d/_thirdpartyhacks/src/k8s.io/kubernetes/test/e2e/util.go:1537

[Fail] DNS [It] should provide DNS for the cluster [Conformance] 
/builddir/build/BUILD/atomic-openshift-git-0.b99af7d/_thirdpartyhacks/src/k8s.io/kubernetes/test/e2e/dns.go:229

[Fail] DNS [It] should provide DNS for services [Conformance] 
/builddir/build/BUILD/atomic-openshift-git-0.b99af7d/_thirdpartyhacks/src/k8s.io/kubernetes/test/e2e/dns.go:229

[Fail] Networking [It] should provide Internet connection for containers [Conformance] 
/builddir/build/BUILD/atomic-openshift-git-0.b99af7d/_thirdpartyhacks/src/k8s.io/kubernetes/test/e2e/networking.go:53

/cc @danmcp

detiber commented 8 years ago

What are these tests doing specifically? Are they running from a pod, or from a node?

DNS resolution should work just fine from a pod and should provide the pod with internet connectivity. Do the nodes have internet connectivity/resolution themselves? Is the test using a container image that doesn't iterate through the list of DNS resolvers?

timothysc commented 8 years ago

Is the test using a container image that doesn't iterate through the list of DNS resolvers

It's the uclibc resolver in the busybox image, as mentioned above.

tbielawa commented 7 years ago

This issue has been inactive for quite some time. Please update and reopen this issue if this is still a priority you would like to see action on.

sferich888 commented 7 years ago

Inactive but not resolved? The ask here (as I understand it) is to have OCP install a DNS solution. Has the idea here changed?

sdodson commented 7 years ago

No, this is all about internal DNS. The card you've referenced is for providing external-facing authoritative DNS, but that's completely different from the original subject of this issue. In my opinion, this should've been closed when #1588 merged.