jupyterhub / mybinder.org-deploy

Deployment config files for mybinder.org
https://mybinder-sre.readthedocs.io/en/latest/index.html
BSD 3-Clause "New" or "Revised" License

DNS resolution of *.mybinder.org on turing #1468

Closed · minrk closed this issue 3 years ago

minrk commented 4 years ago

I was trying to demo binderlyzer and was assigned to the turing cluster. The notebook fetches from https://archive.analytics.mybinder.org, and that request failed DNS resolution. I tested a few mybinder.org subdomains, including the top-level mybinder.org, and none of them resolved. Manually assigning the pod to gke or ovh (didn't try gesis) doesn't reproduce the problem, so it appears to be a cluster DNS issue specific to turing.

Simple test:

!wget https://mybinder.org

will hang at Resolving mybinder.org (mybinder.org)...

I'm not sure what's up there, maybe something with cert-manager?

consideRatio commented 4 years ago

Hmmm, perhaps kubectl exec into the hub pod and try nslookup mybinder.org 8.8.8.8 and then nslookup mybinder.org. If the latter doesn't work, inspect the k8s cluster's DNS server and any network policies that could block or allow access to it. The DNS server can sometimes be found as the service kube-dns in the kube-system namespace, even though coredns is now what actually serves as the kubernetes cluster's DNS server.

kubectl get pods -n kube-system (look for kube-dns or coredns pods)
kubectl get svc -n kube-system (look for a kube-dns or coredns service)
kubectl get netpol --all-namespaces (look for rules that decide what outgoing traffic is allowed from the hub pod while testing DNS from it)

betatim commented 4 years ago

From the hub pod dig mybinder.org @8.8.8.8 and dig mybinder.org both resolve to 35.202.202.188. The singleuser pods don't have dig or nslookup installed so I am not sure how to test on one of them.

betatim commented 4 years ago

There are two coredns pods running in kube-system. They've both been up for about 30h.

The log of the coredns autoscaler pod:

I0616 07:59:25.009028       1 k8sclient.go:277] Cluster status: SchedulableNodes[8], SchedulableCores[32]
I0616 07:59:25.010202       1 k8sclient.go:278] Replicas are not as expected : updating replicas from 2 to 3
I0616 09:23:35.006851       1 k8sclient.go:277] Cluster status: SchedulableNodes[7], SchedulableCores[28]
I0616 09:23:35.006880       1 k8sclient.go:278] Replicas are not as expected : updating replicas from 3 to 2
I0616 09:26:15.006489       1 k8sclient.go:277] Cluster status: SchedulableNodes[8], SchedulableCores[32]
I0616 09:26:15.006570       1 k8sclient.go:278] Replicas are not as expected : updating replicas from 2 to 3
I0616 10:13:35.007460       1 k8sclient.go:277] Cluster status: SchedulableNodes[7], SchedulableCores[28]
I0616 10:13:35.007494       1 k8sclient.go:278] Replicas are not as expected : updating replicas from 3 to 2
I0616 10:24:45.009684       1 k8sclient.go:277] Cluster status: SchedulableNodes[8], SchedulableCores[32]
I0616 10:24:45.009721       1 k8sclient.go:278] Replicas are not as expected : updating replicas from 2 to 3
I0616 11:52:55.008657       1 k8sclient.go:277] Cluster status: SchedulableNodes[7], SchedulableCores[28]
I0616 11:52:55.008690       1 k8sclient.go:278] Replicas are not as expected : updating replicas from 3 to 2

I am not quite sure how to read the date/timestamp. Does I0616 mean "Information", June 16th? In that case it would be from yesterday. The coredns pods print the following two lines to their logs over and over again:

[WARNING] No files matching import glob pattern: custom/*.override
[WARNING] No files matching import glob pattern: custom/*.server
manics commented 4 years ago

The singleuser pods don't have dig or nslookup installed so I am not sure how to test on one of them.

You can test it with python:
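
A minimal sketch using only the Python standard library (not necessarily the exact snippet from the gist) would be:

import socket
import urllib.request

# Resolve the hostname directly; socket.gaierror is raised if DNS fails
print(socket.gethostbyname("mybinder.org"))

# A full HTTP request exercises DNS plus outbound connectivity;
# the timeout keeps it from hanging the way the wget example did
print(urllib.request.urlopen("https://mybinder.org", timeout=10).status)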

sgibson91 commented 4 years ago

Is there anything that I might need to raise with Turing IT to get this resolved? They set up the subdomain *.mybinder.turing.ac.uk, so I won't have access to the turing.ac.uk side of things.

manics commented 4 years ago

I don't think the Turing DNS is the problem. If you launch https://turing.mybinder.org/v2/gist/manics/2545224d3c19ab381bfc899fa34c6e44/master?filepath=checkdns.ipynb on the Turing cluster you'll see that queries to the default DNS resolver are blocked, so it's most likely a K8s or Z2JH configuration issue.

Could you try temporarily disabling the Z2JH network policies?

consideRatio commented 4 years ago

@manics @sgibson91 oh, in the latest version of the z2jh helm chart I've made the default network policies allow DNS traffic! That version also enables network policies by default, which is a breaking change.

sgibson91 commented 4 years ago

So from what I can tell, we inherit network policies directly from the chart, here. The turing config contains no references to network policies. So why is this only a problem on turing? Or @consideRatio should we remove the network policies from the chart completely?

consideRatio commented 4 years ago

@sgibson91 I don't know enough, but to clarify the situation, you can do:

# list all network policy resources
kubectl get netpol --all-namespaces

# Inspect the pods running in kube-system. If you see something like Calico, for example,
# you have a network policy controller running in the k8s cluster, and only then
# will the network policies actually influence network traffic.
kubectl get pods -n kube-system

Relevant facts:

Suggestion: I think using the same configuration as the other deployments makes the most sense. If you have DNS issues specific to Turing that don't seem to arise from NetworkPolicy resources, I'd debug with nslookup example.domain.name from within the k8s pods that have issues, and then inspect the firewall rules for UDP and TCP traffic on port 53. I'd also check with ping mydnsnameserverip and with nslookup www.google.com 8.8.8.8 to query the well-known Google nameserver 8.8.8.8 directly. Note that if you don't have nslookup available, it can be installed from the apt package dnsutils, and ping from iputils-ping. Also, if you want, you can create a temporary pod to run commands from, like this.

# decide on a namespace to work within...

# start a temporary pod for debugging that keeps running
kubectl run debugging-pod --image=busybox --restart=Never -- sleep infinity
# add labels to it so it's targeted by the network policies that apply to singleuser-server pods
kubectl label pod debugging-pod --overwrite app=jupyterhub component=singleuser-server release=turing

# run one command inside the pod
kubectl exec -it debugging-pod -- nslookup www.google.com 8.8.8.8

# cleanup
kubectl delete pod debugging-pod
sgibson91 commented 4 years ago
$ kubectl get netpol --all-namespaces
NAMESPACE   NAME             POD-SELECTOR                                                AGE
turing      banned-ingress   app=nginx-ingress,component=controller,release=turing       15d
turing      binder-users     component in (dind,singleuser-server),release=turing        15d
turing      hub              app=jupyterhub,component=hub,release=turing                 15d
turing      proxy            app=jupyterhub,component=proxy,release=turing               15d
turing      singleuser       app=jupyterhub,component=singleuser-server,release=turing   15d

For the network policy controller, I used Azure's plugin. The option was either this or kubenet; it seems Calico is no longer supported (I remember it used to be).

$ kubectl get pods -n kube-system
NAME                                         READY   STATUS    RESTARTS   AGE
azure-cni-networkmonitor-lf7q9               1/1     Running   1          23d
azure-cni-networkmonitor-p4k9s               1/1     Running   1          23d
azure-cni-networkmonitor-p8lpq               1/1     Running   0          2d19h
azure-cni-networkmonitor-x999b               1/1     Running   0          2d19h
azure-cni-networkmonitor-z7gcp               1/1     Running   2          23d
azure-cni-networkmonitor-zmn7f               1/1     Running   0          2d19h
azure-ip-masq-agent-2wfzd                    1/1     Running   1          23d
azure-ip-masq-agent-7wn5v                    1/1     Running   0          2d19h
azure-ip-masq-agent-mjgml                    1/1     Running   0          2d19h
azure-ip-masq-agent-pkkdp                    1/1     Running   2          23d
azure-ip-masq-agent-zpgs9                    1/1     Running   0          2d19h
azure-ip-masq-agent-zqkhx                    1/1     Running   1          23d
azure-npm-cl5vd                              1/1     Running   0          2d19h
azure-npm-qkzd7                              1/1     Running   0          2d19h
azure-npm-tnp6d                              1/1     Running   1          21d
azure-npm-vffhf                              1/1     Running   2          21d
azure-npm-w9zn6                              1/1     Running   0          2d19h
azure-npm-wbdcx                              1/1     Running   1          21d
coredns-869cb84759-b6kkm                     1/1     Running   2          23d
coredns-869cb84759-f78cc                     1/1     Running   2          23d
coredns-autoscaler-5b867494f-pqvgp           1/1     Running   2          23d
dashboard-metrics-scraper-566c858889-8c5rp   1/1     Running   2          23d
kube-proxy-9ghsb                             1/1     Running   0          2d19h
kube-proxy-9tzqs                             1/1     Running   1          23d
kube-proxy-hcjx5                             1/1     Running   0          2d19h
kube-proxy-k6nks                             1/1     Running   0          2d19h
kube-proxy-njqv6                             1/1     Running   2          23d
kube-proxy-q27ql                             1/1     Running   1          23d
kubernetes-dashboard-7f7d6bbd7f-5chb6        1/1     Running   2          23d
metrics-server-5f4c878d8-tstdh               1/1     Running   0          17d
tiller-deploy-64c6dd8d6b-8vzxq               1/1     Running   1          23d
tunnelfront-7cb79788bd-xcbws                 1/1     Running   0          17d
consideRatio commented 4 years ago

@sgibson91 I added a labeling step to the comment I wrote above. That way, you can test network connectivity from a busybox container subject to the same networking constraints as a user pod.

# add labels to it so it's targeted by the network policies that apply to singleuser-server pods
kubectl label pod debugging-pod --overwrite app=jupyterhub component=singleuser-server release=turing

I wonder if the reason turing runs into this while GKE etc. don't is that the DNS lookups end up being affected in your deployment but not in the other clusters, perhaps because a somewhat local DNS server redirects traffic that is routed without being associated with the pod itself? Hmm... In general, though, I wonder whether the other deployments and turing have explicitly allowed DNS traffic, because to my knowledge it is forbidden by the network policies in old z2jh helm charts.

sgibson91 commented 4 years ago

@sgibson91 I don't know enough

You already know way more than I do 🙂

sgibson91 commented 3 years ago

Output of this:

# start a temporary pod for debugging that keeps running
kubectl run debugging-pod --image=busybox --restart=Never -- sleep infinity
# add labels to it so it's targeted by the network policies that apply to singleuser-server pods
kubectl label pod debugging-pod --overwrite app=jupyterhub component=singleuser-server release=turing

# run one command inside the pod
kubectl exec -it debugging-pod -- nslookup www.google.com 8.8.8.8

... is the following:

Server:     8.8.8.8
Address:    8.8.8.8:53

Non-authoritative answer:
Name:   www.google.com
Address: 172.217.19.196

*** Can't find www.google.com: No answer

I don't really know what this means?

consideRatio commented 3 years ago

I think it means your lookup resulted in an answer that was cached from a DNS server, but a follow-up lookup then failed. I'm not sure, but the desired outcome is a response without any hiccup like that final message.

sgibson91 commented 3 years ago

I followed up with Turing IT on this and now I'm pretty sure it's something within the cluster. Question is, what? I wonder if it's related to Azure's network policy https://docs.microsoft.com/en-us/azure/aks/use-network-policies#create-an-aks-cluster-and-enable-network-policy

manics commented 3 years ago

This is still a problem, which means notebooks that retrieve data from an external web resource will fail (unfortunately most of the ones I use do, though today is the first time I've ended up on the Turing cluster).

If you're allowed to give others access I'm happy to poke around your cluster when I have time.

sgibson91 commented 3 years ago

Allowed, yes. Can do it easily, not really. Last time I got an external collaborator added to a subscription, it involved getting them a Turing email address and signing Turing's privacy policy. @betatim has access though.

consideRatio commented 3 years ago

Is this cluster the only AKS based cluster in the mybinder federation?

Perhaps this is relevant, an issue that covers two potentially separate topics related to AKS:

manics commented 3 years ago

If @consideRatio's suggestions don't help we could perhaps have a screen-sharing session where you share your view of the Azure admin interface and a few of us think of random things to poke? 😄

sgibson91 commented 3 years ago

Now I'm back from leave - I'm very happy to share screen with anyone interested!

manics commented 3 years ago

How about next week? Do you want to start a <doodle-or-something-similar> poll?

sgibson91 commented 3 years ago

Poll link here: https://terminplaner.dfn.de/hLK738XeCNeWjeYe
If those who are interested could complete it before the end of day on Friday, I'll set up a meeting. Cheers!

sgibson91 commented 3 years ago

Thanks folks, I've closed the poll. @consideRatio and @manics, I've sent you both a calendar invite for Tuesday next week :)

manics commented 3 years ago

Summary of discussion

  1. We ran https://github.com/nicolaka/netshoot as an interactive pod in a new K8s namespace; network egress was allowed (nslookup www.google.com).
  2. We repeated the above in the turing namespace where BinderHub runs; it worked.
  3. We copied the pod labels from a singleuser pod onto the netshoot pod; DNS and HTTP requests failed. Both the BinderHub and Z2JH network policies apply to this pod.
  4. Removing the singleuser label meant only the BinderHub network policy applied, and egress worked again.
  5. Adding the singleuser label back stopped egress from working.
  6. Modifying the singleuser network policy to allow port 53 to 10.0.0.0/8 meant DNS lookups worked again (roughly as sketched below).
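
For reference, the temporary change from step 6 was roughly the following egress addition on the singleuser NetworkPolicy (illustrative; the exact manifest applied during the session may have differed slightly):

# appended to the egress rules of the existing singleuser NetworkPolicy
egress:
- to:
  - ipBlock:
      cidr: 10.0.0.0/8
  ports:
  - protocol: UDP
    port: 53
  - protocol: TCP
    port: 53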

Documentation on K8s network policies suggests that if multiple network policies apply to a pod the result should be that any ingress or egress permitted by any of the rules will apply ("Combining Multiple Policies" on https://www.tufin.com/blog/kubernetes-network-policies-for-security-people). These results suggest the singleuser network policy is overriding the binder-users policy on the Turing cluster.

This isn't seen on the other federation members, which suggests one of:

https://github.com/jupyterhub/mybinder.org-deploy/pull/1715 is a proposed quick-fix, but we should keep this issue open until we understand the underlying problem.

consideRatio commented 3 years ago

Wonderful summary @manics! :heart:

Documentation on K8s network policies suggests that if multiple network policies apply to a pod the result should be that any ingress or egress permitted by any of the rules will apply ("Combining Multiple Policies" on https://www.tufin.com/blog/kubernetes-network-policies-for-security-people).

I read you as saying that any one of the NetworkPolicy resources will make the call, while I understand the referenced document to say that all NetworkPolicy resources combine like a logical OR statement: all of them are considered, and if any one of them allows the network traffic, it is okay.

The official k8s documentation summarizes it well:

By default, pods are non-isolated; they accept traffic from any source.

Pods become isolated by having a NetworkPolicy that selects them. Once there is any NetworkPolicy in a namespace selecting a particular pod, that pod will reject any connections that are not allowed by any NetworkPolicy. (Other pods in the namespace that are not selected by any NetworkPolicy will continue to accept all traffic.)

Network policies do not conflict; they are additive. If any policy or policies select a pod, the pod is restricted to what is allowed by the union of those policies' ingress/egress rules. Thus, order of evaluation does not affect the policy result.

With this, I think we can conclude that there is a bug with the AKS NetworkPolicy enforcement somehow, because what we observed broke this premise.

consideRatio commented 3 years ago

I'm not sure what bug it is, but it seems they do have bugs in their NetworkPolicy enforcement specific to AKS, see for example: https://github.com/Azure/AKS/issues/1135.

consideRatio commented 3 years ago

I looked around in the Azure/AKS repo and could not find an issue that clearly matches ours, but after seeing several failures to comply with the Kubernetes intent for NetworkPolicy resources, and ways in which their implementation differs from Calico, I think it is safe to say that Azure "NPM" (Network Policy Manager) is not ready for production.

manics commented 3 years ago

https://github.com/jupyterhub/mybinder.org-deploy/pull/1715 was merged.

Testing with https://turing.mybinder.org/v2/gist/manics/2545224d3c19ab381bfc899fa34c6e44/master?filepath=checkdns.ipynb the behaviour is now inconsistent. Sometimes the default DNS server works, other times it fails but the external DNS server 8.8.8.8 works.

betatim commented 3 years ago

How did you make the second cell (of the notebook) fail? I ran it many times (like 30 times) and it always seemed to work :-/

manics commented 3 years ago

I didn't do anything special, I just ran it several times!

manics commented 3 years ago

Seems to be working fine for me now.... Maybe it was a transient K8s networking problem that's resolved itself?

minrk commented 3 years ago

Given that what's blocked is traffic to the Kubernetes internal DNS, this should be fixed by https://github.com/jupyterhub/zero-to-jupyterhub-k8s/pull/1670, specifically the explicit, unconditional addition of port 53 to the allowed egress: https://github.com/jupyterhub/zero-to-jupyterhub-k8s/pull/1670/files#diff-70983994b57e310d348c747400dd1ae5ea9f3e4efe63da0155789c3a6bd2a411R41-R46

So there are two possible fixes that might help:

  1. bump the jupyterhub chart (we should do this anyway, we haven't bumped it since 0.9)
  2. add port 53 explicitly to the egress in the singleuser policy in this chart (i.e. add the same egress rules as in the jupyterhub 0.10 chart), roughly as sketched below
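
For option 2, the change is roughly an unconditional DNS egress entry on the singleuser policy, along these lines (a sketch, not the exact rule from the z2jh PR):

# egress rule with no "to" restriction: DNS to any destination is allowed
egress:
- ports:
  - protocol: UDP
    port: 53
  - protocol: TCP
    port: 53
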
minrk commented 3 years ago

I've never seen netshoot, that's cool! I ran my own test today to see if I could learn anything and accidentally stumbled upon a real mystery!

I duplicated the two network policies, adding -test to selectors, etc. and ran netshoot. So binder-users-test had:

  podSelector:
    matchExpressions:
    - key: component
      operator: In
      values:
      - dind-test
      - singleuser-server-test
    matchLabels:
      release: turing

and singleuser-test had:

  podSelector:
    matchLabels:
      app: jupyterhub
      component: singleuser-server-test
      release: turing

Creating a pod with labels:

      app: jupyterhub
      component: singleuser-server-test
      release: turing

With those labels, DNS from the pod failed, just as on the real user pods. However, deleting the dind-test item from the matchExpressions list results in it working (?!). Upon discovering this, I dug a little more, and the issue is that our matchExpressions selector is not being considered to apply to the singleuser pods at all. This is clearly wrong: a bug in how the AKS network policy enforcement evaluates podSelectors (who knows exactly where!).

This means the bug was that our binder-users policy was (and is) not being applied to user pods on AKS at all. As a result, the jupyterhub singleuser netpol is the only policy that applies, and it denies DNS by default. Disabling the singleuser policy fixes the DNS issue because then no network policy selects the single-user pods, which returns them to allowing all egress traffic.

consideRatio commented 3 years ago

@minrk ah, nice job pinning down the issue! I think you can report this in https://github.com/Azure/AKS

minrk commented 3 years ago

opened https://github.com/Azure/AKS/issues/2006

We could get around this by using a loop and generating two policies that use matchLabels, roughly as sketched below.
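
A sketch of what that workaround could look like as a Helm template, generating one policy per component with plain matchLabels (names and layout here are illustrative, not the chart's actual templates):

{{- range $component := list "dind" "singleuser-server" }}
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: binder-users-{{ $component }}
spec:
  podSelector:
    # plain matchLabels, avoiding the matchExpressions selector that AKS mishandles
    matchLabels:
      component: {{ $component }}
      release: {{ $.Release.Name }}
  # ...the same ingress/egress rules as the current binder-users policy...
{{- end }}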

sgibson91 commented 3 years ago

I spun up a new (small) cluster with the kubenet plugin and the Calico policy controller, and this seems to have resolved the issue. Now we know that it'll be worth my time to tear the Turing cluster down and redeploy :D