projectcalico / calico

Cloud native networking and network security
https://docs.tigera.io/calico/latest/about/
Apache License 2.0

egress traffic tutorial fails #2016

Closed ctaggart closed 6 years ago

ctaggart commented 6 years ago

Expected Behavior

Step 5 to allow egress traffic should work: https://docs.projectcalico.org/v3.1/getting-started/kubernetes/tutorials/advanced-policy

Current Behavior

nslookup is still denied until all network policies with egress are removed

kubectl get netpol --all-namespaces
kubectl delete netpol -n advanced-policy-demo default-deny-egress
kubectl delete netpol -n advanced-policy-demo allow-dns-access

Possible Solution

Steps to Reproduce (for bugs)

Follow the tutorial

Context

I need to set up egress and it isn't working.

Your Environment

My environment is documented here

ctaggart commented 6 years ago

I just tested in Calico 2.6.9 on top of Minikube + Kubernetes 1.9.4, following: https://docs.projectcalico.org/v2.6/getting-started/kubernetes/tutorials/advanced-policy

It also fails for egress. 😞

ctaggart commented 6 years ago

Calico 2.6.2 also fails for egress. I installed it like so:

curl -O -L https://docs.projectcalico.org/v2.6/getting-started/kubernetes/installation/hosted/kubeadm/1.6/calico.yaml
sed -i -e '/nodeSelector/d' calico.yaml
sed -i -e '/node-role.kubernetes.io\/master: ""/d' calico.yaml
sed -i -e 's/2.6.9/2.6.2/g' calico.yaml
kubectl apply -f calico.yaml

caseydavenport commented 6 years ago

> I just tested in Calico 2.6.9 on top of Minikube + Kubernetes 1.9.4, following:

I know we've had some compat issues with the standard minikube setup in the past. I wonder if any of that is coming into play.

It could be a docs bug in the advanced guide, though.

ctaggart commented 6 years ago

I got it working in our lab environment, which is not minikube. Some differences that I'm not sure if they come into play are that it uses

caseydavenport commented 6 years ago

@ctaggart curious. I wouldn't expect those to impact the guide.

I suspect it's a minikube issue.

bcreane commented 6 years ago

As @ctaggart observes, the advanced tutorial doesn't work, even with kubeadm clusters.

It looks like the issue is that the following line isn't labeling pods correctly: kubectl label namespace kube-system name=kube-system
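One way to check whether the label from that command actually landed on the namespace is to read it back. The following is a minimal sketch, not part of the tutorial: the helper name `namespace_has_name_label` is hypothetical, and it assumes `kubectl` is configured against the cluster under test.

```shell
# Hypothetical helper: succeeds when namespace $1 carries the label name=$1,
# which is what the tutorial's allow-dns-access policy is described as matching.
namespace_has_name_label() {
  ns="$1"
  # jsonpath prints the value of the "name" label, or nothing if it is absent.
  actual="$(kubectl get namespace "$ns" -o jsonpath='{.metadata.labels.name}')"
  [ "$actual" = "$ns" ]
}
```

For example, `namespace_has_name_label kube-system && echo labeled || echo missing` would report whether the namespace selector in the policy has anything to match.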

If you modify the allow-dns-access policy, and delete the line that matches the kube-system name label (name: kube-system), the tutorial works.
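As a rough illustration of that modification, the sketch below writes a relaxed variant of the policy to a file. The manifest layout is an assumption reconstructed from the description in this thread, not a verbatim copy of the tutorial, and the file name is hypothetical. With the `name: kube-system` match removed, the namespace selector is empty and matches every namespace, so this is only useful as a diagnostic step.

```shell
# Hypothetical file name; the policy shape is assumed, not copied from the tutorial.
cat > allow-dns-access-relaxed.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-access
  namespace: advanced-policy-demo
spec:
  # Empty pod selector: applies to every pod in the namespace.
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    # Empty namespace selector: matches all namespaces, since the
    # kube-system name-label match has been removed.
    - namespaceSelector: {}
    ports:
    - protocol: UDP
      port: 53
EOF

# To try it against a live cluster (left commented out on purpose):
# kubectl apply -f allow-dns-access-relaxed.yaml
```

If DNS lookups start working with this variant but not with the labeled selector, that suggests the namespace label is not being matched.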

@caseydavenport - I'm curious if this could be specific to k8s v1.10? I assume we have FVs that validate that we handle labels correctly ...

bcreane commented 6 years ago

I see this also with k8s v1.8.13. Will try Calico v3.0 to make sure this isn't a regression.

... also fails with k8s v1.8.13 + Calico v2.6.10.

bcreane commented 6 years ago

@ctaggart - I think we both made the same mistake of assuming that wget should run after step 5. In fact the tutorial mentions that only DNS is allowed, so only nslookup works at that point in the tutorial. I re-ran the tutorial, and this time everything is copasetic.

I've opened another ticket to re-work this tutorial according to the Calico style guide which should greatly reduce the likelihood of other people making this mistake.

I'll leave this issue open for another day or two. If you get a chance to try the tutorial again, I'd appreciate hearing back.

ctaggart commented 6 years ago

@bcreane No, it was nslookup that was failing for me, as I said above.

bcreane commented 6 years ago

@ctaggart, right, missed that. I'm the only one not reading closely today.

In that case, @caseydavenport's supposition that this is a minikube issue looks likely since the tutorial works fine with a kubeadm cluster.

ctaggart commented 6 years ago

Is someone able to reproduce the issue on Minikube? I fully documented my setup.

bcreane commented 6 years ago

@ctaggart - I reproduced the minikube v0.27.0 issue that you describe.

After applying the four policies (deny-all-ingress, allow-nginx-ingress, deny-all-egress and allow-dns-egress), I see substantial differences in the iptables/ipset rules that Calico creates on minikube versus kubeadm. See minikube-iptables versus kubeadm-iptables. The ipsets on kubeadm contain the kube-dns IP address (for a particular iptables rule), whereas the minikube ipsets are empty. calicoctl node diags also hangs on minikube; it looks like our iptables-programming container (Felix) times out when dumping its status.
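The empty-ipset symptom can be spotted without reading the full dump. Below is a small sketch, assuming the usual `ipset list` text layout (a `Name:` line per set, then a `Members:` line followed by one member per line); the function name `empty_ipsets` is hypothetical and is not a Calico or ipset tool.

```shell
# Hypothetical helper: reads `ipset list` output on stdin and prints the
# name of every set that has no members listed after its "Members:" line.
empty_ipsets() {
  awk '
    /^Name:/    { if (name != "" && members == 0) print name
                  name = $2; members = 0; in_members = 0; next }
    /^Members:/ { in_members = 1; next }
    in_members && NF > 0 { members++ }
    END         { if (name != "" && members == 0) print name }
  '
}
```

On a node, this might be run as `sudo ipset list | empty_ipsets`. It reports all empty sets, not only those created by Calico, so the output still needs to be compared against the rules that reference them.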

I noticed another interesting difference, which points to differences in how minikube and kubeadm handle cluster networking: on kubeadm, inter-pod traffic is visible from the node (e.g. sudo tcpdump -i any host kube-dns-ip-addr and udp and port 53 shows busybox DNS requests on kubeadm); however, inter-pod traffic doesn't seem to be directly visible from the node with minikube.

Although the ingress rules appear to be working on minikube, it looks like calico doesn't have a complete view of the cluster and is not necessarily creating correct iptables/ipsets. My quick take is that minikube + calico looks untrustworthy at this point.

bcreane commented 6 years ago

@ctaggart - after consulting with @tmjd, I re-ran minikube, this time specifying --vmdriver none. It looks like the iptables rules are a lot more sane with this flag. Have you tried that setting?

bcreane commented 6 years ago

Yes, it looks like specifying --vmdriver none fixes the advanced tutorial. @ctaggart, please give that a try.

ctaggart commented 6 years ago

I can try it out sometime within the next week. Slammed with some deadlines at the moment.

ctaggart commented 6 years ago

You can probably close this. Unfortunately, I'm having trouble setting up a minikube cluster with --vmdriver none. I mentioned that in https://github.com/projectcalico/calico/issues/1456