terricain closed 2 years ago
^^ fixed cni version typo :)
So, @terrycain and I are both working on this. If it's helpful, I can convert the Chef cookbooks we've written to deploy these clusters into a simplified bash version along with some basic Terraform. Hopefully that would make things easier to replicate.
This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days
We have a Kubernetes cluster in Azure, running Kubernetes 1.18.6 and Azure CNI 1.1.5 on Ubuntu 16.04. As the kubelet starts it runs the iptables command from here.
What happened:
Upgraded a worker to 18.04. The kubelet was failing to register the node.
Kubelet log: lots of
I eventually saw
So I added
/sbin/iptables -t nat -I POSTROUTING -d 169.254.169.254 -j RETURN
before running kubelet (with systemd ExecStartPre). Kubelet started but was failing to start a pod in the default namespace.
I also repeated it with CNI 1.1.3 and got the same errors.
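For anyone replicating the ExecStartPre workaround, here is a minimal sketch of adding the rule idempotently, so that repeated kubelet restarts do not stack duplicate copies of it. The `ensure_rule` helper and the `IPTABLES` and `APPLY` variables are illustrative names, not part of kubelet or Azure CNI; the only real tool behaviour relied on is `iptables -C`, which checks whether a rule already exists. By default the script only prints the command it would run.

```shell
#!/usr/bin/env bash
# Sketch only: ensure_rule, IPTABLES and APPLY are hypothetical names
# introduced for this example.

IPTABLES="${IPTABLES:-/sbin/iptables}"
APPLY="${APPLY:-0}"   # 0 = print the command only, 1 = actually run it

ensure_rule() {
  # usage: ensure_rule <-I|-A> <chain> <rule-spec...>  (nat table only)
  local op="$1" chain="$2"
  shift 2
  if [ "$APPLY" != "1" ]; then
    echo "$IPTABLES -t nat $op $chain $*"
    return 0
  fi
  # -C (--check) exits 0 when an identical rule is already present.
  if ! "$IPTABLES" -t nat -C "$chain" "$@" 2>/dev/null; then
    "$IPTABLES" -t nat "$op" "$chain" "$@"
  fi
}

# The rule from this issue: return early for traffic to 169.254.169.254
# so that it is not NATed by later POSTROUTING rules.
ensure_rule -I POSTROUTING -d 169.254.169.254 -j RETURN
```

Run as-is, it prints the command; with `APPLY=1` (and root) it checks for the rule and only inserts it when it is missing.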
What you expected to happen:
It to just work
How to reproduce it:
Not entirely sure how you would, as everyone is doing AKS these days :D
- Stand up a VNet in Azure with 2 VMs running Ubuntu 18.04
- Set up a master node and a worker node (with the 30 IPs on the NIC)
- On the worker node, start kubelet with the 2 iptables rules mentioned above
Orchestrator and Version (e.g. Kubernetes, Docker):
Kubernetes 1.18.6 Containerd 1.2.13
Operating System (Linux/Windows):
Ubuntu 18.04
Kernel (e.g. `uname -a` for Linux or `$(Get-ItemProperty -Path "C:\windows\system32\hal.dll").VersionInfo.FileVersion` for Windows):
Linux tools-test-worker-00 5.3.0-1032-azure #33~18.04.1-Ubuntu SMP Fri Jun 26 15:01:15 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Anything else we need to know?:
iptables whilst kubelet was crashing (using /sbin/iptables -t nat -A POSTROUTING -m addrtype ! --dst-type local ! -d 10.4.0.0/18 -j MASQUERADE)
I snipped out some rules, but I don't think that they were relevant.
```
root@tools-test-worker-00:~# iptables -t nat -L
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination
KUBE-SERVICES  all  --  anywhere             anywhere             /* kubernetes service portals */

Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
KUBE-SERVICES  all  --  anywhere             anywhere             /* kubernetes service portals */

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination
KUBE-POSTROUTING  all  --  anywhere             anywhere             /* kubernetes postrouting rules */
MASQUERADE  all  --  anywhere            !10.4.0.0/18          ADDRTYPE match dst-type !LOCAL

Chain KUBE-KUBELET-CANARY (0 references)
target     prot opt source               destination

Chain KUBE-MARK-DROP (2 references)
target     prot opt source               destination
MARK       all  --  anywhere             anywhere             MARK or 0x8000

Chain KUBE-MARK-MASQ (68 references)
target     prot opt source               destination
MARK       all  --  anywhere             anywhere             MARK or 0x4000

Chain KUBE-NODEPORTS (1 references)
target     prot opt source               destination
KUBE-MARK-MASQ  tcp  --  localhost/8          anywhere             /* ingress-nginx/ingress-nginx:https */ tcp dpt:30001
KUBE-XLB-4E7KSV2ABIFJRAUZ  tcp  --  anywhere             anywhere             /* ingress-nginx/ingress-nginx:https */ tcp dpt:30001
KUBE-MARK-MASQ  tcp  --  localhost/8          anywhere             /* ingress-nginx/ingress-nginx:http */ tcp dpt:30000
KUBE-XLB-REQ4FPVT7WYF4VLA  tcp  --  anywhere             anywhere             /* ingress-nginx/ingress-nginx:http */ tcp dpt:30000

Chain KUBE-POSTROUTING (1 references)
target     prot opt source               destination
RETURN     all  --  anywhere             anywhere             mark match ! 0x4000/0x4000
MARK       all  --  anywhere             anywhere             MARK xor 0x4000
MASQUERADE  all  --  anywhere             anywhere             /* kubernetes service traffic requiring SNAT */

Chain KUBE-PROXY-CANARY (0 references)
target     prot opt source               destination
...

Chain KUBE-SEP-4VDVLS7C74EHFEEO (1 references)
target     prot opt source               destination
KUBE-MARK-MASQ  all  --  10.4.0.13            anywhere             /* kube-system/coredns:dns */
DNAT       udp  --  anywhere             anywhere             /* kube-system/coredns:dns */ udp to:10.4.0.13:53

Chain KUBE-SEP-63BCXCZXITMJZ6KL (1 references)
target     prot opt source               destination
KUBE-MARK-MASQ  all  --  10.4.0.37            anywhere             /* default/kibana:http */
DNAT       tcp  --  anywhere             anywhere             /* default/kibana:http */ tcp to:10.4.0.37:5601
...

Chain KUBE-SERVICES (2 references)
target     prot opt source               destination
KUBE-MARK-MASQ  tcp  -- !10.4.0.0/18          172.16.0.1           /* default/kubernetes:https cluster IP */ tcp dpt:https
KUBE-SVC-NPX46M4PTMTKRN6Y  tcp  --  anywhere             172.16.0.1           /* default/kubernetes:https cluster IP */ tcp dpt:https
KUBE-MARK-MASQ  udp  -- !10.4.0.0/18          172.16.0.10          /* kube-system/coredns:dns cluster IP */ udp dpt:domain
KUBE-SVC-ZRLRAB2E5DTUX37C  udp  --  anywhere             172.16.0.10          /* kube-system/coredns:dns cluster IP */ udp dpt:domain
KUBE-MARK-MASQ  tcp  -- !10.4.0.0/18          172.16.0.10          /* kube-system/coredns:dns-tcp cluster IP */ tcp dpt:domain
KUBE-SVC-FAITROITGXHS3QVF  tcp  --  anywhere             172.16.0.10          /* kube-system/coredns:dns-tcp cluster IP */ tcp dpt:domain
KUBE-MARK-MASQ  tcp  -- !10.4.0.0/18          172.16.11.131        /* default/kibana:http cluster IP */ tcp dpt:http
KUBE-SVC-GYQBIT3U2LMZ4H3E  tcp  --  anywhere             172.16.11.131        /* default/kibana:http cluster IP */ tcp dpt:http
KUBE-NODEPORTS  all  --  anywhere             anywhere             /* kubernetes service nodeports; NOTE: this must be the last rule in this chain */ ADDRTYPE match dst-type LOCAL

Chain KUBE-SVC-FAITROITGXHS3QVF (1 references)
target     prot opt source               destination
KUBE-SEP-V2TWSMP5HSEO77UF  all  --  anywhere             anywhere             /* kube-system/coredns:dns-tcp */ statistic mode random probability 0.50000000000
KUBE-SEP-AQIH42UCMJIZ52GE  all  --  anywhere             anywhere             /* kube-system/coredns:dns-tcp */

Chain KUBE-SVC-GYQBIT3U2LMZ4H3E (1 references)
target     prot opt source               destination
KUBE-SEP-63BCXCZXITMJZ6KL  all  --  anywhere             anywhere             /* default/kibana:http */

Chain KUBE-XLB-4E7KSV2ABIFJRAUZ (1 references)
target     prot opt source               destination
KUBE-SVC-4E7KSV2ABIFJRAUZ  all  --  10.4.0.0/18          anywhere             /* Redirect pods trying to reach external loadbalancer VIP to clusterIP */
KUBE-MARK-MASQ  all  --  anywhere             anywhere             /* masquerade LOCAL traffic for ingress-nginx/ingress-nginx:https LB IP */ ADDRTYPE match src-type LOCAL
KUBE-SVC-4E7KSV2ABIFJRAUZ  all  --  anywhere             anywhere             /* route LOCAL traffic for ingress-nginx/ingress-nginx:https LB IP to service chain */ ADDRTYPE match src-type LOCAL
KUBE-MARK-DROP  all  --  anywhere             anywhere             /* ingress-nginx/ingress-nginx:https has no local endpoints */

Chain KUBE-XLB-REQ4FPVT7WYF4VLA (1 references)
target     prot opt source               destination
KUBE-SVC-REQ4FPVT7WYF4VLA  all  --  10.4.0.0/18          anywhere             /* Redirect pods trying to reach external loadbalancer VIP to clusterIP */
KUBE-MARK-MASQ  all  --  anywhere             anywhere             /* masquerade LOCAL traffic for ingress-nginx/ingress-nginx:http LB IP */ ADDRTYPE match src-type LOCAL
KUBE-SVC-REQ4FPVT7WYF4VLA  all  --  anywhere             anywhere             /* route LOCAL traffic for ingress-nginx/ingress-nginx:http LB IP to service chain */ ADDRTYPE match src-type LOCAL
KUBE-MARK-DROP  all  --  anywhere             anywhere             /* ingress-nginx/ingress-nginx:http has no local endpoints */
```
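Since nat rules are evaluated in order, one quick thing to check on a worker is whether a RETURN rule for 169.254.169.254 sits ahead of the first MASQUERADE rule in POSTROUTING (the dump above has no such RETURN rule). A rough sketch, assuming the rule-spec form printed by `iptables -t nat -S POSTROUTING`; the function name `return_precedes_masquerade` is made up for this example.

```shell
#!/usr/bin/env bash
# Sketch only: return_precedes_masquerade is a hypothetical helper name.
# It reads `iptables -t nat -S POSTROUTING` style lines on stdin and exits 0
# only if a RETURN rule for 169.254.169.254 exists and appears before the
# first MASQUERADE rule (or no MASQUERADE rule exists at all).
return_precedes_masquerade() {
  awk '
    /-d 169\.254\.169\.254(\/32)? / && /-j RETURN/ { if (!ret) ret = NR }
    /-j MASQUERADE/                                { if (!masq) masq = NR }
    END { exit !(ret && (!masq || ret < masq)) }
  '
}

# Example use on a node (needs root):
#   iptables -t nat -S POSTROUTING | return_precedes_masquerade \
#     && echo "RETURN rule is ahead of MASQUERADE" \
#     || echo "RETURN rule missing or after MASQUERADE"
```

This only looks at the top-level POSTROUTING chain; it does not follow jumps into chains such as KUBE-POSTROUTING.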
So I'm sure I'm missing something simple. If you need any more info/logs just ask, I can easily add a 18.04 worker to the cluster I have running.