kelseyhightower / kubernetes-the-hard-way

Bootstrap Kubernetes the hard way. No scripts.
Apache License 2.0

Step 12: CoreDNS not starting pods #662

Open dinkargupta opened 3 years ago

dinkargupta commented 3 years ago

Everything worked well up to route creation, but deploying CoreDNS at step 12 is not working. I tried debugging a bit but got no insight into what went wrong or what is missing: there are no events or log entries related to a failure, and hardly any entries at all. The steps finish without any error; the DNS pods just stay in the Pending state, so nothing else moves forward.

I tried the Calico advice too, but that didn't help either. What am I missing? I am running it on GCP.

Oyelowo commented 3 years ago

Experienced the same issue

pawelkuk commented 3 years ago

same here

cbbm142 commented 3 years ago

This may not be helpful since this was originally opened in May, but I ran into this issue as well. In my case, it was a problem with my kube-proxy service that was preventing proper communication on the cluster service network (10.32.0.0/24). I had missed the step to create the kube-proxy YAML file. Checking systemd showed kube-proxy working fine, but once I created that YAML file and restarted the proxy, everything worked.
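A minimal sketch for checking this on a worker, assuming the config lives at /var/lib/kube-proxy/kube-proxy-config.yaml (the path used by the tutorial's kube-proxy unit file; adjust it if yours differs). The helper name `check_kube_proxy_config` is made up for illustration.

```shell
# Hypothetical helper: report whether the kube-proxy config file exists
# and is non-empty. The default path is an assumption taken from the
# tutorial's worker setup; pass another path as the first argument.
check_kube_proxy_config() {
  config="${1:-/var/lib/kube-proxy/kube-proxy-config.yaml}"
  if [ -s "$config" ]; then
    echo "present: $config"
  else
    echo "missing or empty: $config"
    return 1
  fi
}
```

If the file turns out to be missing, recreate it as described in the worker bootstrapping step, then restart the service with `sudo systemctl restart kube-proxy` and look at `sudo journalctl -u kube-proxy` for errors. As noted above, systemd may report the unit as running even when the config is absent.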

wiggitywhitney commented 3 years ago

☆彡(ノ^^)ノ This fixed my problem. Thank you so, so much @cbbm142 !!!

mikky-is commented 2 years ago

If your kube-proxy configuration is okay, tracing with iptables may show that masquerading works when the packet is received, but that there is no trace of the return packet. After testing, this appears to be a general problem on compute disks imaged with ubuntu-2004-focal-v20211202 (not tested with other versions).

As described in Kubernetes issue #21613, when your DNS and busybox pods are on the same node, you might need an additional kernel module to reverse the DNAT on the returning packet.

Installation steps (remember to repeat them on all workers):

root@worker-0:~# modprobe br_netfilter
root@worker-0:~# sysctl net.bridge.bridge-nf-call-iptables=1
1

This issue prevents any pod-to-pod communication on the same node.
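To confirm the setting on each worker, here is a small sketch. It assumes the value is exposed at /proc/sys/net/bridge/bridge-nf-call-iptables, which is normally where `sysctl net.bridge.bridge-nf-call-iptables` reads it from once br_netfilter is loaded. The function name `bridge_nf_state` is hypothetical.

```shell
# Hypothetical helper: report the bridge netfilter state on this node.
# The proc path is an assumption; pass another file as the first argument.
bridge_nf_state() {
  proc_file="${1:-/proc/sys/net/bridge/bridge-nf-call-iptables}"
  if [ ! -r "$proc_file" ]; then
    # The entry is typically absent until the br_netfilter module is loaded.
    echo "br_netfilter not loaded"
    return 1
  fi
  if [ "$(cat "$proc_file")" = "1" ]; then
    echo "enabled"
  else
    echo "disabled"
    return 1
  fi
}
```

Note that `modprobe` and `sysctl` changes made this way do not survive a reboot. To persist them, the module name can usually go in a file under /etc/modules-load.d/ and the sysctl setting in a file under /etc/sysctl.d/.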

perrness commented 1 year ago

For me, the issue was that the scheduler had never started; it just kept failing.
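Pods that sit in Pending with no events at all are a common sign that nothing is scheduling them. As a rough sketch, the hypothetical helper below counts Pending pods from `kubectl get pods` output, assuming the default column layout where STATUS is the third column.

```shell
# Hypothetical helper: count pods whose STATUS column is "Pending".
# Reads `kubectl get pods` output on stdin and skips the header row.
# Assumes the default layout: NAME READY STATUS RESTARTS AGE.
count_pending() {
  awk 'NR > 1 && $3 == "Pending" { n++ } END { print n + 0 }'
}
```

For example, `kubectl get pods -n kube-system -l k8s-app=kube-dns | count_pending`. If that prints a non-zero count, it is worth running `sudo systemctl is-active kube-scheduler` and `sudo journalctl -u kube-scheduler` on the controllers, assuming the scheduler runs as a systemd unit with that name as in the tutorial.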