kubernetes-sigs / kubespray

Deploy a Production Ready Kubernetes Cluster

CoreDNS pod goes to CrashLoopBackOff State #11093

Closed sakshiarora13 closed 5 months ago

sakshiarora13 commented 5 months ago

What happened?

While bringing up a Kubernetes cluster, the CoreDNS pods sometimes go into the CrashLoopBackOff state. [screenshot]

After looking at the logs, I assumed something was wrong in the /etc/resolv.conf file. The configuration in /etc/resolv.conf is always the same, yet the pods only intermittently go into CrashLoopBackOff. [screenshot]

CoreDNS is using the /etc/resolv.conf file: [screenshot]

But the kubelet is using /etc/kubernetes/kubelet-config.yaml: [screenshot]

So can someone help me understand which config file is actually being used?

I tried removing 127.0.0.53 from the /etc/resolv.conf file, but CoreDNS was still in CrashLoopBackOff.
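
For reference, 127.0.0.53 is the systemd-resolved stub address, and a loopback nameserver in the resolv.conf that CoreDNS inherits is a commonly documented cause of forwarding loops. A minimal Python sketch for spotting such entries could look like the following. The function name `loopback_nameservers` and the sample file contents are illustrative assumptions, not part of Kubespray or CoreDNS.

```python
import ipaddress


def loopback_nameservers(resolv_conf_text):
    """Return the nameserver entries that point at a loopback address."""
    found = []
    for line in resolv_conf_text.splitlines():
        # Drop comments (resolv.conf accepts '#' and ';') and whitespace.
        line = line.split("#", 1)[0].split(";", 1)[0].strip()
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            try:
                addr = ipaddress.ip_address(parts[1])
            except ValueError:
                continue  # skip entries that are not plain IP addresses
            if addr.is_loopback:
                found.append(parts[1])
    return found


# Hypothetical file contents, for illustration only.
sample = """\
nameserver 127.0.0.53
nameserver 10.20.0.2
options edns0 trust-ad
"""

print(loopback_nameservers(sample))
```

An empty result only rules out loopback entries. A non-loopback nameserver can still end up forwarding queries back to CoreDNS, so this is a first check rather than a full diagnosis.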

What did you expect to happen?

CoreDNS pod should be running

How can we reproduce it (as minimally and precisely as possible)?

(Not always reproducible.) Kubespray version 1.23.1; run cluster.yml with the inventory below.

OS

Ubuntu: 22.04

Version of Ansible

ansible [core 2.14.14]

Version of Python

3.9.16

Version of Kubespray (commit)

10679eb

Network plugin used

calico

Full inventory with variables

kube_version: "v1.26.12"
container_manager: "containerd"
dashboard_enabled: true
helm_enabled: true
kube_network_plugin: "calico"
metallb_enabled: true
metallb_speaker_enabled: true
kube_proxy_strict_arp: true
kube_proxy_mode: 'iptables'
override_system_hostname: false
populate_inventory_to_hosts_file: false
enable_nodelocaldns: false
unsafe_show_logs: true

Command used to invoke ansible

invoked using ansible collection

Output of ansible run

completed with success

Anything else we need to know

No response

wandersonlima commented 5 months ago

The CoreDNS pods are caught in a DNS forwarding loop, which is why they keep restarting.

Try changing the forward field in the configmap.

forward . dns_ip {
    prefer_udp
    max_concurrent 1000
}

sakshiarora13 commented 5 months ago

@wandersonlima

I tried the same. [screenshot]

The CoreDNS pods are still in CrashLoopBackOff. [screenshot]

Logs: [screenshot]

10.20.0.2 is my server IP. [screenshot]

My /etc/resolv.conf: [screenshot]

wandersonlima commented 5 months ago

@sakshiarora13 The IP placed in the forward field cannot be the CoreDNS address; otherwise a loop will occur. You need to set it to a local or external DNS server.
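
To review what the forward field currently points at, one option is to pull the upstreams out of the Corefile text stored in the ConfigMap. The Python sketch below is a rough illustration under stated assumptions: the helper name `forward_upstreams` is hypothetical, the sample Corefile is invented for the example, and 192.0.2.53 is a documentation-range placeholder rather than a real DNS server.

```python
def forward_upstreams(corefile_text):
    """Collect the upstream targets named in `forward FROM TO...` lines."""
    upstreams = []
    for line in corefile_text.splitlines():
        tokens = line.split("#", 1)[0].split()
        # tokens[0] is the directive, tokens[1] the zone, the rest are targets.
        if len(tokens) >= 3 and tokens[0] == "forward":
            upstreams.extend(t for t in tokens[2:] if t != "{")
    return upstreams


# Hypothetical Corefile, for illustration only.
sample_corefile = """\
.:53 {
    errors
    loop
    forward . 192.0.2.53 {
        prefer_udp
        max_concurrent 1000
    }
    cache 30
}
"""

print(forward_upstreams(sample_corefile))
```

Each address it returns should belong to a resolver outside the cluster DNS path. If the result is a file path such as /etc/resolv.conf, the nameservers in that file are the ones to inspect.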

sakshiarora13 commented 5 months ago

I filled it in with my local DNS server and it works now. Thanks a lot for your help @wandersonlima :)