k3s-io / k3s

Lightweight Kubernetes
https://k3s.io
Apache License 2.0

Intermittent kube-dns Service CIDR IP address conflict #10611

Status: Closed (apcheamitru closed this issue 1 month ago)

apcheamitru commented 3 months ago

Environmental Info:

K3s Version:

k3s version v1.24.17+k3s1 (026bb0ec)
go version go1.20.7

Node(s) CPU architecture, OS, and Version:

Linux k3s.host 6.1.0-22-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.94-1 (2024-06-21) x86_64 GNU/Linux

Cluster Configuration: Single node -- contents of /etc/rancher/k3s/config.yaml:

---
# https://rancher.com/docs/k3s/latest/en/installation/install-options/#configuration-file
cluster-cidr: 10.42.0.0/24
cluster-dns: 10.43.0.10
debug: false
disable:
- local-storage
- traefik
disable-cloud-controller: true
disable-kube-proxy: false
disable-network-policy: true
node-name: localhost.localdomain
service-cidr: 10.43.0.0/24
snapshotter: overlayfs
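
As a quick sanity check on the configuration above, the following minimal Python sketch confirms that the configured cluster-dns address lies inside the service-cidr range, which is why the allocator can hand it out to another service. The helper name `dns_ip_is_allocatable` is hypothetical; the values are copied from the config file.

```python
import ipaddress

# Values copied from /etc/rancher/k3s/config.yaml above.
service_cidr = ipaddress.ip_network("10.43.0.0/24")
cluster_dns = ipaddress.ip_address("10.43.0.10")


def dns_ip_is_allocatable(dns_ip, cidr):
    """Return True if dns_ip is inside cidr and is neither the network
    nor the broadcast address (hypothetical helper for illustration)."""
    return dns_ip in cidr and dns_ip not in (
        cidr.network_address,
        cidr.broadcast_address,
    )


print(dns_ip_is_allocatable(cluster_dns, service_cidr))  # True
print(service_cidr.num_addresses - 2)  # 254 addresses between network and broadcast
```

With only 254 candidate addresses in a /24, a dynamically allocated service landing on 10.43.0.10 is unlikely on any single allocation but not impossible, which fits the intermittent nature of the report.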

Describe the bug: A ClusterIP service created without an explicit ClusterIP was assigned 10.43.0.10, which is the configured cluster-dns IP. Presumably it was allocated that address before the kube-dns Service was created. This prevented kube-dns and other services from starting successfully.

I was able to work around the issue by deleting and recreating the offending service. It came up with a new IP address and all services were able to start successfully once the conflict was removed.
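
To find the offending service before deleting it, one option is to scan the service list for the DNS address. The sketch below assumes the input has the shape produced by `kubectl get services --all-namespaces -o json`; the sample data, the `example/my-app` service, and the helper name `services_holding_ip` are hypothetical and stand in for real cluster output.

```python
DNS_IP = "10.43.0.10"  # cluster-dns from config.yaml


def services_holding_ip(service_list, ip):
    """Return (namespace, name) for every Service whose ClusterIP(s) include ip."""
    holders = []
    for item in service_list.get("items", []):
        spec = item.get("spec", {})
        ips = set(spec.get("clusterIPs") or [])
        ips.add(spec.get("clusterIP"))
        if ip in ips:
            meta = item["metadata"]
            holders.append((meta["namespace"], meta["name"]))
    return holders


# Hypothetical sample standing in for parsed kubectl JSON output.
sample = {
    "items": [
        {
            "metadata": {"namespace": "default", "name": "kubernetes"},
            "spec": {"clusterIP": "10.43.0.1", "clusterIPs": ["10.43.0.1"]},
        },
        {
            "metadata": {"namespace": "example", "name": "my-app"},
            "spec": {"clusterIP": "10.43.0.10", "clusterIPs": ["10.43.0.10"]},
        },
    ]
}

for namespace, name in services_holding_ip(sample, DNS_IP):
    print(f"{namespace}/{name} holds {DNS_IP}")
```

Any service reported here other than kube-system/kube-dns is the one to delete and recreate.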

Is there a way to prevent the cluster-dns IP address from being assigned to other services?
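
One possibly relevant upstream behaviour: the Kubernetes documentation on Service ClusterIP allocation describes splitting the service range into a lower "static" band and an upper band that dynamic assignment prefers, with the band size given as min(max(16, cidrSize / 16), 256). Whether this applies to v1.24.17+k3s1 is not established in this report, so treat the following as a sketch under that assumption. The function name `static_band` is hypothetical, and the sketch ignores very small ranges.

```python
import ipaddress


def static_band(cidr):
    """Return (first, last) address of the static band for a service CIDR,
    using the formula min(max(16, cidrSize / 16), 256) from the Kubernetes docs."""
    net = ipaddress.ip_network(cidr)
    offset = min(max(16, net.num_addresses // 16), 256)
    return net.network_address + 1, net.network_address + offset


first, last = static_band("10.43.0.0/24")
dns_ip = ipaddress.ip_address("10.43.0.10")

print(first, last)             # 10.43.0.1 10.43.0.16
print(first <= dns_ip <= last)  # True
```

Under that scheme 10.43.0.10 would fall inside the static band of 10.43.0.0/24, so dynamically assigned services would be steered away from it while the upper band still has free addresses.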

Steps To Reproduce: Intermittent issue -- cannot reliably reproduce.

Expected behavior: Service IP addresses do not overlap configured cluster-dns IP address.

Actual behavior: Service IP address can overlap configured cluster-dns IP address.

Additional context / logs: From k3s journald log:

Jul 30 15:45:01 k3s.host k3s[4037]: time="2024-07-30T15:45:01Z" level=error msg="Failed to process config: failed to process /var/lib/rancher/k3s/server/manifests/coredns.yaml: failed to create kube-system/kube-dns /v1, Kind=Service for  kube-system/coredns: Service \"kube-dns\" is invalid: spec.clusterIPs: Invalid value: []string{\"10.43.0.10\"}: failed to allocate IP 10.43.0.10: provided IP is already allocated"
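
For anyone scanning journald output for the same failure, the conflicting address can be pulled out of the error text with a regular expression. This is a minimal sketch; the function name `extract_conflicting_ip` is hypothetical, and the pattern is based only on the message shown in the log line above.

```python
import re

# Matches the tail of the error message from the log line above.
ALLOC_ERROR = re.compile(
    r"failed to allocate IP (\d{1,3}(?:\.\d{1,3}){3}): provided IP is already allocated"
)


def extract_conflicting_ip(line):
    """Return the already-allocated IP named in the log line, or None if absent."""
    match = ALLOC_ERROR.search(line)
    return match.group(1) if match else None


# Excerpt of the journald line quoted above.
excerpt = "failed to allocate IP 10.43.0.10: provided IP is already allocated"
print(extract_conflicting_ip(excerpt))  # 10.43.0.10
```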

github-actions[bot] commented 1 month ago

This repository uses a bot to automatically label issues which have not had any activity (commit/comment/label) for 45 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the bot can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the bot will automatically close the issue in 14 days. Thank you for your contributions.