bottlerocket-os / bottlerocket

An operating system designed for hosting containers
https://bottlerocket.dev

Setting cluster-domain has no effect #4077

Closed by kpanic9 3 months ago

kpanic9 commented 3 months ago

Image I'm using:

AMI ID: ami-00006dd3c25cb88b4
Region: ap-southeast-2
Name: bottlerocket-aws-k8s-1.29-x86_64-v1.20.0-fcf71a47

What I expected to happen:

We have an EKS cluster (version 1.29) that uses Bottlerocket worker nodes. I was trying to change the cluster domain from cluster.local to {something}.local using the Bottlerocket cluster-domain setting (https://bottlerocket.dev/en/os/1.19.x/api/settings/kubernetes/#cluster-domain). After replacing the nodes with new ones that have this setting, I expected pods created on them to have search domains in /etc/resolv.conf that match {something}.local, but the file still contains cluster.local.

What actually happened:

The cluster-domain setting has no effect. /etc/resolv.conf in a pod on a new node still shows the default domain:

search {namespace}.svc.cluster.local svc.cluster.local cluster.local ec2.internal
nameserver 172.20.0.10
options ndots:5
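
For anyone checking the same symptom, here is a minimal Python sketch of the comparison described above. It assumes the pod uses the default ClusterFirst DNS policy, under which kubelet normally writes `<namespace>.svc.<domain>`, `svc.<domain>`, and `<domain>` ahead of the host's own search domains. The function names, the `default` namespace, and `example.local` are placeholders for illustration, not values from this report.

```python
def expected_search_domains(namespace, cluster_domain, host_domains=()):
    """Search list kubelet typically writes for a ClusterFirst pod (assumption)."""
    return [
        f"{namespace}.svc.{cluster_domain}",
        f"svc.{cluster_domain}",
        cluster_domain,
        *host_domains,
    ]


def parse_search_domains(resolv_conf_text):
    """Return the domains from the last 'search' line of a resolv.conf body."""
    domains = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "search":
            domains = fields[1:]
    return domains


def uses_cluster_domain(resolv_conf_text, namespace, cluster_domain):
    """True if the search list starts with the three cluster-domain entries."""
    wanted = expected_search_domains(namespace, cluster_domain)
    return parse_search_domains(resolv_conf_text)[: len(wanted)] == wanted


# Same shape as the output above, with the placeholder namespace "default".
observed = """\
search default.svc.cluster.local svc.cluster.local cluster.local ec2.internal
nameserver 172.20.0.10
options ndots:5
"""
print(uses_cluster_domain(observed, "default", "cluster.local"))  # True
print(uses_cluster_domain(observed, "default", "example.local"))  # False
```

Running this against the contents of /etc/resolv.conf from a pod on a new node would show whether the node's kubelet picked up the custom domain.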

How to reproduce the problem:

larvacea commented 3 months ago

Thank you for the report.

I tried this on one of my running EKS worker nodes, which runs the bottlerocket-aws-k8s-1.29-x86_64-v1.20.2-536d69d0 image. When I used apiclient in the control container to set settings.kubernetes.cluster-domain, the new cluster domain appeared in /etc/resolv.conf in a busybox container, and also in the kubelet configuration file.

Applying the change this way only affects the EC2 instance where I ran apiclient, but it does demonstrate that the setting works as expected inside one instance. So the problem is not with the API server or with how the setting populates our kubelet configuration template.

Could you describe how you applied the configuration change? It seems likely that the failure is somewhere along the chain of tools between your configuration input and the worker node.

kpanic9 commented 3 months ago

Thanks for checking, larvacea. We use Karpenter to provision nodes. I tried setting the cluster domain with apiclient as you did, and it changes the setting as expected. This might be an issue with Karpenter.