kubernetes / kubernetes

Production-Grade Container Scheduling and Management
https://kubernetes.io
Apache License 2.0

All external domains resolved to localhost #127702

Closed hurdonkey closed 1 month ago

hurdonkey commented 1 month ago

What happened?

Hi friends, the pods in my cluster cannot reach any external domain, because every external domain resolves to localhost. Troubleshooting revealed that "localhost" has been added to the search list on the first line of /etc/resolv.conf in the pod. Other clusters I checked don't seem to have this "localhost" entry. My cluster version is 1.30. I noticed that the kubelet clusterDomain config item doesn't contain "localhost" either. How do I remove it, and is this a deployment error or something else? I have no idea. Thanks for your help!

What did you expect to happen?

/etc/resolv.conf

kubectl exec -it nginx-deployment-c45d79c8-8fmz2 -- cat /etc/resolv.conf
search default.svc.cluster.local svc.cluster.local cluster.local localhost
nameserver 10.96.0.10
options ndots:5

Ping results: the lookup only works if I use the fully qualified domain name (with a trailing dot).

busybox-deploy-85d854b658-k4v49:~# ping rancher.devops.zenlayer.net
PING rancher.devops.zenlayer.net (::1) 56 data bytes
64 bytes from localhost (::1): icmp_seq=1 ttl=64 time=0.096 ms
64 bytes from localhost (::1): icmp_seq=2 ttl=64 time=0.058 ms
64 bytes from localhost (::1): icmp_seq=3 ttl=64 time=0.035 ms
64 bytes from localhost (::1): icmp_seq=4 ttl=64 time=0.077 ms
^C
--- rancher.devops.zenlayer.net ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3000ms
rtt min/avg/max/mdev = 0.035/0.066/0.096/0.022 ms

busybox-deploy-85d854b658-k4v49:~# ping rancher.devops.zenlayer.net.
PING 06b0020a756244a7930f9efcb8244a76.zga.globalconnetct.com (192.169.127.103) 56(84) bytes of data.
64 bytes from 192.169.127.103: icmp_seq=1 ttl=48 time=42.7 ms
64 bytes from 192.169.127.103: icmp_seq=2 ttl=48 time=42.2 ms
64 bytes from 192.169.127.103: icmp_seq=3 ttl=48 time=42.4 ms
64 bytes from 192.169.127.103: icmp_seq=4 ttl=48 time=41.6 ms
64 bytes from 192.169.127.103: icmp_seq=5 ttl=48 time=42.3 ms
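
The difference between the two ping runs above can be illustrated with a small sketch of how a stub resolver expands a name using the `search` and `ndots` settings from the pod's /etc/resolv.conf. This is a simplified model of glibc-style behavior, not the resolver's actual code, and the function name `candidate_names` is made up for illustration. With `ndots:5`, a name with fewer than five dots is tried with each search suffix first, so `rancher.devops.zenlayer.net.localhost` is queried before the name itself. Names under `localhost` are commonly answered with a loopback address (see RFC 6761), which would explain the `::1` reply, although which component produced that answer here is an assumption.

```python
def candidate_names(name, search, ndots=5):
    """Return the names a glibc-style stub resolver would try, in order.

    Simplified model (assumption): real resolvers differ in details.
    """
    # A trailing dot marks the name as absolute: the search list is skipped.
    if name.endswith("."):
        return [name]

    expanded = [f"{name}.{domain}" for domain in search]
    absolute = name + "."

    # With at least `ndots` dots the name is tried as-is first,
    # otherwise the search suffixes are tried first.
    if name.count(".") >= ndots:
        return [absolute] + expanded
    return expanded + [absolute]


# Search list taken from the pod's /etc/resolv.conf shown earlier.
pod_search = [
    "default.svc.cluster.local",
    "svc.cluster.local",
    "cluster.local",
    "localhost",
]

for candidate in candidate_names("rancher.devops.zenlayer.net", pod_search):
    print(candidate)
```

Calling `candidate_names("rancher.devops.zenlayer.net.", pod_search)` returns only the absolute name, which matches the second ping reaching the real address.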

How can we reproduce it (as minimally and precisely as possible)?

Version 1.30, deployed with kubeadm (possibly).

Anything else we need to know?

kubelet clusterDomain configuration

kubectl get cm -n kube-system kubelet-config -o yaml
apiVersion: v1
data:
  kubelet: |
    apiVersion: kubelet.config.k8s.io/v1beta1
    authentication:
      anonymous:
        enabled: false
      webhook:
        cacheTTL: 0s
        enabled: true
      x509:
        clientCAFile: /etc/kubernetes/pki/ca.crt
    authorization:
      mode: Webhook
      webhook:
        cacheAuthorizedTTL: 0s
        cacheUnauthorizedTTL: 0s
    cgroupDriver: systemd
    clusterDNS:
    - 10.96.0.10
    clusterDomain: cluster.local
    ...

Kubernetes version

```console
$ kubectl version
Client Version: v1.30.0
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.30.0
```

Cloud provider

KVM hosted

OS version

```console
# On Linux:
$ cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
$ uname -a
Linux k8s-cluster1-master1 3.10.0-1160.45.1.el7.x86_64 #1 SMP Wed Oct 13 17:20:51 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
```

Install tools

Container runtime (CRI) and version (if applicable)

Containerd

Related plugins (CNI, CSI, ...) and versions (if applicable)

flannel

k8s-ci-robot commented 1 month ago

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.

hurdonkey commented 1 month ago

/sig network

hurdonkey commented 1 month ago

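I found the cause: /etc/resolv.conf on my node contained "search localhost", and when the kubelet created the pod it added "localhost" to the pod's search list, which caused the problem above.

The code is here: https://github.com/kubernetes/kubernetes/blob/93f82d25a5b10577116e4a72f77fb5e635e65490/pkg/kubelet/network/dns/dns.go#L173C46-L173C56

After I deleted "localhost" from /etc/resolv.conf on the node and recreated the pod, everything worked.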
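
For readers who want to see how a host search entry ends up in the pod, here is a minimal Python sketch of the merge, assuming the kubelet's default ClusterFirst behavior of placing the cluster search domains first and appending the node's search domains with duplicates removed. It is an illustration only, not the kubelet's Go code: the function names are made up, and the nameserver address in the sample node file is a placeholder because the issue does not show the node's actual /etc/resolv.conf.

```python
def host_search_domains(resolv_conf_text):
    """Extract the search domains from resolv.conf-style text."""
    searches = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "search":
            # If several search lines exist, the last one wins.
            searches = fields[1:]
    return searches


def pod_search_domains(namespace, cluster_domain, host_searches):
    """Cluster domains first, then the node's domains, without duplicates."""
    if not cluster_domain:
        return list(host_searches)
    cluster = [
        f"{namespace}.svc.{cluster_domain}",
        f"svc.{cluster_domain}",
        cluster_domain,
    ]
    merged = []
    for domain in cluster + list(host_searches):
        if domain not in merged:
            merged.append(domain)
    return merged


# Hypothetical node file: only "search localhost" comes from the issue;
# the nameserver address is a documentation placeholder.
node_resolv_conf = "search localhost\nnameserver 192.0.2.53\n"

searches = pod_search_domains(
    "default", "cluster.local", host_search_domains(node_resolv_conf)
)
print("search " + " ".join(searches))
```

Under these assumptions the printed line matches the `search` line reported from the pod, and removing `search localhost` from the node file leaves only the three cluster domains.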