kubernetes / kops

Kubernetes Operations (kOps) - Production Grade k8s Installation, Upgrades and Management
https://kops.sigs.k8s.io/
Apache License 2.0

Node-local-dns doesn't work with cilium CNI on kops 1.29.0 #16597

Open · nikita-nazemtsev opened this issue 5 months ago

nikita-nazemtsev commented 5 months ago

/kind bug

1. What kops version are you running? The command kops version will display this information. Client version: 1.29.0 (git-v1.29.0)

2. What Kubernetes version are you running? kubectl version will print the version if a cluster is running or provide the Kubernetes version specified as a kops flag. 1.28.7

3. What cloud provider are you using? AWS

4. What commands did you run? What is the simplest way to reproduce this issue? Update Kops from 1.28.4 to 1.29.0, or create a new cluster using Kops 1.29.0 with Node Local DNS and Cilium CNI.

5. What happened after the commands executed? Pods on updated nodes cannot access the node-local-dns pods.

6. What did you expect to happen? Pods can access node-local-dns pods.

7. Please provide your cluster manifest. Execute kops get --name my.example.com -o yaml to display your cluster manifest. You may want to remove your cluster name and other sensitive information.

apiVersion: kops.k8s.io/v1alpha2
kind: Cluster
metadata:
  creationTimestamp: "2024-05-31T07:47:47Z"
  name: k8s.tmp-test.example
spec:
  additionalSans:
  - api-internal.k8s.tmp-test.example
  - api.internal.k8s.tmp-test.example
  api:
    loadBalancer:
      type: Internal
      useForInternalApi: true
  authentication: {}
  authorization:
    rbac: {}
  certManager:
    enabled: true
    managed: false
  cloudConfig:
    awsEBSCSIDriver:
      enabled: true
  cloudProvider: aws
  clusterAutoscaler:
    enabled: true
  configBase: s3://example/k8s.tmp-test.example
  containerd:
    registryMirrors:
      '*':
      - https://nexus-proxy.example.io
      docker.io:
      - https://nexus-proxy.example.io
      k8s.gcr.io:
      - https://nexus-proxy.example.io
      public.ecr.aws:
      - https://nexus-proxy.example.io
      quay.io:
      - https://nexus-proxy.example.io
      registry.example.io:
      - https://registry.example.io
  etcdClusters:
  - cpuRequest: 200m
    etcdMembers:
    - instanceGroup: master-1a
      name: a
    - instanceGroup: master-1b
      name: b
    - instanceGroup: master-1c
      name: c
    manager:
      backupRetentionDays: 90
      env:
      - name: ETCD_LISTEN_METRICS_URLS
        value: http://0.0.0.0:2379
      - name: ETCD_METRICS
        value: basic
      - name: ETCD_MANAGER_HOURLY_BACKUPS_RETENTION
        value: 1d
      - name: ETCD_MANAGER_DAILY_BACKUPS_RETENTION
        value: 1d
      - name: ETCD_MAX_REQUEST_BYTES
        value: "1572864"
    memoryRequest: 100Mi
    name: main
  - cpuRequest: 100m
    etcdMembers:
    - instanceGroup: master-1a
      name: a
    - instanceGroup: master-1b
      name: b
    - instanceGroup: master-1c
      name: c
    manager:
      backupRetentionDays: 90
      env:
      - name: ETCD_MANAGER_HOURLY_BACKUPS_RETENTION
        value: 1d
      - name: ETCD_MANAGER_DAILY_BACKUPS_RETENTION
        value: 1d
      - name: ETCD_MAX_REQUEST_BYTES
        value: "1572864"
    memoryRequest: 100Mi
    name: events
  fileAssets:
  - content: |
      apiVersion: audit.k8s.io/v1
      kind: Policy
      rules:
      - level: RequestResponse
        userGroups:
        - "/devops"
        - "/developers"
        - "/teamleads"
        - "/k8s-full"
        - "/sre"
        - "/support"
        - "/qa"
        - "system:serviceaccounts"
    name: audit-policy-config
    path: /etc/kubernetes/audit/policy-config.yaml
    roles:
    - ControlPlane
  iam:
    legacy: false
  kubeAPIServer:
    auditLogMaxAge: 10
    auditLogMaxBackups: 1
    auditLogMaxSize: 100
    auditLogPath: /var/log/kube-apiserver-audit.log
    auditPolicyFile: /etc/kubernetes/audit/policy-config.yaml
    oidcClientID: kubernetes
    oidcGroupsClaim: groups
    oidcIssuerURL: https://sso.example.io/auth/realms/example
    serviceAccountIssuer: https://api-internal.k8s.tmp-test.example
    serviceAccountJWKSURI: https://api-internal.k8s.tmp-test.example/openid/v1/jwks
  kubeDNS:
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: kops.k8s.io/instancegroup
              operator: In
              values:
              - infra-nodes
    nodeLocalDNS:
      cpuRequest: 25m
      enabled: true
      memoryRequest: 5Mi
    provider: CoreDNS
    tolerations:
    - effect: NoSchedule
      key: dedicated/infra
      operator: Exists
  kubeProxy:
    enabled: false
  kubelet:
    anonymousAuth: false
    authenticationTokenWebhook: true
    authorizationMode: Webhook
    evictionHard: memory.available<7%,nodefs.available<3%,nodefs.inodesFree<5%,imagefs.available<10%,imagefs.inodesFree<5%
    evictionMaxPodGracePeriod: 30
    evictionSoft: memory.available<12%
    evictionSoftGracePeriod: memory.available=200s
  kubernetesApiAccess:
  - 10.170.0.0/16
  kubernetesVersion: 1.28.7
  masterPublicName: api.k8s.tmp-test.example
  metricsServer:
    enabled: false
    insecure: true
  networkCIDR: 10.170.0.0/16
  networkID: vpc-xxxx
  networking:
    cilium:
      enableNodePort: true
      enablePrometheusMetrics: true
      ipam: eni
  nodeProblemDetector:
    enabled: true
  nodeTerminationHandler:
    enableSQSTerminationDraining: false
    enabled: true
  nonMasqueradeCIDR: 100.64.0.0/10
  sshAccess:
  - 10.170.0.0/16
  subnets:
  - cidr: 10.170.140.0/24
    name: kops-k8s-1a
    type: Private
    zone: eu-central-1a
  - cidr: 10.170.142.0/24
    name: kops-k8s-eni-1a
    type: Private
    zone: eu-central-1a
  - cidr: 10.170.141.0/24
    name: kops-k8s-utility-1a
    type: Utility
    zone: eu-central-1a
  - cidr: 10.170.143.0/24
    name: kops-k8s-1b
    type: Private
    zone: eu-central-1b
  - cidr: 10.170.145.0/24
    name: kops-k8s-eni-1b
    type: Private
    zone: eu-central-1b
  - cidr: 10.170.144.0/24
    name: kops-k8s-utility-1b
    type: Utility
    zone: eu-central-1b
  - cidr: 10.170.146.0/24
    name: kops-k8s-1c
    type: Private
    zone: eu-central-1c
  - cidr: 10.170.148.0/24
    name: kops-k8s-eni-1c
    type: Private
    zone: eu-central-1c
  - cidr: 10.170.147.0/24
    name: kops-k8s-utility-1c
    type: Utility
    zone: eu-central-1c
  topology:
    dns:
      type: Private

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2024-05-31T07:47:48Z"
  labels:
    kops.k8s.io/cluster: k8s.tmp-test.example
  name: graviton-nodes
spec:
  autoscale: false
  image: ami-0192de4261c8ff06a
  machineType: t4g.small
  maxSize: 1
  minSize: 1
  mixedInstancesPolicy:
    instances:
    - t4g.small
    onDemandAboveBase: 0
    onDemandBase: 0
    spotAllocationStrategy: lowest-price
    spotInstancePools: 3
  nodeLabels:
    kops.k8s.io/instancegroup: graviton-nodes
  role: Node
  rootVolumeSize: 25
  subnets:
  - kops-k8s-1a
  - kops-k8s-eni-1a
  sysctlParameters:
  - net.netfilter.nf_conntrack_max = 1048576
  - net.core.netdev_max_backlog = 30000
  - net.core.rmem_max = 134217728
  - net.core.wmem_max = 134217728
  - net.ipv4.tcp_wmem = 4096 87380 67108864
  - net.ipv4.tcp_rmem = 4096 87380 67108864
  - net.ipv4.tcp_mem = 187143 249527 1874286
  - net.ipv4.tcp_max_syn_backlog = 8192
  - net.ipv4.ip_local_port_range = 10240 65535

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2024-05-31T07:47:48Z"
  labels:
    kops.k8s.io/cluster: k8s.tmp-test.example
  name: infra-nodes
spec:
  autoscale: false
  image: ami-035f7f826413ac489
  machineType: t3.small
  maxSize: 1
  minSize: 1
  mixedInstancesPolicy:
    instances:
    - t3.small
    onDemandAboveBase: 0
    onDemandBase: 0
    spotAllocationStrategy: lowest-price
    spotInstancePools: 3
  nodeLabels:
    kops.k8s.io/instancegroup: infra-nodes
  role: Node
  rootVolumeSize: 25
  subnets:
  - kops-k8s-1a
  - kops-k8s-eni-1a
  sysctlParameters:
  - net.netfilter.nf_conntrack_max = 1048576
  - net.core.netdev_max_backlog = 30000
  - net.core.rmem_max = 134217728
  - net.core.wmem_max = 134217728
  - net.ipv4.tcp_wmem = 4096 12582912 16777216
  - net.ipv4.tcp_rmem = 4096 12582912 16777216
  - net.ipv4.tcp_mem = 187143 249527 1874286
  - net.ipv4.tcp_max_syn_backlog = 8192
  - net.ipv4.ip_local_port_range = 10240 65535

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2024-05-31T07:47:47Z"
  labels:
    kops.k8s.io/cluster: k8s.tmp-test.example
  name: master-1a
spec:
  image: ami-035f7f826413ac489
  machineType: t3.medium
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: master-1a
  role: Master
  rootVolumeSize: 25
  subnets:
  - kops-k8s-1a
  - kops-k8s-eni-1a

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2024-05-31T07:47:47Z"
  labels:
    kops.k8s.io/cluster: k8s.tmp-test.example
  name: master-1b
spec:
  image: ami-035f7f826413ac489
  machineType: t3.medium
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: master-1b
  role: Master
  rootVolumeSize: 25
  subnets:
  - kops-k8s-1b
  - kops-k8s-eni-1b

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2024-05-31T07:47:48Z"
  labels:
    kops.k8s.io/cluster: k8s.tmp-test.example
  name: master-1c
spec:
  image: ami-035f7f826413ac489
  machineType: t3.medium
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: master-1c
  role: Master
  rootVolumeSize: 25
  subnets:
  - kops-k8s-1c
  - kops-k8s-eni-1c

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2024-05-31T07:47:48Z"
  labels:
    kops.k8s.io/cluster: k8s.tmp-test.example
  name: nodes
spec:
  image: ami-035f7f826413ac489
  machineType: t3.small
  maxSize: 1
  minSize: 1
  mixedInstancesPolicy:
    instances:
    - t3.small
    onDemandAboveBase: 0
    onDemandBase: 0
    spotAllocationStrategy: lowest-price
    spotInstancePools: 3
  nodeLabels:
    kops.k8s.io/instancegroup: nodes
  role: Node
  rootVolumeSize: 25
  subnets:
  - kops-k8s-1a
  - kops-k8s-eni-1a
  sysctlParameters:
  - net.netfilter.nf_conntrack_max = 1048576
  - net.core.netdev_max_backlog = 30000
  - net.core.rmem_max = 134217728
  - net.core.wmem_max = 134217728
  - net.ipv4.tcp_wmem = 4096 87380 67108864
  - net.ipv4.tcp_rmem = 4096 87380 67108864
  - net.ipv4.tcp_mem = 187143 249527 1874286
  - net.ipv4.tcp_max_syn_backlog = 8192
  - net.ipv4.ip_local_port_range = 10240 65535

8. Please run the commands with most verbose logging by adding the -v 10 flag. Paste the logs into this report, or in a gist and provide the gist link here.

9. Anything else we need to know?

We found a workaround that fixes the issue on a single node. We noticed that the nodelocaldns interface is in a DOWN state on the affected nodes (although the same can be observed on older kops versions where node-local-dns works fine). After executing `ip link set dev nodelocaldns up`, the interface comes up, and the cilium-agent logs on that node show:

time="2024-05-31T09:23:08Z" level=info msg="Node addresses updated" device=nodelocaldns node-addresses="169.254.20.10 (nodelocaldns)" subsys=node-address

After these actions, all pods on this node can access node-local-dns without any problems.
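
For reference, a minimal sketch of this per-node workaround (run on an affected node; the final lookup is only an optional sanity check and assumes dig is installed on the node):

```sh
# The dummy interface created by node-local-dns shows state DOWN on affected nodes.
ip link show dev nodelocaldns

# Bring the interface up (the workaround described above).
ip link set dev nodelocaldns up

# Optional sanity check: query the node-local-dns link-local address directly.
dig @169.254.20.10 kubernetes.default.svc.cluster.local +short
```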

DmytroKozlovskyi commented 4 months ago

Hi, we've got the same issue while upgrading Kops from 1.28 to 1.29. This is quite a critical bug/regression in Kops 1.29 that blocks us from upgrading. Are there any known workarounds?

rifelpet commented 4 months ago

/reopen

I wasn't able to repro the issue, but I did upgrade Cilium to the latest 1.15 patch version. If you're able to build kops from source, can you build the kops CLI from this branch, run `kops update cluster --yes`, and see if the issue is fixed?

k8s-ci-robot commented 4 months ago

@rifelpet: Reopened this issue.

In response to [this](https://github.com/kubernetes/kops/issues/16597#issuecomment-2184088724):

> /reopen
> I wasn't able to repro the issue but I did upgrade Cilium to the latest 1.15 patch version. If you're able to build kops from source, can you [build the kops CLI](https://kops.sigs.k8s.io/contributing/building/) from [this branch](https://github.com/kubernetes/kops/pull/16628#issuecomment-2184081102), run `kops update cluster --yes` and see if the issue is fixed?

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.

nikita-nazemtsev commented 4 months ago

I recreated the cluster using kops built from that branch, but it didn't solve the issue.

kruczjak commented 4 months ago

I'm not sure whether it's connected or not, but besides nodelocaldns not working, I also have an experimental IPv6-only cluster with Cilium. I tried upgrading it from kops v1.28 to v1.29, and the Cilium endpoints became unreachable on the nodes.

I looked at what changed in the Cilium setup and found that `hostNetwork: true` was added to both the cilium-operator and the cilium DaemonSet. I suspect this is somehow connected to both issues, but I couldn't pin down the exact cause.
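
To confirm whether this is what changed between kops versions, the pod templates can be compared on a 1.28-built and a 1.29-built cluster with something like the sketch below (it assumes the default kops object names in kube-system; an empty result means hostNetwork is unset):

```sh
# Print the hostNetwork setting of the Cilium and node-local-dns pod templates.
kubectl -n kube-system get daemonset cilium \
  -o jsonpath='{.spec.template.spec.hostNetwork}{"\n"}'
kubectl -n kube-system get deployment cilium-operator \
  -o jsonpath='{.spec.template.spec.hostNetwork}{"\n"}'
kubectl -n kube-system get daemonset node-local-dns \
  -o jsonpath='{.spec.template.spec.hostNetwork}{"\n"}'
```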

math3vz commented 4 months ago

Is the kube-dns service created in kube-system?

DmytroKozlovskyi commented 4 months ago

Hi, yes, the kube-dns service is created in kube-system. There is also a Cilium doc on how to configure node-local-dns with Cilium: https://docs.cilium.io/en/v1.10/gettingstarted/local-redirect-policy/#node-local-dns-cache. One interesting part is that node-local-dns must run as a regular pod with `hostNetwork: false`, which is not the case in the current Kops deployment. A CiliumLocalRedirectPolicy must also be added. I took this from this issue: https://github.com/cilium/cilium/issues/16906
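
For illustration, the node-local-dns example in that Cilium doc looks roughly like the sketch below: a CiliumLocalRedirectPolicy that redirects traffic addressed to the kube-dns service to the node-local-dns pod on the same node. This is not something kops currently renders; the label selector and port names would have to match the actual node-local-dns DaemonSet, and Cilium's Local Redirect Policy feature has to be enabled:

```yaml
apiVersion: cilium.io/v2
kind: CiliumLocalRedirectPolicy
metadata:
  name: nodelocaldns
  namespace: kube-system
spec:
  redirectFrontend:
    serviceMatcher:
      # Traffic destined for the kube-dns service is redirected...
      serviceName: kube-dns
      namespace: kube-system
  redirectBackend:
    # ...to the node-local-dns pod running on the same node.
    localEndpointSelector:
      matchLabels:
        k8s-app: node-local-dns
    toPorts:
      - port: "53"
        name: dns
        protocol: UDP
      - port: "53"
        name: dns-tcp
        protocol: TCP
```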

k8s-triage-robot commented 1 month ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 2 days ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten