kubernetes-sigs / sig-windows-tools

Repository for tools and artifacts related to the sig-windows charter in Kubernetes. Scripts to assist kubeadm and wincat and flannel will be hosted here.
Apache License 2.0

kube-flannel-ds-windows-amd64 crashloop after pod restart #344

Closed kzombro-pfl closed 1 year ago

kzombro-pfl commented 1 year ago

Describe the bug

After restarting a pod in the kube-flannel-ds-windows-amd64 DaemonSet, the pod crashloops with the following output:

FATA[2023-08-31T09:03:28-12:00] rpc error: code = Internal desc = could not create IP forward entry: The object already exists. 
I0831 09:03:31.629069   10044 main.go:518] Determining IP address of default interface
E0831 09:03:33.330098   10044 main.go:204] Failed to find any valid interface to use: failed to get default interface: json: cannot unmarshal array into Go value of type struct { IfIndex int "json:\"ifIndex\"" }
2023-08-31T21:03:35.592633400Z

Rebooting the Windows node appears to fix the issue. However, if the kube-flannel-ds-windows-amd64 pod restarts again, the crashloop resumes.

To Reproduce

Steps to reproduce the behavior: deploy the following manifest, then restart the kube-flannel-ds-windows-amd64 pod.

---
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-windows-cfg
  namespace: kube-system
  labels:
    tier: node
    app: flannel
data:
  run.ps1: |
    $ErrorActionPreference = "Stop";

    mkdir -force /host/etc/cni/net.d
    mkdir -force /host/etc/kube-flannel
    mkdir -force /host/opt/cni/bin
    mkdir -force /host/k/flannel
    mkdir -force /host/k/flannel/var/run/secrets/kubernetes.io/serviceaccount

    $containerRuntime = "docker"
    if (Test-Path /host/etc/cni/net.d/0-containerd-nat.json) {
      $containerRuntime = "containerd"
    }

    Write-Host "Configuring CNI for $containerRuntime"

    $serviceSubnet = yq r /etc/kubeadm-config/ClusterConfiguration networking.serviceSubnet
    $podSubnet = yq r /etc/kubeadm-config/ClusterConfiguration networking.podSubnet
    $networkJson = wins cli net get | convertfrom-json

    if ($containerRuntime -eq "docker") {
      $cniJson = get-content /etc/kube-flannel-windows/cni-conf.json | ConvertFrom-Json

      $cniJson.delegate.policies[0].Value.ExceptionList = $serviceSubnet, $podSubnet, $networkJson.SubnetCIDR
      $cniJson.delegate.policies[1].Value.DestinationPrefix = $serviceSubnet
      $cniJson.delegate.policies[2].Value.DestinationPrefix = $networkJson.AddressCIDR

      Set-Content -Path /host/etc/cni/net.d/10-flannel.conf ($cniJson | ConvertTo-Json -depth 100)
    } elseif ($containerRuntime -eq "containerd") {
      $cniJson = get-content /etc/kube-flannel-windows/cni-conf-containerd.json | ConvertFrom-Json

      $cniJson.delegate.AdditionalArgs[0].Value.Settings.Exceptions = $serviceSubnet, $podSubnet, $networkJson.SubnetCIDR
      $cniJson.delegate.AdditionalArgs[1].Value.Settings.DestinationPrefix = $serviceSubnet
      $cniJson.delegate.AdditionalArgs[2].Value.Settings.DestinationPrefix = $networkJson.AddressCIDR

      Set-Content -Path /host/etc/cni/net.d/10-flannel.conf ($cniJson | ConvertTo-Json -depth 100)
    }

    cp -force /etc/kube-flannel/net-conf.json /host/etc/kube-flannel
    cp -force -recurse /cni/* /host/opt/cni/bin
    cp -force /k/flannel/* /host/k/flannel/
    cp -force /kube-proxy/kubeconfig.conf /host/k/flannel/kubeconfig.yml
    cp -force /var/run/secrets/kubernetes.io/serviceaccount/* /host/k/flannel/var/run/secrets/kubernetes.io/serviceaccount/
    wins cli process run --path /k/flannel/setup.exe --args "--mode=l2bridge --interface=Ethernet"
    wins cli route add --addresses 169.254.169.254
    wins cli process run --path /k/flannel/flanneld.exe --args "--kube-subnet-mgr --kubeconfig-file /k/flannel/kubeconfig.yml" --envs "POD_NAME=$env:POD_NAME POD_NAMESPACE=$env:POD_NAMESPACE"
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.0",
      "type": "flannel",
      "capabilities": {
        "dns": true
      },
      "delegate": {
        "type": "win-bridge",
        "hairpinMode": true,
        "isDefaultGateway": true,
        "policies": [
          {
            "Name": "EndpointPolicy",
            "Value": {
              "Type": "OutBoundNAT",
              "ExceptionList": []
            }
          },
          {
            "Name": "EndpointPolicy",
            "Value": {
              "Type": "ROUTE",
              "DestinationPrefix": "",
              "NeedEncap": true
            }
          },
          {
            "Name": "EndpointPolicy",
            "Value": {
              "Type": "ROUTE",
              "DestinationPrefix": "",
              "NeedEncap": true
            }
          }
        ]
      }
    }
  cni-conf-containerd.json: |
    {
      "cniVersion": "0.2.0",
      "name": "cbr0",
      "type": "flannel",
      "capabilities": {
        "portMappings": true,
        "dns": true
      },
      "delegate": {
        "type": "sdnbridge",
        "optionalFlags" : {
          "forceBridgeGateway" : true
        },
        "AdditionalArgs": [
          {
            "Name": "EndpointPolicy",
            "Value": {
              "Type": "OutBoundNAT",
              "Settings": {
                "Exceptions": []
              }
            }
          },
          {
            "Name": "EndpointPolicy",
            "Value": {
              "Type": "SDNROUTE",
              "Settings": {
                "DestinationPrefix": "",
                "NeedEncap": true
              }
            }
          },
          {
            "Name": "EndpointPolicy",
            "Value": {
              "Type": "SDNROUTE",
              "Settings": {
                "DestinationPrefix": "",
                "NeedEncap": true
              }
            }
          }
        ]
      }
    }
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds-windows-amd64
  labels:
    tier: node
    app: flannel
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kubernetes.io/os
                    operator: In
                    values:
                      - windows
                  - key: kubernetes.io/arch
                    operator: In
                    values:
                      - amd64
      hostNetwork: true
      serviceAccountName: flannel
      tolerations:
        - operator: Exists
          effect: NoSchedule
      containers:
        - name: kube-flannel
          image: sigwindowstools/flannel:v0.13.0-nanoserver
          command:
            - pwsh
          args:
            - -file
            - /etc/kube-flannel-windows/run.ps1
          volumeMounts:
            - name: wins
              mountPath: \\.\pipe\rancher_wins
            - name: host
              mountPath: /host
            - name: kube-proxy
              mountPath: /kube-proxy
            - name: cni
              mountPath: /etc/cni/net.d
            - name: flannel-cfg
              mountPath: /etc/kube-flannel/
            - name: flannel-windows-cfg
              mountPath: /etc/kube-flannel-windows/
            - name: kubeadm-config
              mountPath: /etc/kubeadm-config/
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.namespace
      volumes:
        - name: opt
          hostPath:
            path: /opt
        - name: host
          hostPath:
            path: /
        - name: cni
          hostPath:
            path: /etc
        - name: flannel-cfg
          configMap:
            name: kube-flannel-cfg
        - name: flannel-windows-cfg
          configMap:
            name: kube-flannel-windows-cfg
        - name: kube-proxy
          configMap:
            name: kube-proxy
        - name: kubeadm-config
          configMap:
            name: kubeadm-config
        - name: wins
          hostPath:
            path: \\.\pipe\rancher_wins
            # type: null # or ''

Expected behavior

The flannel pod should find a valid interface to use, restart, and become healthy.


Additional context

After rebooting the Windows node, the flannel pod comes up healthy. On the Windows machine with the flapping flannel pod, it is also possible to forcibly delete the cbr0 endpoint (cbr0_ep) with the command below. This appears to alleviate the issue, but it causes a network interruption we can't afford:

Get-HnsEndpoint | ? { $_.Name -eq 'cbr0_ep' } | Remove-HnsEndpoint

I'd appreciate your insights on this issue.

Kevin

kzombro-pfl commented 1 year ago

I almost feel like this could be a flannel issue...

kzombro-pfl commented 1 year ago

Fixed the issue. Flannel was struggling to find the default Ethernet interface to use.

I changed the following line in run.ps1:

FROM

    wins cli process run --path /k/flannel/flanneld.exe --args "--kube-subnet-mgr --kubeconfig-file /k/flannel/kubeconfig.yml" --envs "POD_NAME=$env:POD_NAME POD_NAMESPACE=$env:POD_NAMESPACE"

TO

    wins cli process run --path /k/flannel/flanneld.exe --args "--kube-subnet-mgr --kubeconfig-file /k/flannel/kubeconfig.yml --iface-regex=Ethernet0" --envs "POD_NAME=$env:POD_NAME POD_NAMESPACE=$env:POD_NAMESPACE"

With this change, flannel starts up and finds the right interface without a hitch.

Our Windows server has a lot of vEthernet interfaces...

Perhaps the documentation could be extended to point out that the flanneld.exe args may need the --iface-regex flag under these circumstances.
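A note for anyone picking a pattern for --iface-regex on a host with many adapters: the sketch below assumes the pattern is applied with Go's regexp package as an unanchored match against interface names (flannel is written in Go, but its exact matching logic was not checked here). The adapter names and the `matchInterfaces` helper are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchInterfaces is a hypothetical helper that returns every name in which
// the pattern matches anywhere, as regexp.MatchString does by default.
func matchInterfaces(pattern string, names []string) ([]string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	var matched []string
	for _, name := range names {
		if re.MatchString(name) {
			matched = append(matched, name)
		}
	}
	return matched, nil
}

func main() {
	// Hypothetical adapter names on a Windows node with several vEthernet adapters.
	names := []string{
		"Ethernet0",
		"vEthernet (Ethernet0)",
		"vEthernet (nat)",
		"vEthernet (cbr0_ep)",
	}

	// The unanchored pattern also matches the vEthernet adapter whose name
	// contains "Ethernet0".
	loose, _ := matchInterfaces("Ethernet0", names)
	fmt.Println(loose)

	// Anchoring both ends restricts the match to the physical adapter name.
	strict, _ := matchInterfaces("^Ethernet0$", names)
	fmt.Println(strict)
}
```

Under these assumptions, an anchored pattern is the safer choice when a vEthernet adapter name embeds the physical adapter's name. Whether that matters in practice depends on which match flannel picks first.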