kubernetes-sigs / sig-windows-tools

Repository for tools and artifacts related to the sig-windows charter in Kubernetes. Scripts to assist kubeadm and wincat and flannel will be hosted here.
Apache License 2.0

Networking issues with calico on windows nodes - no internet connectivity #378

Open Breee opened 6 days ago

Breee commented 6 days ago

Describe the bug

DNS request timed out. timeout was 2 seconds.
DNS request timed out. timeout was 2 seconds.
DNS request timed out. timeout was 2 seconds.

k exec iis-demo-795f98f84d-lmrx5 -it -- ipconfig

Windows IP Configuration

Ethernet adapter vEthernet (8cc68cfad871ed1e55d5408e88553e50ecbf1e420975154524012f33b9ecf69c_Calico):

   Connection-specific DNS Suffix  . : default.svc.cluster.local
   Link-local IPv6 Address . . . . . : fe80::854b:7b85:43c9:44a7%34
   IPv4 Address. . . . . . . . . . . : 10.42.249.79
   Subnet Mask . . . . . . . . . . . : 255.255.255.192
   Default Gateway . . . . . . . . . : 10.42.249.65


To Reproduce

Cluster API manifest for kubeadm

apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  labels:
    cluster.x-k8s.io/cluster-name: ${CI_COMMIT_REF_NAME}
    cni-windows: calico
    windows: enabled
  name: ${CI_COMMIT_REF_NAME}
  namespace: default
spec:
  clusterNetwork:
    pods:
      cidrBlocks:

apiServer:
  enabled: true


script: 
  export CALICO_VERSION="v3.28.1"
  kubectl apply --server-side --force-conflicts -f https://raw.githubusercontent.com/projectcalico/calico/${CALICO_VERSION}/manifests/operator-crds.yaml
  helm repo add projectcalico https://docs.tigera.io/calico/charts
  kubectl apply -f calico/namespace.yaml
  cat calico/endpoints.yaml  | \
    envsubst | \
    kubectl apply -f -
  cat calico/values.yaml  | \
  envsubst | \
  helm install --version ${CALICO_VERSION} --namespace tigera-operator  calico projectcalico/tigera-operator --values -
  sleep 30
  while ! kubectl get installation default; do
      echo "Waiting for installation default to exist..."
      sleep 10
  done
  while ! kubectl get ippool default-ipv4-ippool; do
      echo "Waiting for ippool default-ipv4-ippool to exist..."
      sleep 10
  done
  kubectl patch ippool default-ipv4-ippool --type='json' -p='[{"op": "replace", "path": "/spec/vxlanMode", "value": "Always"}]'
  while ! kubectl get ipamconfig default; do
      echo "Waiting for ipamconfig default to exist..."
      sleep 10
  done
  kubectl patch ipamconfig default --type merge --patch='{"spec": {"strictAffinity": true}}'
  curl -L https://raw.githubusercontent.com/kubernetes-sigs/sig-windows-tools/master/hostprocess/calico/kube-proxy/kube-proxy.yml | sed 's/KUBE_PROXY_VERSION/v1.31.1/g' | kubectl apply -f -
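
The three polling loops in the script above share one pattern and never give up if the resource is never created. A minimal sketch of a shared helper with a timeout follows. The name `wait_for` and the timeout values in the usage comments are assumptions for illustration and are not part of the original script.

```shell
#!/usr/bin/env bash
# wait_for: hypothetical helper, not part of the original script.
# Usage: wait_for <timeout_seconds> <interval_seconds> <command> [args...]
# Re-runs the command until it succeeds or the timeout elapses.
wait_for() {
  local timeout="$1" interval="$2"
  shift 2
  local elapsed=0
  until "$@" >/dev/null 2>&1; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "Timed out after ${timeout}s waiting for: $*" >&2
      return 1
    fi
    echo "Waiting for: $*"
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
}

# Intended use in place of the while loops (timeouts are assumed values):
#   wait_for 300 10 kubectl get installation default
#   wait_for 300 10 kubectl get ippool default-ipv4-ippool
#   wait_for 300 10 kubectl get ipamconfig default
```

Returning a non-zero status on timeout lets a CI job fail early instead of hanging until the runner kills it.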

full rendered installation: 

apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  annotations:
    meta.helm.sh/release-name: calico
    meta.helm.sh/release-namespace: tigera-operator
  creationTimestamp: "2024-10-09T11:56:28Z"
  finalizers:

IPAM

$ k get ipamconfig default -o yaml
apiVersion: crd.projectcalico.org/v1
kind: IPAMConfig
metadata:
  annotations:
    projectcalico.org/metadata: '{"creationTimestamp":null}'
  creationTimestamp: "2024-10-09T11:57:01Z"
  generation: 2
  name: default
  resourceVersion: "885"
  uid: d17c15b9-2059-4a2a-b8f0-87614d2570b3
spec:
  autoAllocateBlocks: true
  strictAffinity: true

pool

 k get ippools. default-ipv4-ippool  -o yaml
apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
  creationTimestamp: "2024-10-09T11:56:34Z"
  generation: 1
  labels:
    app.kubernetes.io/managed-by: tigera-operator
  name: default-ipv4-ippool
  resourceVersion: "1455"
  uid: fa602211-17ff-4a75-aca4-099d0983e81e
spec:
  allowedUses:
  - Workload
  - Tunnel
  blockSize: 26
  cidr: 10.42.128.0/17
  ipipMode: Never
  natOutgoing: true
  nodeSelector: all()
  vxlanMode: Always
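
The two dumps above confirm by eye that `strictAffinity: true` and `vxlanMode: Always` were applied. The same check could be scripted right after the patch commands, roughly as sketched below. `assert_field` is a hypothetical helper name; the only kubectl feature it relies on is `-o jsonpath=...`.

```shell
#!/usr/bin/env bash
# assert_field: hypothetical helper, not part of the original script.
# Usage: assert_field <resource> <name> <jsonpath> <expected>
# Fails if the field read back from the cluster differs from <expected>.
assert_field() {
  local actual
  actual=$(kubectl get "$1" "$2" -o "jsonpath=$3") || return 1
  if [ "$actual" != "$4" ]; then
    echo "$1/$2: $3 is '$actual', expected '$4'" >&2
    return 1
  fi
}

# Intended use after the two patch commands:
#   assert_field ippool default-ipv4-ippool '{.spec.vxlanMode}' Always
#   assert_field ipamconfig default '{.spec.strictAffinity}' true
```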

Expected behavior

Pods on Windows nodes have internet connectivity and working DNS.

Kubernetes (please complete the following information):

Additional context

calico-windows-node: attached as log file caliconode.log

annotations:

$ k get nodes -o yaml | grep projectcalico.org
      projectcalico.org/IPv4Address: 10.13.18.42/24
      projectcalico.org/IPv4VXLANTunnelAddr: 10.42.183.128
      projectcalico.org/IPv4Address: 10.13.18.64/24
      projectcalico.org/IPv4VXLANTunnelAddr: 10.42.170.192
      projectcalico.org/IPv4Address: 10.13.18.9/24
      projectcalico.org/IPv4VXLANTunnelAddr: 10.42.254.64
      projectcalico.org/IPv4Address: 10.13.18.8/24
      projectcalico.org/IPv4VXLANTunnelAddr: 10.42.237.128
      projectcalico.org/IPv4Address: 10.12.12.157/16
      projectcalico.org/IPv4VXLANTunnelAddr: 10.42.249.65
      projectcalico.org/VXLANTunnelMACAddr: 00:15:5d:6c:b8:ef
      projectcalico.org/IPv4Address: 10.12.12.35/16
      projectcalico.org/IPv4VXLANTunnelAddr: 10.42.215.129
      projectcalico.org/VXLANTunnelMACAddr: 00:15:5d:16:e4:e5

nodes

 k get nodes -o wide
NAME                      STATUS   ROLES           AGE   VERSION   INTERNAL-IP    EXTERNAL-IP    OS-IMAGE                       KERNEL-VERSION      CONTAINER-RUNTIME
clippy-md-0-2kt4j-4wthh   Ready    <none>          43h   v1.31.1   10.13.18.42    10.13.18.42    Ubuntu 20.04.6 LTS             5.4.0-195-generic   containerd://1.7.22
clippy-md-0-2kt4j-9l6xw   Ready    <none>          43h   v1.31.1   10.13.18.64    10.13.18.64    Ubuntu 20.04.6 LTS             5.4.0-195-generic   containerd://1.7.22
clippy-md-0-2kt4j-ffmgj   Ready    <none>          43h   v1.31.1   10.13.18.9     10.13.18.9     Ubuntu 20.04.6 LTS             5.4.0-195-generic   containerd://1.7.22
clippy-nn78j              Ready    control-plane   43h   v1.31.1   10.13.18.8     10.13.18.8     Ubuntu 20.04.6 LTS             5.4.0-195-generic   containerd://1.7.22
cw2-ptmtr-nwbvs           Ready    <none>          25h   v1.31.1   10.12.12.157   10.12.12.157   Windows Server 2022 Standard   10.0.20348.2700     containerd://1.7.22
win-w85jm-ldgdk           Ready    <none>          63m   v1.31.1   10.12.12.35    10.12.12.35    Windows Server 2022 Standard   10.0.20348.2700     containerd://1.7.22

$ Get-HnsNetwork
ActivityId             : 9F7B4A29-2870-42B5-B73B-10FF10F5ADB5
AdditionalParams       :
CurrentEndpointCount   : 0
DNSServerCompartment   : 3
DrMacAddress           : 00-15-5D-6C-B8-EF
Extensions             : {@{Id=E7C3B2F0-F3C5-48DF-AF2B-10FED6D72E7A; IsEnabled=False;
                         Name=Microsoft Windows Filtering Platform},
                         @{Id=F74F241B-440F-4433-BB28-00F89EAD20D8; IsEnabled=True;
                         Name=Microsoft Azure VFP Switch Extension},
                         @{Id=430BDADD-BAB0-41AB-A369-94B67FA5BE0A; IsEnabled=True;
                         Name=Microsoft NDIS Capture}}
Flags                  : 0
Health                 : @{LastErrorCode=0; LastUpdateTime=133731058207475297}
ID                     : 7DA4BC16-24A3-4A60-92FB-EAF3196E1FA0
IPv6                   : False
LayeredOn              : 8FA2C2D1-16E0-4693-9DCB-10B8459164D1
MacPools               : {@{EndMacAddress=00-15-5D-C8-FF-FF;
                         StartMacAddress=00-15-5D-C8-F0-00}}
ManagementIP           : 10.12.12.157
MaxConcurrentEndpoints : 0
Name                   : External
Policies               : {}
State                  : 1
Subnets                : {@{AdditionalParams=; AddressPrefix=192.168.255.0/30; Flags=0;
                         GatewayAddress=192.168.255.1; Health=;
                         ID=CA943AB7-B803-4CF8-B107-85B8FEAF949C;
                         IpSubnets=System.Object[]; ObjectType=5;
                         Policies=System.Object[]; State=0}}
TotalEndpoints         : 0
Type                   : Overlay
Version                : 55834574851
Resources              : @{AdditionalParams=; AllocationOrder=1;
                         Allocators=System.Object[]; CompartmentOperationTime=0; Flags=0;
                         Health=; ID=9F7B4A29-2870-42B5-B73B-10FF10F5ADB5;
                         PortOperationTime=0; State=1; SwitchOperationTime=0;
                         VfpOperationTime=0; parentId=D26B2287-32EE-41BD-A150-EDA2DBB20A30}

ActivityId             : 8C92A66D-7817-40AE-AFE6-0BB5824D54D9
AdditionalParams       :
CurrentEndpointCount   : 0
DNSServerCompartment   : 4
DrMacAddress           : 00-15-5D-6C-B8-EF
Extensions             : {@{Id=E7C3B2F0-F3C5-48DF-AF2B-10FED6D72E7A; IsEnabled=False;
                         Name=Microsoft Windows Filtering Platform},
                         @{Id=F74F241B-440F-4433-BB28-00F89EAD20D8; IsEnabled=True;
                         Name=Microsoft Azure VFP Switch Extension},
                         @{Id=430BDADD-BAB0-41AB-A369-94B67FA5BE0A; IsEnabled=True;
                         Name=Microsoft NDIS Capture}}
Flags                  : 0
Health                 : @{LastErrorCode=0; LastUpdateTime=133731058414671503}
ID                     : D1DBE980-BE3E-4049-AA5D-93A4EBEF45B2
IPv6                   : False
LayeredOn              : 8FA2C2D1-16E0-4693-9DCB-10B8459164D1
MacPools               : {@{EndMacAddress=00-15-5D-55-1F-FF;
                         StartMacAddress=00-15-5D-55-10-00}}
ManagementIP           : 10.12.12.157
MaxConcurrentEndpoints : 1
Name                   : Calico
Policies               : {@{DestinationPrefix=10.42.183.128/26;
                         DistributedRouterMacAddress=66-ef-b3-b4-4c-c8; IsolationId=4096;
                         ProviderAddress=10.13.18.42; Type=RemoteSubnetRoute},
                         @{DestinationPrefix=10.42.237.128/26;
                         DistributedRouterMacAddress=66-52-57-21-f1-b0; IsolationId=4096;
                         ProviderAddress=10.13.18.8; Type=RemoteSubnetRoute},
                         @{DestinationPrefix=10.42.170.192/26;
                         DistributedRouterMacAddress=66-84-f6-b1-67-9c; IsolationId=4096;
                         ProviderAddress=10.13.18.64; Type=RemoteSubnetRoute},
                         @{DestinationPrefix=10.42.215.128/26;
                         DistributedRouterMacAddress=00-15-5d-16-e4-e5; IsolationId=4096;
                         ProviderAddress=10.12.12.35; Type=RemoteSubnetRoute}...}
State                  : 1
Subnets                : {@{AdditionalParams=; AddressPrefix=10.42.249.64/26; Flags=0;
                         GatewayAddress=10.42.249.65; Health=;
                         ID=019DD141-81FA-4F45-B567-356817EA46DC;
                         IpSubnets=System.Object[]; ObjectType=5;
                         Policies=System.Object[]; State=0}}
TotalEndpoints         : 2
Type                   : Overlay
Version                : 55834574851
Resources              : @{AdditionalParams=; AllocationOrder=1;
                         Allocators=System.Object[]; CompartmentOperationTime=0; Flags=0;
                         Health=; ID=8C92A66D-7817-40AE-AFE6-0BB5824D54D9;
                         PortOperationTime=0; State=1; SwitchOperationTime=0;
                         VfpOperationTime=0; parentId=D26B2287-32EE-41BD-A150-EDA2DBB20A30}
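
As a sanity check on the addressing reported above, the pod address from `ipconfig` (10.42.249.79) can be tested against the `Calico` HNS subnet (10.42.249.64/26) and the IP pool (10.42.128.0/17). The sketch below does the mask arithmetic in plain shell; the function names `ip_to_int` and `in_cidr` are made up for this example. It only rules out an address outside the node's block and does not locate the fault.

```shell
#!/usr/bin/env bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local IFS=.
  set -- $1
  echo "$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))"
}

# in_cidr <ip> <cidr>: succeed if <ip> falls inside <cidr>.
in_cidr() {
  local ip net prefix mask
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  prefix="${2#*/}"
  mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
  [ $(( ip & mask )) -eq $(( net & mask )) ]
}

# Values copied from the ipconfig, Get-HnsNetwork and IPPool output.
pod_ip=10.42.249.79
for cidr in 10.42.249.64/26 10.42.128.0/17; do
  if in_cidr "$pod_ip" "$cidr"; then
    echo "$pod_ip is inside $cidr"
  else
    echo "$pod_ip is NOT inside $cidr"
  fi
done
```

Both checks pass for the values in this report, so the pod address and its gateway (10.42.249.65) are consistent with the block assigned to the node.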

Breee commented 3 days ago

A workaround is described here: https://github.com/microsoft/Windows-Containers/issues/516

We should add a hint about it to the guide.