kubernetes-sigs / sig-windows-tools

Repository for tools and artifacts related to the sig-windows charter in Kubernetes. Scripts to assist kubeadm and wincat and flannel will be hosted here.
Apache License 2.0

Added a Windows node to the cluster, but it reports as not ready. #345

Closed aquynh1682 closed 10 months ago

aquynh1682 commented 10 months ago

Describe the bug I followed the instructions in these docs, but when I finished and ran kubectl get node, it reported that the Windows node is not ready.

After that, I checked kubectl get pod -A, and two pods are showing errors: kube-flannel-ds-windows-amd64 and kube-proxy-windows. Initially, the kube-flannel-ds-windows-amd64 pod was unable to pull the image sigwindowstools/flannel:v0.21.5-hostprocess, so I searched for a registry containing that image and found sigwindowstools/flannel:v0.14.0-hostprocess instead. After that, both kube-proxy and kube-flannel logged the errors shown below. Please help me. Thanks very much.

To Reproduce First, I use the command kubectl get node.

kubectl get node -o wide
NAME              STATUS     ROLES                  AGE    VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                         KERNEL-VERSION           CONTAINER-RUNTIME
k8s-master-01     Ready      control-plane,master   4h2m   v1.23.6   172.16.68.174   <none>        CentOS Linux 7 (Core)            3.10.0-1062.el7.x86_64   containerd://1.6.24
win-9r639hv9q5r   NotReady   <none>                 21m    v1.23.6   172.16.68.170   <none>        Windows Server 2019 Datacenter   10.0.17763.3650          containerd://1.7.1

Next, I use the command kubectl get pod -A.

kubectl get pod -A
NAMESPACE      NAME                                    READY   STATUS             RESTARTS      AGE
kube-flannel   kube-flannel-ds-69f9p                   1/1     Running            0             110m
kube-flannel   kube-flannel-ds-windows-amd64-gxzxs     0/1     CrashLoopBackOff   7 (51s ago)   12m
kube-system    coredns-64897985d-92gtz                 1/1     Running            0             3h53m
kube-system    coredns-64897985d-n424m                 1/1     Running            0             3h53m
kube-system    etcd-k8s-master-01                      1/1     Running            2             3h53m
kube-system    kube-apiserver-k8s-master-01            1/1     Running            2             3h54m
kube-system    kube-controller-manager-k8s-master-01   1/1     Running            2             3h54m
kube-system    kube-proxy-tm6fk                        1/1     Running            0             3h53m
kube-system    kube-proxy-windows-7wwvr                0/1     CrashLoopBackOff   7 (66s ago)   12m
kube-system    kube-scheduler-k8s-master-01            1/1     Running            2             3h54m
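
With this many pods listed, the failing ones are easy to miss. As a minimal sketch, assuming the default column layout of kubectl get pod -A shown above (STATUS in the fourth column), a small filter can print only the pods that are neither Running nor Completed. The function name failing_pods is hypothetical and not part of kubectl.

```shell
# failing_pods (hypothetical helper): reads `kubectl get pod -A` output on
# stdin and prints "namespace/name status" for every pod whose STATUS column
# is neither Running nor Completed. The header row is skipped.
failing_pods() {
  awk 'NR > 1 && $4 != "Running" && $4 != "Completed" { print $1 "/" $2, $4 }'
}

# Against a live cluster (not run here):
#   kubectl get pod -A | failing_pods
```

A field selector on status.phase would not be a substitute here, because a pod in CrashLoopBackOff still reports the Running phase.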

This is the error when trying to pull the images.

kubectl describe pod -n kube-flannel kube-flannel-ds-windows-amd64-gxzxs
Events:
  Type     Reason     Age                 From               Message
  ----     ------     ----                ----               -------
  Normal   Scheduled  108s                default-scheduler  Successfully assigned kube-flannel/kube-flannel-ds-windows-amd64-jdv9d to win-kh70jeb5on9
  Normal   BackOff    73s (x2 over 102s)  kubelet            Error: ImagePullBackOff
  Normal   Pulling    62s (x3 over 107s)  kubelet            Pulling image "sigwindowstools/flannel:v0.21.5-hostprocess"
  Warning  Failed     9s (x3 over 103s)   kubelet            Error: ErrImagePull

After that, I changed the image tag from v0.21.5-hostprocess to v0.14.0-hostprocess.

controlPlaneEndpoint=$(kubectl get configmap -n kube-system kube-proxy -o jsonpath="{.data['kubeconfig\.conf']}" | grep server: | sed 's/.*\:\/\///g')
kubernetesServiceHost=$(echo $controlPlaneEndpoint | cut -d ":" -f 1)
kubernetesServicePort=$(echo $controlPlaneEndpoint | cut -d ":" -f 2)
curl -L https://raw.githubusercontent.com/kubernetes-sigs/sig-windows-tools/master/hostprocess/flannel/flanneld/flannel-overlay.yml | sed 's/FLANNEL_VERSION/v0.14.0/g' | sed "s/KUBERNETES_SERVICE_HOST_VALUE/$kubernetesServiceHost/g" | sed "s/KUBERNETES_SERVICE_PORT_VALUE/$kubernetesServicePort/g" | kubectl apply -f -
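
The commands above split the control plane endpoint into a host and a port and substitute both into the manifest without checking them. The following is a minimal sketch of a sanity check that could run before the substitution, assuming the endpoint has the form host:port with an IPv4 address or hostname. split_endpoint is a hypothetical helper name, not something the repository provides.

```shell
# split_endpoint (hypothetical helper): takes one "host:port" argument and
# prints "host port". Returns non-zero if the host is empty or the port is
# not an integer in the range 1..65535.
split_endpoint() {
  endpoint=$1
  host=${endpoint%:*}     # everything before the last colon
  port=${endpoint##*:}    # everything after the last colon

  # Reject an empty port or one containing non-digit characters. This also
  # catches input without any colon, where port would equal the whole string.
  case $port in
    ''|*[!0-9]*) echo "invalid port: '$port'" >&2; return 1 ;;
  esac

  # Check the length first so that very long digit strings never reach the
  # integer comparison.
  if [ "${#port}" -gt 5 ] || [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
    echo "port out of range: $port" >&2
    return 1
  fi

  [ -n "$host" ] || { echo "empty host" >&2; return 1; }
  printf '%s %s\n' "$host" "$port"
}

# Example with the control plane address used in this cluster:
#   split_endpoint 172.16.68.174:6443
```

If the check fails, the script can stop before anything is piped into kubectl apply.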

Log error of "kube-flannel-windows"

Copying SDN CNI binaries to host

    Directory: C:\opt\cni

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
d-----       10/16/2023  11:06 PM                bin                                                                   
copy flannel config

    Directory: C:\etc

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
d-----       10/16/2023  11:06 PM                kube-flannel                                                          
ls : Cannot find path 'C:\hpc\etc\kube-flannel\' because it does not exist.
At C:\hpc\flannel\start.ps1:11 char:1
+ ls $env:CONTAINER_SANDBOX_MOUNT_POINT/etc/kube-flannel/
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (C:\hpc\etc\kube-flannel\:String) [Get-ChildItem], ItemNotFoundException
    + FullyQualifiedErrorId : PathNotFound,Microsoft.PowerShell.Commands.GetChildItemCommand

Log error of "kube-proxy-windows"

Copying SDN CNI binaries to host

    Directory: C:\opt\cni

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
d-----       10/16/2023  11:06 PM                bin                                                                   
copy flannel config

    Directory: C:\etc

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
d-----       10/16/2023  11:06 PM                kube-flannel                                                          
ls : Cannot find path 'C:\hpc\etc\kube-flannel\' because it does not exist.
At C:\hpc\flannel\start.ps1:11 char:1
+ ls $env:CONTAINER_SANDBOX_MOUNT_POINT/etc/kube-flannel/
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (C:\hpc\etc\kube-flannel\:String) [Get-ChildItem], ItemNotFoundException
    + FullyQualifiedErrorId : PathNotFound,Microsoft.PowerShell.Commands.GetChildItemCommand

[root@k8s-master-01 new-windows]# kubectl logs -f -n kube-system kube-proxy-windows-7wwvr
Write files so the kubeconfig points to correct locations

    Directory: C:\var\lib

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
d-----       10/16/2023  11:03 PM                kube-proxy                                                            
Get-Content : Cannot find path 'C:\hpc\var\lib\kube-proxy\kubeconfig.conf' because it does not exist.
At C:\hpc\kube-proxy\start.ps1:55 char:3
+ ((Get-Content -path $env:CONTAINER_SANDBOX_MOUNT_POINT/var/lib/kube-p ...
+   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (C:\hpc\var\lib\...kubeconfig.conf:String) [Get-Content], ItemNotFoundEx 
   ception
    + FullyQualifiedErrorId : PathNotFound,Microsoft.PowerShell.Commands.GetContentCommand

Expected behavior flannel and kube-proxy not started

Kubernetes (please complete the following information):

Mik4sa commented 10 months ago

Please build your own image, see https://github.com/kubernetes-sigs/sig-windows-tools/issues/336#issuecomment-1633854019 You need the latest version, there is no official image yet.

aquynh1682 commented 10 months ago

Hi @Mik4sa,

I've built the images myself and replaced them, but kube-proxy still has the same issue. As for kube-flannel, it has a different error. Below is the error I encountered.

Log error of kube-flannel:

Copying SDN CNI binaries to host

    Directory: C:\opt\cni

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
d-----       10/16/2023  11:06 PM                bin                                                                   
copy flannel config

    Directory: C:\etc

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
d-----       10/17/2023  12:42 AM                kube-flannel                                                          

    Directory: C:\etc\kube-flannel

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
-a----       10/17/2023  12:41 AM            110 net-conf.json                                                         

    Directory: C:\hpc\mounts\kube-flannel

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
d-----       10/17/2023  12:53 AM                ..2023_10_17_07_53_18.4275467535                                      
d----l       10/17/2023  12:53 AM                ..data                                                                
-a---l       10/17/2023  12:53 AM              0 cni-conf.json                                                         
-a---l       10/17/2023  12:53 AM              0 net-conf.json                                                         
update cni config

    Directory: C:\etc\cni

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
d-----       10/17/2023  12:42 AM                net.d                                                                 
add route
The route addition failed: The object already exists.

envs
kube-flannel-ds-windows-amd64-b7hjz
kube-flannel
Starting flannel
I1017 00:53:26.522995     548 main.go:211] CLI flags config: {etcdEndpoints:http://127.0.0.1:4001,http://127.0.0.1:2379 etcdPrefix:/coreos.com/network etcdKeyfile: etcdCertfile: etcdCAFile: etcdUsername: etcdPassword: version:false kubeSubnetMgr:true kubeApiUrl: kubeAnnotationPrefix:flannel.alpha.coreos.com kubeConfigFile: iface:[172.16.68.170] ifaceRegex:[] ipMasq:false ifaceCanReach: subnetFile:/run/flannel/subnet.env publicIP: publicIPv6: subnetLeaseRenewMargin:60 healthzIP:0.0.0.0 healthzPort:0 iptablesResyncSeconds:5 iptablesForwardRules:true netConfPath:/etc/kube-flannel/net-conf.json setNodeNetworkUnavailable:true useMultiClusterCidr:false}
W1017 00:53:26.550040     548 client_config.go:617] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I1017 00:53:27.397331     548 kube.go:485] Starting kube subnet manager
I1017 00:53:27.402637     548 kube.go:144] Waiting 10m0s for node controller to sync
I1017 00:53:27.448965     548 kube.go:506] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [192.168.69.0/24]
I1017 00:53:28.416069     548 kube.go:151] Node controller sync successful
I1017 00:53:28.416069     548 main.go:231] Created subnet manager: Kubernetes Subnet Manager - win-9r639hv9q5r
I1017 00:53:28.416069     548 main.go:234] Installing signal handlers
I1017 00:53:28.416509     548 main.go:542] Found network config - Backend type: vxlan
I1017 00:53:28.426134     548 match.go:73] Searching for interface using 172.16.68.170
I1017 00:53:28.430973     548 match.go:259] Using interface with name Ethernet0 and address 172.16.68.170
I1017 00:53:28.430973     548 match.go:281] Defaulting external address to interface address (172.16.68.170)
I1017 00:53:28.431658     548 vxlan_windows.go:125] VXLAN config: Name=flannel.4096 MacPrefix=0E-2A VNI=4096 Port=4789 GBP=false DirectRouting=false
time="2023-10-17T00:53:28-07:00" level=info msg="HCN feature check" supportedFeatures="{Acl:{AclAddressLists:true AclNoHostRulePriority:true AclPortRanges:true AclRuleId:true} Api:{V1:true V2:true} RemoteSubnet:true HostRoute:true DSR:true Slash32EndpointPrefixes:true AclSupportForProtocol252:false SessionAffinity:false IPv6DualStack:false SetPolicy:false VxlanPort:false L4Proxy:true L4WfpProxy:false TierAcl:false NetworkACL:false NestedIpSet:false}" version="{Major:9 Minor:5}"
E1017 00:53:28.505971     548 main.go:334] Error registering network: failed to acquire lease: node "win-9r639hv9q5r" pod cidr not assigned
W1017 00:53:28.506157     548 reflector.go:347] pkg/subnet/kube/kube.go:486: watch of *v1.Node ended with: an error on the server ("unable to decode an event from the watch stream: context canceled") has prevented the request from succeeding
I1017 00:53:28.506157     548 main.go:522] Stopping shutdownHandler...
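
In klog-formatted output like the above, the first character of each line is the severity (I for info, W for warning, E for error, F for fatal). The following is a minimal sketch for pulling only the error and fatal lines out of a long log, assuming that line format. klog_errors is a hypothetical helper name.

```shell
# klog_errors (hypothetical helper): reads a log on stdin and prints only
# lines that start with a klog error or fatal header, that is "E" or "F"
# followed by a four-digit MMDD date and a space.
klog_errors() {
  grep -E '^[EF][0-9]{4} '
}

# Against a live cluster (not run here), using the pod name from this log:
#   kubectl logs -n kube-flannel kube-flannel-ds-windows-amd64-b7hjz | klog_errors
```

PowerShell errors from start.ps1 and the logrus-style time="..." lines do not use this header, so they are not matched by this filter.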

Log error of Kube-proxy:

Copying SDN CNI binaries to host

    Directory: C:\opt\cni

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
d-----       10/16/2023  11:06 PM                bin                                                                   
copy flannel config

    Directory: C:\etc

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
d-----       10/17/2023  12:42 AM                kube-flannel                                                          

    Directory: C:\etc\kube-flannel

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
-a----       10/17/2023  12:41 AM            110 net-conf.json                                                         
ls : Cannot find path 'C:\hpc\mounts\kube-flannel\' because it does not exist.
At C:\hpc\flannel\start.ps1:11 char:1
+ ls $env:CONTAINER_SANDBOX_MOUNT_POINT/mounts/kube-flannel/
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (C:\hpc\mounts\kube-flannel\:String) [Get-ChildItem], ItemNotFoundExcept 
   ion
    + FullyQualifiedErrorId : PathNotFound,Microsoft.PowerShell.Commands.GetChildItemCommand

Thank you very much.

shivangbhar commented 10 months ago

@aquynh1682 Is flannel on Windows running? What is the output of:

kubectl get pods -n kube-system

aquynh1682 commented 10 months ago

Hi @shivangbhar,

At first, Flannel on Windows was working, but later it stopped and reported this error:

I1017 00:53:28.430973     548 match.go:281] Defaulting external address to interface address (172.16.68.170)
I1017 00:53:28.431658     548 vxlan_windows.go:125] VXLAN config: Name=flannel.4096 MacPrefix=0E-2A VNI=4096 Port=4789 GBP=false DirectRouting=false
time="2023-10-17T00:53:28-07:00" level=info msg="HCN feature check" supportedFeatures="{Acl:{AclAddressLists:true AclNoHostRulePriority:true AclPortRanges:true AclRuleId:true} Api:{V1:true V2:true} RemoteSubnet:true HostRoute:true DSR:true Slash32EndpointPrefixes:true AclSupportForProtocol252:false SessionAffinity:false IPv6DualStack:false SetPolicy:false VxlanPort:false L4Proxy:true L4WfpProxy:false TierAcl:false NetworkACL:false NestedIpSet:false}" version="{Major:9 Minor:5}"
E1017 00:53:28.505971     548 main.go:334] Error registering network: failed to acquire lease: node "win-9r639hv9q5r" pod cidr not assigned
W1017 00:53:28.506157     548 reflector.go:347] pkg/subnet/kube/kube.go:486: watch of *v1.Node ended with: an error on the server ("unable to decode an event from the watch stream: context canceled") has prevented the request from succeeding
I1017 00:53:28.506157     548 main.go:522] Stopping shutdownHandler...

And this is the output of kubectl get pod -n kube-system and kubectl get pod -n kube-flannel:

NAME                                    READY   STATUS             RESTARTS        AGE
coredns-64897985d-92gtz                 1/1     Running            0               5h53m
coredns-64897985d-n424m                 1/1     Running            0               5h53m
etcd-k8s-master-01                      1/1     Running            2               5h53m
kube-apiserver-k8s-master-01            1/1     Running            2               5h53m
kube-controller-manager-k8s-master-01   1/1     Running            2               5h53m
kube-proxy-tm6fk                        1/1     Running            0               5h53m
kube-proxy-windows-f6tkf                0/1     CrashLoopBackOff   17 (101s ago)   64m
kube-scheduler-k8s-master-01            1/1     Running            2               5h53m
NAME                                  READY   STATUS             RESTARTS         AGE
kube-flannel-ds-69f9p                 1/1     Running            0                3h51m
kube-flannel-ds-windows-amd64-b7hjz   0/1     CrashLoopBackOff   15 (3m29s ago)   57m

shivangbhar commented 10 months ago

Try deploying flannel pods to kube-system namespace. @aquynh1682

aquynh1682 commented 10 months ago

Hi @shivangbhar, I followed your instructions, but the Flannel pod still reported this error.

Copying SDN CNI binaries to host

    Directory: C:\opt\cni

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
d-----       10/16/2023  11:06 PM                bin                                                                   
copy flannel config

    Directory: C:\etc

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
d-----       10/17/2023  12:42 AM                kube-flannel                                                          

    Directory: C:\etc\kube-flannel

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
-a----       10/17/2023   2:04 AM            110 net-conf.json                                                         

    Directory: C:\hpc\mounts\kube-flannel

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
d-----       10/17/2023   2:04 AM                ..2023_10_17_09_04_36.3325648970                                      
d----l       10/17/2023   2:04 AM                ..data                                                                
-a---l       10/17/2023   2:04 AM              0 cni-conf.json                                                         
-a---l       10/17/2023   2:04 AM              0 net-conf.json                                                         
update cni config

    Directory: C:\etc\cni

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
d-----       10/17/2023  12:42 AM                net.d                                                                 
add route
The route addition failed: The object already exists.

envs
kube-flannel-ds-windows-amd64-vpb4j
kube-system
Starting flannel
I1017 02:08:17.214810    1832 main.go:211] CLI flags config: {etcdEndpoints:http://127.0.0.1:4001,http://127.0.0.1:2379 etcdPrefix:/coreos.com/network etcdKeyfile: etcdCertfile: etcdCAFile: etcdUsername: etcdPassword: version:false kubeSubnetMgr:true kubeApiUrl: kubeAnnotationPrefix:flannel.alpha.coreos.com kubeConfigFile: iface:[172.16.68.170] ifaceRegex:[] ipMasq:false ifaceCanReach: subnetFile:/run/flannel/subnet.env publicIP: publicIPv6: subnetLeaseRenewMargin:60 healthzIP:0.0.0.0 healthzPort:0 iptablesResyncSeconds:5 iptablesForwardRules:true netConfPath:/etc/kube-flannel/net-conf.json setNodeNetworkUnavailable:true useMultiClusterCidr:false}
W1017 02:08:17.220952    1832 client_config.go:617] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
E1017 02:08:17.395168    1832 main.go:228] Failed to create SubnetManager: error retrieving pod spec for 'kube-system/kube-flannel-ds-windows-amd64-vpb4j': Get "https://172.16.68.174:644364436443644364436443/api/v1/namespaces/kube-system/pods/kube-flannel-ds-windows-amd64-vpb4j": dial tcp: address 644364436443644364436443: invalid port

aquynh1682 commented 10 months ago

Ah, sorry, that was my mistake.

shivangbhar commented 10 months ago

@aquynh1682 I was seeing the same issue, and I updated the containerd version to 1.7.4. It worked for me after that. But I certainly installed flannel pods in kube-system ns to make it work.

aquynh1682 commented 10 months ago

@shivangbhar, I've deployed the flannel pods to kube-system, but the error log remains the same:

[root@k8s-master-01 kube-system]# kubectl get pod -A
NAMESPACE     NAME                                    READY   STATUS             RESTARTS      AGE
kube-system   coredns-64897985d-92gtz                 1/1     Running            0             6h15m
kube-system   coredns-64897985d-n424m                 1/1     Running            0             6h15m
kube-system   etcd-k8s-master-01                      1/1     Running            2             6h15m
kube-system   kube-apiserver-k8s-master-01            1/1     Running            2             6h15m
kube-system   kube-controller-manager-k8s-master-01   1/1     Running            2             6h15m
kube-system   kube-flannel-ds-kxffg                   1/1     Running            0             6m57s
kube-system   kube-flannel-ds-windows-amd64-mlf27     0/1     CrashLoopBackOff   3 (19s ago)   96s
kube-system   kube-proxy-tm6fk                        1/1     Running            0             6h15m
kube-system   kube-proxy-windows-bzcf8                0/1     CrashLoopBackOff   5 (95s ago)   5m15s
kube-system   kube-scheduler-k8s-master-01            1/1     Running            2             6h15m
[root@k8s-master-01 kube-system]# kubectl logs -f -n kube-system kube-flannel-ds-windows-amd64-mlf27
Copying SDN CNI binaries to host

    Directory: C:\opt\cni

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
d-----       10/16/2023  11:06 PM                bin                                                                   
copy flannel config

    Directory: C:\etc

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
d-----       10/17/2023  12:42 AM                kube-flannel                                                          

    Directory: C:\etc\kube-flannel

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
-a----       10/17/2023   2:09 AM            110 net-conf.json                                                         

    Directory: C:\hpc\mounts\kube-flannel

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
d-----       10/17/2023   2:09 AM                ..2023_10_17_09_09_43.3520444326                                      
d----l       10/17/2023   2:09 AM                ..data                                                                
-a---l       10/17/2023   2:09 AM              0 cni-conf.json                                                         
-a---l       10/17/2023   2:09 AM              0 net-conf.json                                                         
update cni config

    Directory: C:\etc\cni

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
d-----       10/17/2023  12:42 AM                net.d                                                                 
add route
The route addition failed: The object already exists.

envs
kube-flannel-ds-windows-amd64-mlf27
kube-system
Starting flannel
I1017 02:12:01.110704    5776 main.go:211] CLI flags config: {etcdEndpoints:http://127.0.0.1:4001,http://127.0.0.1:2379 etcdPrefix:/coreos.com/network etcdKeyfile: etcdCertfile: etcdCAFile: etcdUsername: etcdPassword: version:false kubeSubnetMgr:true kubeApiUrl: kubeAnnotationPrefix:flannel.alpha.coreos.com kubeConfigFile: iface:[172.16.68.170] ifaceRegex:[] ipMasq:false ifaceCanReach: subnetFile:/run/flannel/subnet.env publicIP: publicIPv6: subnetLeaseRenewMargin:60 healthzIP:0.0.0.0 healthzPort:0 iptablesResyncSeconds:5 iptablesForwardRules:true netConfPath:/etc/kube-flannel/net-conf.json setNodeNetworkUnavailable:true useMultiClusterCidr:false}
W1017 02:12:01.122713    5776 client_config.go:617] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I1017 02:12:01.473406    5776 kube.go:485] Starting kube subnet manager
I1017 02:12:01.487293    5776 kube.go:144] Waiting 10m0s for node controller to sync
I1017 02:12:01.516934    5776 kube.go:506] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [192.168.69.0/24]
I1017 02:12:02.491557    5776 kube.go:151] Node controller sync successful
I1017 02:12:02.491674    5776 main.go:231] Created subnet manager: Kubernetes Subnet Manager - win-9r639hv9q5r
I1017 02:12:02.491674    5776 main.go:234] Installing signal handlers
I1017 02:12:02.491674    5776 main.go:542] Found network config - Backend type: vxlan
I1017 02:12:02.499697    5776 match.go:73] Searching for interface using 172.16.68.170
I1017 02:12:02.509297    5776 match.go:259] Using interface with name Ethernet0 and address 172.16.68.170
I1017 02:12:02.509297    5776 match.go:281] Defaulting external address to interface address (172.16.68.170)
I1017 02:12:02.509297    5776 vxlan_windows.go:125] VXLAN config: Name=flannel.4096 MacPrefix=0E-2A VNI=4096 Port=4789 GBP=false DirectRouting=false
time="2023-10-17T02:12:02-07:00" level=info msg="HCN feature check" supportedFeatures="{Acl:{AclAddressLists:true AclNoHostRulePriority:true AclPortRanges:true AclRuleId:true} Api:{V1:true V2:true} RemoteSubnet:true HostRoute:true DSR:true Slash32EndpointPrefixes:true AclSupportForProtocol252:false SessionAffinity:false IPv6DualStack:false SetPolicy:false VxlanPort:false L4Proxy:true L4WfpProxy:false TierAcl:false NetworkACL:false NestedIpSet:false}" version="{Major:9 Minor:5}"
E1017 02:12:02.534374    5776 main.go:334] Error registering network: failed to acquire lease: node "win-9r639hv9q5r" pod cidr not assigned
W1017 02:12:02.534564    5776 reflector.go:347] pkg/subnet/kube/kube.go:486: watch of *v1.Node ended with: an error on the server ("unable to decode an event from the watch stream: context canceled") has prevented the request from succeeding
I1017 02:12:02.534564    5776 main.go:522] Stopping shutdownHandler...

shivangbhar commented 10 months ago

Which containerd version are you using? @aquynh1682

aquynh1682 commented 10 months ago

I'm using containerd version 1.7.1 on the Windows node and containerd version 1.6.24 on the master node. @shivangbhar

shivangbhar commented 10 months ago

Try 1.7.4 for both. @aquynh1682
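
To see which nodes are below a suggested containerd version, the CONTAINER-RUNTIME column of kubectl get node -o wide can be compared against it. The following is a minimal sketch, assuming that column is the last one and has the form containerd://X.Y.Z, and that sort supports the -V option (as GNU coreutils does). runtimes_below is a hypothetical helper name.

```shell
# runtimes_below TARGET (hypothetical helper): reads `kubectl get node -o wide`
# output on stdin and prints "node version" for each node whose containerd
# version sorts strictly below TARGET. Assumes `sort -V` is available.
runtimes_below() {
  target=$1
  # Take the node name and the last column, stripping the containerd:// prefix.
  awk 'NR > 1 { sub(/^containerd:\/\//, "", $NF); print $1, $NF }' |
  while read -r node ver; do
    # The smaller of the two versions comes first in version-sort order.
    lowest=$(printf '%s\n%s\n' "$ver" "$target" | sort -V | head -n 1)
    if [ "$ver" != "$target" ] && [ "$lowest" = "$ver" ]; then
      printf '%s %s\n' "$node" "$ver"
    fi
  done
}

# Against a live cluster (not run here):
#   kubectl get node -o wide | runtimes_below 1.7.4
```

An empty result means every listed node is already at or above the target version.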

aquynh1682 commented 10 months ago

OK, I will try updating containerd on both nodes. @shivangbhar

aquynh1682 commented 10 months ago

Hi @shivangbhar,

I have upgraded containerd to version 1.7.4 on both the master and worker nodes, and kube-flannel is now running in the kube-system namespace. However, I am still encountering the same error as before. Below is the error log.

Node in kubernetes cluster:

NAME              STATUS     ROLES                  AGE     VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                         KERNEL-VERSION                   CONTAINER-RUNTIME
k8s-master        Ready      control-plane,master   57m     v1.23.6   172.16.68.171   <none>        Oracle Linux Server 8.8          5.15.0-106.131.4.el8uek.x86_64   containerd://1.7.4
win-c1n3jmvf42g   NotReady   <none>                 5m11s   v1.23.6   172.16.68.175   <none>        Windows Server 2019 Datacenter   10.0.17763.3650                  containerd://1.7.4

Pods in my cluster:

[root@k8s-master flannel]# kubectl get pod -A
NAMESPACE     NAME                                  READY   STATUS             RESTARTS        AGE
kube-system   coredns-64897985d-mnzkm               1/1     Running            0               64m
kube-system   coredns-64897985d-pncbq               1/1     Running            0               64m
kube-system   etcd-k8s-master                       1/1     Running            0               65m
kube-system   kube-apiserver-k8s-master             1/1     Running            0               65m
kube-system   kube-controller-manager-k8s-master    1/1     Running            0               65m
kube-system   kube-flannel-ds-hmm69                 1/1     Running            0               61m
kube-system   kube-flannel-ds-windows-amd64-jvgr5   0/1     CrashLoopBackOff   6 (2m36s ago)   9m42s
kube-system   kube-proxy-fx97w                      1/1     Running            0               64m
kube-system   kube-proxy-windows-mbcmv              0/1     Error              7 (5m14s ago)   12m
kube-system   kube-scheduler-k8s-master             1/1     Running            0               65m

Log error kube-proxy:

Copying SDN CNI binaries to host

    Directory: C:\opt\cni

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
d-----       10/17/2023  10:27 AM                bin                                                                   
copy flannel config

    Directory: C:\etc

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
d-----       10/17/2023  10:27 AM                kube-flannel                                                          

    Directory: C:\etc\kube-flannel

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
-a----       10/17/2023  10:29 AM            110 net-conf.json                                                         
ls : Cannot find path 'C:\hpc\mounts\kube-flannel\' because it does not exist.
At C:\hpc\flannel\start.ps1:11 char:1
+ ls $env:CONTAINER_SANDBOX_MOUNT_POINT/mounts/kube-flannel/
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (C:\hpc\mounts\kube-flannel\:String) [Get-ChildItem], ItemNotFoundExcept 
   ion
    + FullyQualifiedErrorId : PathNotFound,Microsoft.PowerShell.Commands.GetChildItemCommand

kube-flannel error log:

Copying SDN CNI binaries to host

    Directory: C:\opt\cni

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
d-----       10/17/2023  10:27 AM                bin                                                                   
copy flannel config

    Directory: C:\etc

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
d-----       10/17/2023  10:27 AM                kube-flannel                                                          

    Directory: C:\etc\kube-flannel

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
-a----       10/17/2023  10:29 AM            110 net-conf.json                                                         

    Directory: C:\hpc\mounts\kube-flannel

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
d-----       10/17/2023  10:40 AM                ..2023_10_17_17_40_48.1673146896                                      
d----l       10/17/2023  10:40 AM                ..data                                                                
-a---l       10/17/2023  10:40 AM              0 cni-conf.json                                                         
-a---l       10/17/2023  10:40 AM              0 net-conf.json                                                         
update cni config

    Directory: C:\etc\cni

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
d-----       10/17/2023  10:27 AM                net.d                                                                 
add route
The route addition failed: The object already exists.

envs
kube-flannel-ds-windows-amd64-s2td9
kube-system
Starting flannel
I1017 10:40:56.012084    4884 main.go:211] CLI flags config: {etcdEndpoints:http://127.0.0.1:4001,http://127.0.0.1:2379 etcdPrefix:/coreos.com/network etcdKeyfile: etcdCertfile: etcdCAFile: etcdUsername: etcdPassword: version:false kubeSubnetMgr:true kubeApiUrl: kubeAnnotationPrefix:flannel.alpha.coreos.com kubeConfigFile: iface:[172.16.68.175] ifaceRegex:[] ipMasq:false ifaceCanReach: subnetFile:/run/flannel/subnet.env publicIP: publicIPv6: subnetLeaseRenewMargin:60 healthzIP:0.0.0.0 healthzPort:0 iptablesResyncSeconds:5 iptablesForwardRules:true netConfPath:/etc/kube-flannel/net-conf.json setNodeNetworkUnavailable:true useMultiClusterCidr:false}
W1017 10:40:56.019023    4884 client_config.go:617] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I1017 10:40:56.629044    4884 kube.go:485] Starting kube subnet manager
I1017 10:40:56.636811    4884 kube.go:144] Waiting 10m0s for node controller to sync
I1017 10:40:56.670442    4884 kube.go:506] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [192.168.69.0/24]
I1017 10:40:57.638733    4884 kube.go:151] Node controller sync successful
I1017 10:40:57.638733    4884 main.go:231] Created subnet manager: Kubernetes Subnet Manager - win-c1n3jmvf42g
I1017 10:40:57.638733    4884 main.go:234] Installing signal handlers
I1017 10:40:57.638733    4884 main.go:542] Found network config - Backend type: vxlan
I1017 10:40:57.649126    4884 match.go:73] Searching for interface using 172.16.68.175
I1017 10:40:57.654111    4884 match.go:259] Using interface with name Ethernet0 and address 172.16.68.175
I1017 10:40:57.654111    4884 match.go:281] Defaulting external address to interface address (172.16.68.175)
I1017 10:40:57.660423    4884 vxlan_windows.go:125] VXLAN config: Name=flannel.4096 MacPrefix=0E-2A VNI=4096 Port=4789 GBP=false DirectRouting=false
time="2023-10-17T10:40:57-07:00" level=info msg="HCN feature check" supportedFeatures="{Acl:{AclAddressLists:true AclNoHostRulePriority:true AclPortRanges:true AclRuleId:true} Api:{V1:true V2:true} RemoteSubnet:true HostRoute:true DSR:true Slash32EndpointPrefixes:true AclSupportForProtocol252:false SessionAffinity:false IPv6DualStack:false SetPolicy:false VxlanPort:false L4Proxy:true L4WfpProxy:false TierAcl:false NetworkACL:false NestedIpSet:false}" version="{Major:9 Minor:5}"
E1017 10:40:57.685290    4884 main.go:334] Error registering network: failed to acquire lease: node "win-c1n3jmvf42g" pod cidr not assigned
I1017 10:40:57.685395    4884 main.go:522] Stopping shutdownHandler...
Mik4sa commented 10 months ago

Which exact command did you run to initialize your cluster? Did you set the --pod-network-cidr 10.244.0.0/16 parameter?

aquynh1682 commented 10 months ago

Hi @Mik4sa,

Here is the command I used to initialize the cluster:

sudo kubeadm init --pod-network-cidr=192.168.69.0/24 --upload-certs --control-plane-endpoint=172.16.68.171 --apiserver-advertise-address 172.16.68.171 --kubernetes-version 1.23.6 --v=5

Does deploying this cluster require using the IP range 10.244.0.0/16?
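
A possible explanation for the `pod cidr not assigned` error in the flannel log above, offered as a sketch rather than a confirmed diagnosis: kube-controller-manager gives each node its own slice of the cluster pod CIDR, and its `--node-cidr-mask-size` defaults to /24 for IPv4. Assuming that default was in effect here, a /24 cluster CIDR holds exactly one node subnet, which the control-plane node would already have taken. The helper name `node_subnets` is hypothetical and only illustrates the arithmetic.

```python
import ipaddress


def node_subnets(cluster_cidr: str, node_mask: int = 24):
    """List the per-node pod subnets that fit inside cluster_cidr.

    node_mask=24 mirrors the kube-controller-manager IPv4 default;
    whether this cluster used the default is an assumption.
    """
    net = ipaddress.ip_network(cluster_cidr)
    if node_mask < net.prefixlen:
        # The cluster range is smaller than a single node subnet.
        return []
    return list(net.subnets(new_prefix=node_mask))


for cidr in ("192.168.69.0/24", "10.244.0.0/16"):
    print(cidr, "->", len(node_subnets(cidr)), "node subnet(s)")
# 192.168.69.0/24 -> 1 node subnet(s)
# 10.244.0.0/16 -> 256 node subnet(s)
```

Under that assumption, any pod CIDR large enough to hold one /24 per node should work. 10.244.0.0/16 is simply the range in the default flannel manifest, and the Network value in flannel's net-conf.json has to match whichever range is passed to --pod-network-cidr.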

aquynh1682 commented 10 months ago

I have checked the cluster, and the Windows node has changed its status to Ready. However, kube-flannel and kube-proxy are still showing errors.

aquynh1682 commented 10 months ago

I just tried changing --pod-network-cidr from 192.168.69.0/24 to 10.244.0.0/16, and kube-flannel now seems to be working. However, kube-proxy is still encountering an error.

Log of kube-flannel:

Copying SDN CNI binaries to host

    Directory: C:\opt\cni

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
d-----       10/17/2023  10:27 AM                bin                                                                   
copy flannel config

    Directory: C:\etc

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
d-----       10/17/2023  10:27 AM                kube-flannel                                                          

    Directory: C:\etc\kube-flannel

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
-a----       10/17/2023   7:45 PM            108 net-conf.json                                                         

    Directory: C:\hpc\mounts\kube-flannel

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
d-----       10/17/2023   7:45 PM                ..2023_10_18_02_45_56.1244681699                                      
d----l       10/17/2023   7:45 PM                ..data                                                                
-a---l       10/17/2023   7:45 PM              0 cni-conf.json                                                         
-a---l       10/17/2023   7:45 PM              0 net-conf.json                                                         
update cni config

    Directory: C:\etc\cni

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
d-----       10/17/2023  10:27 AM                net.d                                                                 
add route
The route addition failed: The object already exists.

envs
kube-flannel-ds-windows-amd64-jtl8q
kube-flannel
Starting flannel
I1017 19:46:42.315015    6040 main.go:211] CLI flags config: {etcdEndpoints:http://127.0.0.1:4001,http://127.0.0.1:2379 etcdPrefix:/coreos.com/network etcdKeyfile: etcdCertfile: etcdCAFile: etcdUsername: etcdPassword: version:false kubeSubnetMgr:true kubeApiUrl: kubeAnnotationPrefix:flannel.alpha.coreos.com kubeConfigFile: iface:[172.16.68.175] ifaceRegex:[] ipMasq:false ifaceCanReach: subnetFile:/run/flannel/subnet.env publicIP: publicIPv6: subnetLeaseRenewMargin:60 healthzIP:0.0.0.0 healthzPort:0 iptablesResyncSeconds:5 iptablesForwardRules:true netConfPath:/etc/kube-flannel/net-conf.json setNodeNetworkUnavailable:true useMultiClusterCidr:false}
W1017 19:46:42.324377    6040 client_config.go:617] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I1017 19:46:42.708941    6040 kube.go:485] Starting kube subnet manager
I1017 19:46:42.713594    6040 kube.go:144] Waiting 10m0s for node controller to sync
I1017 19:46:42.790581    6040 kube.go:506] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.244.0.0/24]
I1017 19:46:42.790581    6040 kube.go:506] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.244.1.0/24]
I1017 19:46:43.717978    6040 kube.go:151] Node controller sync successful
I1017 19:46:43.717978    6040 main.go:231] Created subnet manager: Kubernetes Subnet Manager - win-c1n3jmvf42g
I1017 19:46:43.717978    6040 main.go:234] Installing signal handlers
I1017 19:46:43.717978    6040 main.go:542] Found network config - Backend type: vxlan
I1017 19:46:43.753824    6040 match.go:73] Searching for interface using 172.16.68.175
I1017 19:46:43.754391    6040 match.go:259] Using interface with name vEthernet (Ethernet0) and address 172.16.68.175
I1017 19:46:43.754391    6040 match.go:281] Defaulting external address to interface address (172.16.68.175)
I1017 19:46:43.754391    6040 vxlan_windows.go:125] VXLAN config: Name=flannel.4096 MacPrefix=0E-2A VNI=4096 Port=4789 GBP=false DirectRouting=false
time="2023-10-17T19:46:43-07:00" level=info msg="HCN feature check" supportedFeatures="{Acl:{AclAddressLists:true AclNoHostRulePriority:true AclPortRanges:true AclRuleId:true} Api:{V1:true V2:true} RemoteSubnet:true HostRoute:true DSR:true Slash32EndpointPrefixes:true AclSupportForProtocol252:false SessionAffinity:false IPv6DualStack:false SetPolicy:false VxlanPort:false L4Proxy:true L4WfpProxy:false TierAcl:false NetworkACL:false NestedIpSet:false}" version="{Major:9 Minor:5}"
I1017 19:46:43.788562    6040 device_windows.go:103] Found existing HostComputeNetwork flannel.4096
I1017 19:46:43.830004    6040 main.go:407] Changing default FORWARD chain policy to ACCEPT
I1017 19:46:43.834606    6040 main.go:435] Wrote subnet file to /run/flannel/subnet.env
I1017 19:46:43.834606    6040 kube.go:506] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.244.1.0/24]
I1017 19:46:43.834606    6040 main.go:439] Running backend.
I1017 19:46:43.834606    6040 vxlan_network_windows.go:62] Watching for new subnet leases
I1017 19:46:43.834606    6040 watch.go:51] Batch elem [0] is { subnet.Event{Type:0, Lease:subnet.Lease{EnableIPv4:true, EnableIPv6:false, Subnet:ip.IP4Net{IP:0xaf40000, PrefixLen:0x18}, IPv6Subnet:ip.IP6Net{IP:(*ip.IP6)(nil), PrefixLen:0x0}, Attrs:subnet.LeaseAttrs{PublicIP:0xac1044ab, PublicIPv6:(*ip.IP6)(nil), BackendType:"vxlan", BackendData:json.RawMessage{0x7b, 0x22, 0x56, 0x4e, 0x49, 0x22, 0x3a, 0x34, 0x30, 0x39, 0x36, 0x2c, 0x22, 0x56, 0x74, 0x65, 0x70, 0x4d, 0x41, 0x43, 0x22, 0x3a, 0x22, 0x32, 0x36, 0x3a, 0x62, 0x64, 0x3a, 0x65, 0x65, 0x3a, 0x37, 0x37, 0x3a, 0x66, 0x39, 0x3a, 0x35, 0x66, 0x22, 0x7d}, BackendV6Data:json.RawMessage(nil)}, Expiration:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Asof:0}} }
I1017 19:46:43.844596    6040 main.go:460] Waiting for all goroutines to exit

Log of kube-proxy:

Copying SDN CNI binaries to host

    Directory: C:\opt\cni

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
d-----       10/17/2023  10:27 AM                bin                                                                   
copy flannel config

    Directory: C:\etc

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
d-----       10/17/2023  10:27 AM                kube-flannel                                                          

    Directory: C:\etc\kube-flannel

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
-a----       10/17/2023   7:45 PM            108 net-conf.json                                                         
ls : Cannot find path 'C:\hpc\mounts\kube-flannel\' because it does not exist.
At C:\hpc\flannel\start.ps1:11 char:1
+ ls $env:CONTAINER_SANDBOX_MOUNT_POINT/mounts/kube-flannel/
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (C:\hpc\mounts\kube-flannel\:String) [Get-ChildItem], ItemNotFoundExcept 
   ion
    + FullyQualifiedErrorId : PathNotFound,Microsoft.PowerShell.Commands.GetChildItemCommand
Mik4sa commented 10 months ago

I think your Kubernetes version is too old. HPC (HostProcess containers) was introduced or marked as stable in a later release, but I don't know which one. Can you update your cluster to a newer version? I personally use 1.27.x.

Check this: https://github.com/kubernetes-sigs/sig-windows-tools/blob/e35b2e1622f3039a70956d4173faaceb067bc613/hostprocess/README.md?plain=1#L4 Maybe try at least 1.26.x.

aquynh1682 commented 10 months ago

Hi @Mik4sa, I've updated Kubernetes to version 1.27.6, and I've rebuilt the proxy images to match the correct version for 1.27.6. However, when running kube-proxy, I encountered another error. I believe this error is occurring when trying to execute the kube-proxy.exe file, but I can't seem to locate it anywhere on the Windows node.

My updated cluster:

$kubectl get node -o wide
NAME               STATUS   ROLES           AGE   VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                         KERNEL-VERSION                     CONTAINER-RUNTIME
k8s-master         Ready    control-plane   48m   v1.27.6   172.16.68.174   <none>        Oracle Linux Server 8.8          5.15.0-101.103.2.1.el8uek.x86_64   containerd://1.7.4
k8s-windows-node   Ready    <none>          41m   v1.27.6   172.16.68.170   <none>        Windows Server 2019 Datacenter   10.0.17763.3650                    containerd://1.7.4

Pods and the kube-proxy error log in kube-system:

kubectl get pod -A
NAMESPACE      NAME                                  READY   STATUS             RESTARTS      AGE
kube-flannel   kube-flannel-ds-kwmrv                 1/1     Running            0             7m51s
kube-flannel   kube-flannel-ds-windows-amd64-bqhkj   1/1     Running            0             7m46s
kube-system    coredns-5d78c9869d-p6lcl              1/1     Running            0             49m
kube-system    coredns-5d78c9869d-xbsw2              1/1     Running            0             49m
kube-system    etcd-k8s-master                       1/1     Running            0             50m
kube-system    kube-apiserver-k8s-master             1/1     Running            0             50m
kube-system    kube-controller-manager-k8s-master    1/1     Running            0             50m
kube-system    kube-proxy-pft8w                      1/1     Running            0             49m
kube-system    kube-proxy-windows-bjfm7              0/1     CrashLoopBackOff   6 (83s ago)   7m34s
kube-system    kube-scheduler-k8s-master             1/1     Running            0             50m
kubectl logs -f -n kube-system kube-proxy-windows-bjfm7
Write files so the kubeconfig points to correct locations

    Directory: C:\var\lib

Mode                LastWriteTime         Length Name                                                                  
----                -------------         ------ ----                                                                  
d-----       10/18/2023   1:57 AM                kube-proxy                                                            
Finding sourcevip
sourceip: 10.244.1.2
Starting C:\hpc\/kube-proxy/kube-proxy.exe --v=6 --hostname-override=k8s-windows-node --feature-gates=WinOverlay=true --proxy-mode=kernelspace --source-vip=10.244.1.2 --kubeconfig=C:\hpc\/mounts/var/lib/kube-proxy/kubeconfig-win.conf
Invoke-Expression : Program 'kube-proxy.exe' failed to run: The file or directory is corrupted and unreadableAt line:1 
char:1
+ C:\hpc\/kube-proxy/kube-proxy.exe --v=6 --hostname-override=k8s-windo ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~.
At C:\hpc\kube-proxy\start.ps1:72 char:1
+ Invoke-Expression $exe
+ ~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : ResourceUnavailable: (:) [Invoke-Expression], ApplicationFailedException
    + FullyQualifiedErrorId : NativeCommandFailed,Microsoft.PowerShell.Commands.InvokeExpressionCommand
Mik4sa commented 10 months ago

Hmm strange. Never heard of that. Maybe you wanna try to rebuild your image?

aquynh1682 commented 10 months ago

I rebuilt the kube-proxy image for Kubernetes version 1.27.6, but it still throws the same error as before.

aquynh1682 commented 10 months ago

I have fixed kube-proxy and it's working now. I think the issue was with the path in the start.ps1 script of the kube-proxy image when calling the kube-proxy.exe file. So I added a '\' and it started working.

Mode                LastWriteTime         Length Name
----                -------------         ------ ----
d-----       10/18/2023   1:57 AM                kube-proxy
Finding sourcevip
sourceip: 10.244.1.2
Starting C:\hpc\/kube-proxy\/kube-proxy.exe --v=6 --hostname-override=k8s-windows-node --feature-gates=WinOverlay=true --proxy-mode=kernelspace --source-vip=10.244.1.2 --kubeconfig=C:\hpc\/mounts/var/lib/kube-proxy/kubeconfig-win.conf
I1018 20:45:03.445715 3580 flags.go:64] FLAG: --bind-address="0.0.0.0"
I1018 20:45:03.446432 3580 flags.go:64] FLAG: --bind-address-hard-fail="false"
I1018 20:45:03.446432 3580 flags.go:64] FLAG: --cleanup="false"
I1018 20:45:03.446432 3580 flags.go:64] FLAG: --cluster-cidr=""
I1018 20:45:03.446432 3580 flags.go:64] FLAG: --config=""
....
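
For reference, the two spellings of the executable path that appear in the logs can be compared with Python's `ntpath` module, which applies Windows path rules on any platform. This is only an illustration of how the separators are interpreted. The thread does not establish why the extra separator made the difference, and `ntpath` is not what start.ps1 or PowerShell uses internally.

```python
import ntpath

# Paths copied from the "Starting ..." log lines before and after the fix.
before = r"C:\hpc\/kube-proxy/kube-proxy.exe"
after = r"C:\hpc\/kube-proxy\/kube-proxy.exe"


def normalize(path: str) -> str:
    """Collapse duplicate and mixed separators using Windows path rules."""
    return ntpath.normpath(path)


print(normalize(before))  # C:\hpc\kube-proxy\kube-proxy.exe
print(normalize(after))   # C:\hpc\kube-proxy\kube-proxy.exe

# Joining components avoids hand-built separators altogether.
joined = ntpath.join("C:\\hpc\\", "kube-proxy", "kube-proxy.exe")
print(joined)             # C:\hpc\kube-proxy\kube-proxy.exe
```

Both spellings normalize to the same file, so the mixed separators alone may not have been the whole story; building the path from components is one way to rule them out.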

aquynh1682 commented 10 months ago

Thanks very much, @Mik4sa and @shivangbhar, for helping me set up a Windows node.

shivangbhar commented 10 months ago

@Mik4sa I read somewhere that Flannel is not supported for Kubernetes 1.28 or later, but I did not find it in any docs. If so, can you please point me to documentation?

Mik4sa commented 10 months ago

I haven't heard or read about that yet. But I'm not associated with Flannel, so I just don't know. I also haven't tried K8s 1.28 at all yet. Sadly, it will be some months before I can try it.

Mik4sa commented 8 months ago

@shivangbhar I just got my hands on Kubernetes 1.28.4 and this works so far with flannel. Note that I'm using the latest flannel version though. Haven't tried with older (like 0.21.5 as before). These are my latest changes which work with 1.28.4 so far: https://github.com/kubernetes-sigs/sig-windows-tools/compare/master...Mik4sa:sig-windows-tools:updated-all-versions

EDIT: Kubernetes 1.29.0 works too

k8s-ap commented 1 month ago

> @shivangbhar I just got my hands on Kubernetes 1.28.4 and this works so far with flannel. Note that I'm using the latest flannel version though. Haven't tried with older (like 0.21.5 as before). These are my latest changes which work with 1.28.4 so far: master...Mik4sa:sig-windows-tools:updated-all-versions
>
> EDIT: Kubernetes 1.29.0 works too

Hi Mik4sa,

Thank you very much for your valuable contribution. In my case, I am using Kubernetes v1.28.2 and the daemonset-flannel-windows with the customized Dockerfile with the Flannel version you recommended, but the same error persists:

Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
error creating inClusterConfig, falling back to default config: open /var/run/secrets/kubernetes.io/serviceaccount/token: The system cannot find the path specified.
Failed to create SubnetManager: fail to create kubernetes config: invalid configuration: no configuration has been provided, try setting KUBERNETES_MASTER environment variable

Any suggestions to try to solve it? Thanks in advance.

Mik4sa commented 1 month ago

Not really, actually. It's been a while since I did all this. But this error sounds like the service account wasn't mounted into your container, which would be really strange, so it's probably something different. Maybe you can find something related on the internet?
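
The error quoted above names the service account token path, so one way to narrow it down is to check the inputs that client-go's in-cluster configuration depends on: the `KUBERNETES_SERVICE_HOST` and `KUBERNETES_SERVICE_PORT` environment variables and the token file. Below is a minimal sketch of such a check. The function name `missing_incluster_inputs` and the `root` parameter are hypothetical, and looking under `CONTAINER_SANDBOX_MOUNT_POINT` is an assumption based on the start.ps1 path in the earlier logs, not something confirmed for this setup.

```python
import os
from pathlib import Path

# Relative form of the path from the error message.
SA_DIR = Path("var/run/secrets/kubernetes.io/serviceaccount")
REQUIRED_ENV = ("KUBERNETES_SERVICE_HOST", "KUBERNETES_SERVICE_PORT")


def missing_incluster_inputs(root, env):
    """Return labels for in-cluster config inputs that are absent.

    root: directory under which the service account volume is expected
          (hypothetical parameter; "/" in a regular container).
    env:  mapping of environment variables to inspect.
    """
    missing = [f"env:{name}" for name in REQUIRED_ENV if not env.get(name)]
    if not (Path(root) / SA_DIR / "token").is_file():
        missing.append("file:token")
    return missing


if __name__ == "__main__":
    # Assumption: HostProcess containers expose mounts under this variable.
    root = os.environ.get("CONTAINER_SANDBOX_MOUNT_POINT", "/")
    print(missing_incluster_inputs(root, os.environ))
```

An empty list would suggest the credentials are in place and the problem lies elsewhere; a missing token would point back at how the pod spec or the HostProcess mount path is set up.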