kubernetes-sigs / sig-windows-tools

Repository for tools and artifacts related to the sig-windows charter in Kubernetes. Scripts to assist kubeadm and wincat and flannel will be hosted here.
Apache License 2.0

Join error NodeIP not found #32

Closed HugoRh closed 4 years ago

HugoRh commented 4 years ago

Hi,

I'm struggling to add a Windows node to my cluster. I'm following the procedure at https://kubernetes.io/docs/setup/production-environment/windows/user-guide-windows-nodes/

I'm running Windows Server 2019 (version 1809, build 17763.805) on a virtual machine. I plan on using VXLAN and have already patched my cluster accordingly.

Here's my Kubeclustervxlan.json:

{
    "Cri" : {
        "Name" : "dockerd",
        "Images" : {
            "Pause" : "mcr.microsoft.com/k8s/core/pause:1.2.0",
            "Nanoserver" : "mcr.microsoft.com/windows/nanoserver:1809",
            "ServerCore" : "mcr.microsoft.com/windows/servercore:ltsc2019"
        }
    },
    "Cni" : {
        "Name" : "flannel",
        "Source" : [{ 
            "Name" : "flanneld",
            "Url" : "https://github.com/coreos/flannel/releases/download/v0.11.0/flanneld.exe"
            }
        ],
        "Plugin" : {
            "Name": "vxlan"
        },
        "InterfaceName" : "Ethernet"
    },
    "Kubernetes" : {
        "Source" : {
            "Release" : "1.16.3",
            "Url" : "https://dl.k8s.io/v1.16.3/kubernetes-node-windows-amd64.tar.gz"
        },
        "ControlPlane" : {
            "IpAddress" : "123.123.123.123",
            "Username" : "user",
            "KubeadmToken" : "wwwww.wwwwwwwwwww",
            "KubeadmCAHash" : "sha256:wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww"
        },
        "KubeProxy" : {
            "Gates" : "WinOverlay=true"
        },
        "Network" : {
            "ServiceCidr" : "10.96.0.0/12",
            "ClusterCidr" : "10.41.0.0/16"
        }
    },
    "Install" : {
        "Destination" : "D:\\Kubernetes"
    }
}

The install script does not seem to retrieve the needed information from the Ethernet interface, but that does not stop the install either:

PS D:\transit\sig-windows-tools-master\kubeadm> .\KubeCluster.ps1 -ConfigFile .\v1.16.0\Kubeclustervxlan.json -install
Downloaded [https://raw.githubusercontent.com/Microsoft/SDN/master/Kubernetes/windows/hns.psm1] => [D:\transit\sig-windows-tools-master\kubeadm\hns.psm1]
WARNING: The names of some imported commands from the module 'hns' include unapproved verbs that might make them less
discoverable. To find the commands with unapproved verbs, run the Import-Module command again with the Verbose
parameter. For a list of approved verbs, type Get-Verb.
Get-NetIPAddress : No matching MSFT_NetIPAddress objects found by CIM query for instances of the
ROOT/StandardCimv2/MSFT_NetIPAddress class on the  CIM server: SELECT * FROM MSFT_NetIPAddress  WHERE ((InterfaceAlias
LIKE 'vEthernet (Ethernet)')) AND ((AddressFamily = 2)). Verify query parameters and retry.
At D:\transit\sig-windows-tools-master\kubeadm\KubeClusterHelper.psm1:409 char:13
+     return (Get-NetIPAddress -InterfaceAlias "$InterfaceName" -Addres ...
+             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (MSFT_NetIPAddress:String) [Get-NetIPAddress], CimJobException
    + FullyQualifiedErrorId : CmdletizationQuery_NotFound,Get-NetIPAddress

Get-NetIPAddress : No matching MSFT_NetIPAddress objects found by CIM query for instances of the
ROOT/StandardCimv2/MSFT_NetIPAddress class on the  CIM server: SELECT * FROM MSFT_NetIPAddress  WHERE ((InterfaceAlias
LIKE 'vEthernet (Ethernet)')) AND ((AddressFamily = 2)). Verify query parameters and retry.
At D:\transit\sig-windows-tools-master\kubeadm\KubeClusterHelper.psm1:462 char:14
+ ...    $addr = (Get-NetIPAddress -InterfaceAlias "$InterfaceName" -Addres ...
+                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (MSFT_NetIPAddress:String) [Get-NetIPAddress], CimJobException
    + FullyQualifiedErrorId : CmdletizationQuery_NotFound,Get-NetIPAddress

ConvertTo-DecimalIP : Cannot bind argument to parameter 'IPAddress' because it is null.
At D:\transit\sig-windows-tools-master\kubeadm\KubeClusterHelper.psm1:465 char:40
+     $mgmtSubnet = (ConvertTo-DecimalIP $addr) -band (ConvertTo-Decima ...
+                                        ~~~~~
    + CategoryInfo          : InvalidData: (:) [ConvertTo-DecimalIP], ParameterBindingValidationException
    + FullyQualifiedErrorId : ParameterArgumentValidationErrorNullNotAllowed,ConvertTo-DecimalIP

ConvertTo-MaskLength : Cannot bind argument to parameter 'SubnetMask' because it is null.
At D:\transit\sig-windows-tools-master\kubeadm\KubeClusterHelper.psm1:467 char:48
+     return "$mgmtSubnet/$(ConvertTo-MaskLength $mask)"
+                                                ~~~~~
    + CategoryInfo          : InvalidData: (:) [ConvertTo-MaskLength], ParameterBindingValidationException
    + FullyQualifiedErrorId : ParameterArgumentValidationErrorNullNotAllowed,ConvertTo-MaskLength

    Directory: D:\Kubernetes

Mode                LastWriteTime         Length Name
----                -------------         ------ ----
d-----       12/12/2019  10:05 AM                logs
############################################
User Input
Destination       : D:\Kubernetes
Master            : 123.123.123.123
InterfaceName     : Ethernet
Cri               : dockerd
Cni               : flannel
NetworkPlugin     : vxlan
Release           : 1.16.3
MasterIp          : 123.123.123.123
ManagementIp      :
ManagementSubnet  : 0.0.0.0/
############################################
[DownloadFile] File D:\Kubernetes/kubernetes-node-windows-amd64.tar.gz already exists.
x kubernetes/
x kubernetes/node/
x kubernetes/node/bin/
x kubernetes/node/bin/kubelet.exe
x kubernetes/node/bin/kubectl.exe
x kubernetes/node/bin/kubeadm.exe
x kubernetes/node/bin/kube-proxy.exe
x kubernetes/LICENSES
x kubernetes/kubernetes-src.tar.gz
Downloading CNI binaries for overlay to D:\Kubernetes\cni
d-----       12/12/2019  10:05 AM                cni

    Directory: D:\Kubernetes\cni

Mode                LastWriteTime         Length Name
----                -------------         ------ ----
d-----       12/12/2019  10:05 AM                config
Downloading Flannel binaries
[DownloadFile] File D:\Kubernetes\flanneld.exe already exists.
[DownloadFile] File D:\Kubernetes/cni-plugins-windows-amd64-v0.8.2.tgz already exists.
x ./
x ./flannel.exe
x ./win-overlay.exe
x ./win-bridge.exe
x ./host-local.exe
C:\Users\svc_cftint/.ssh/id_rsa.pub
Execute the below commands on the Linux control-plane node (123.123.123.123) to add this Windows node's public key to its authorized keys
touch ~/.ssh/authorized_keys
echo  >> ~/.ssh/authorized_keys
Please close this shell and open a new one to join this node to the cluster

Then the join step fails:

PS D:\transit\sig-windows-tools-master\kubeadm> .\KubeCluster.ps1 -ConfigFile .\v1.16.0\Kubeclustervxlan.json -join
[DownloadFile] File D:\transit\sig-windows-tools-master\kubeadm\hns.psm1 already exists.
WARNING: The names of some imported commands from the module 'hns' include unapproved verbs that might make them less
discoverable. To find the commands with unapproved verbs, run the Import-Module command again with the Verbose
parameter. For a list of approved verbs, type Get-Verb.
Get-NetIPAddress : No matching MSFT_NetIPAddress objects found by CIM query for instances of the
ROOT/StandardCimv2/MSFT_NetIPAddress class on the  CIM server: SELECT * FROM MSFT_NetIPAddress  WHERE ((InterfaceAlias
LIKE 'vEthernet (Ethernet)')) AND ((AddressFamily = 2)). Verify query parameters and retry.
At D:\transit\sig-windows-tools-master\kubeadm\KubeClusterHelper.psm1:409 char:13
+     return (Get-NetIPAddress -InterfaceAlias "$InterfaceName" -Addres ...
+             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (MSFT_NetIPAddress:String) [Get-NetIPAddress], CimJobException
    + FullyQualifiedErrorId : CmdletizationQuery_NotFound,Get-NetIPAddress

Get-NetIPAddress : No matching MSFT_NetIPAddress objects found by CIM query for instances of the
ROOT/StandardCimv2/MSFT_NetIPAddress class on the  CIM server: SELECT * FROM MSFT_NetIPAddress  WHERE ((InterfaceAlias
LIKE 'vEthernet (Ethernet)')) AND ((AddressFamily = 2)). Verify query parameters and retry.
At D:\transit\sig-windows-tools-master\kubeadm\KubeClusterHelper.psm1:462 char:14
+ ...    $addr = (Get-NetIPAddress -InterfaceAlias "$InterfaceName" -Addres ...
+                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (MSFT_NetIPAddress:String) [Get-NetIPAddress], CimJobException
    + FullyQualifiedErrorId : CmdletizationQuery_NotFound,Get-NetIPAddress

ConvertTo-DecimalIP : Cannot bind argument to parameter 'IPAddress' because it is null.
At D:\transit\sig-windows-tools-master\kubeadm\KubeClusterHelper.psm1:465 char:40
+     $mgmtSubnet = (ConvertTo-DecimalIP $addr) -band (ConvertTo-Decima ...
+                                        ~~~~~
    + CategoryInfo          : InvalidData: (:) [ConvertTo-DecimalIP], ParameterBindingValidationException
    + FullyQualifiedErrorId : ParameterArgumentValidationErrorNullNotAllowed,ConvertTo-DecimalIP

ConvertTo-MaskLength : Cannot bind argument to parameter 'SubnetMask' because it is null.
At D:\transit\sig-windows-tools-master\kubeadm\KubeClusterHelper.psm1:467 char:48
+     return "$mgmtSubnet/$(ConvertTo-MaskLength $mask)"
+                                                ~~~~~
    + CategoryInfo          : InvalidData: (:) [ConvertTo-MaskLength], ParameterBindingValidationException
    + FullyQualifiedErrorId : ParameterArgumentValidationErrorNullNotAllowed,ConvertTo-MaskLength

############################################
User Input
Destination       : D:\Kubernetes
Master            : 123.123.123.123
InterfaceName     : Ethernet
Cri               : dockerd
Cni               : flannel
NetworkPlugin     : vxlan
Release           : 1.16.3
MasterIp          : 123.123.123.123
ManagementIp      :
ManagementSubnet  : 0.0.0.0/
############################################
Downloading Kubeconfig from 10.179.17.71:~/.kube/config to D:\Kubernetes\config
config                                                                                100% 5448     5.3KB/s   00:00
Trying to connect to the Kubernetes control-plane node
############################################
Able to connect to the control-plane node
Discovered the following
Cluster CIDR    : 10.41.0.0/16
Service CIDR    : 10.96.0.0/12
DNS ServiceIp   : 10.96.0.10
############################################
InstallKubelet : Cannot bind argument to parameter 'NodeIp' because it is an empty string.
At D:\transit\sig-windows-tools-master\kubeadm\KubeCluster.ps1:331 char:17
+         -NodeIp $Global:ManagementIp -KubeletFeatureGates $KubeletFea ...
+                 ~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidData: (:) [InstallKubelet], ParameterBindingValidationException
    + FullyQualifiedErrorId : ParameterArgumentValidationErrorEmptyStringNotAllowed,InstallKubelet

InstallFlannelD : Cannot bind argument to parameter 'InterfaceIpAddress' because it is an empty string.
At D:\transit\sig-windows-tools-master\kubeadm\KubeClusterHelper.psm1:1275 char:78
+ ... annelD -Destination $Global:BaseDir -InterfaceIpAddress $ManagementIp
+                                                             ~~~~~~~~~~~~~
    + CategoryInfo          : InvalidData: (:) [InstallFlannelD], ParameterBindingValidationException
    + FullyQualifiedErrorId : ParameterArgumentValidationErrorEmptyStringNotAllowed,InstallFlannelD

Generated CNI Config [{
    "cniVersion":  "0.2.0",
    "name":  "vxlan0",
    "type":  "flannel",
    "capabilities":  {
                         "dns":  true
                     },
    "delegate":  {
                     "type":  "win-overlay",
                     "Policies":  [
                                      {
                                          "Name":  "EndpointPolicy",
                                          "Value":  {
                                                        "Type":  "OutBoundNAT",
                                                        "ExceptionList":  [
                                                                              "10.41.0.0/16",
                                                                              "10.96.0.0/12"
                                                                          ]
                                                    }
                                      },
                                      {
                                          "Name":  "EndpointPolicy",
                                          "Value":  {
                                                        "Type":  "ROUTE",
                                                        "DestinationPrefix":  "10.96.0.0/12",
                                                        "NeedEncap":  true
                                                    }
                                      }
                                  ]
                 }
}]
Generated net-conf Config [{
    "Network":  "10.41.0.0/16",
    "Backend":  {
                    "name":  "vxlan0",
                    "type":  "vxlan"
                }
}]
FlannelD service not installed
At D:\transit\sig-windows-tools-master\kubeadm\KubeClusterHelper.psm1:363 char:9
+         throw "FlannelD service not installed"
+         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : OperationStopped: (FlannelD service not installed:String) [], RuntimeException
    + FullyQualifiedErrorId : FlannelD service not installed

The join fails to create the kubelet service because it cannot get the ManagementIp address.
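
The empty `ManagementSubnet : 0.0.0.0/` in the output above follows from the same missing address: the error traces show the helper combining the interface address with its mask using `-band`, so a null address leaves nothing to compute. As a rough illustration of that arithmetic only, here is a minimal sketch; `Get-SubnetCidr` is a hypothetical name, not a function from KubeClusterHelper.psm1, and it takes a prefix length instead of a dotted mask for brevity.

```powershell
# Hypothetical helper (not part of sig-windows-tools): derive the network
# address in CIDR form from an IPv4 address and a prefix length.
function Get-SubnetCidr {
    param(
        [Parameter(Mandatory = $true)] [string] $IpAddress,
        [Parameter(Mandatory = $true)] [ValidateRange(0, 32)] [int] $PrefixLength
    )
    $bytes = [System.Net.IPAddress]::Parse($IpAddress).GetAddressBytes()
    $network = for ($i = 0; $i -lt 4; $i++) {
        # Number of mask bits that fall inside this octet (0 to 8).
        $bits = [Math]::Min(8, [Math]::Max(0, $PrefixLength - 8 * $i))
        $mask = (0xFF -shl (8 - $bits)) -band 0xFF
        # Bitwise AND of the address octet with the mask octet.
        $bytes[$i] -band $mask
    }
    return "$($network -join '.')/$PrefixLength"
}

Get-SubnetCidr -IpAddress '192.168.10.37' -PrefixLength 24   # 192.168.10.0/24
```

With a valid management IP the script would print a subnet of this shape instead of `0.0.0.0/`.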

I'm a bit confused by that part: which IP is it supposed to be? I do not see it in the Get-NetIPAddress output for the Ethernet interface. I tried hard-coding the IP address of that interface but still got an error (I'm really not good at PowerShell and Windows :( ).

Any ideas would be very welcome! Thanks!
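
One way to narrow down the "which IP" question is to list the IPv4 addresses whose interface alias resembles the name in the error (`vEthernet (Ethernet)`) and compare them with what the script queried. The following is a minimal sketch, not part of KubeCluster.ps1: `Find-InterfaceAddress` and its `-Addresses` parameter are hypothetical names, the latter added only so the filter can be tried against sample objects. By default it reads from the standard `Get-NetIPAddress` cmdlet on the Windows node.

```powershell
# Hypothetical helper: show IPv4 addresses whose interface alias matches a
# wildcard pattern, so the exact alias can be copied into the config.
function Find-InterfaceAddress {
    param(
        [Parameter(Mandatory = $true)] [string] $Pattern,
        # Defaults to the live adapter list; only evaluated when not supplied.
        [object[]] $Addresses = (Get-NetIPAddress -AddressFamily IPv4)
    )
    $Addresses |
        Where-Object { $_.InterfaceAlias -like $Pattern } |
        Select-Object InterfaceAlias, IPAddress, PrefixLength
}

# On the Windows node, for example:
#   Find-InterfaceAddress -Pattern '*Ethernet*'
```

If the alias printed here differs from the one in the error message, even by a suffix, the exact-name lookup in the script will come back empty.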

HugoRh commented 4 years ago

Hi,

I decided to try the Microsoft way: https://docs.microsoft.com/en-us/virtualization/windowscontainers/kubernetes/joining-windows-workers?tabs=ManagementIP

It's also interesting, as it has less automation and more explanation.

As I said earlier, I'm far from being good with Windows or PowerShell :)

So, my issue happens because Get-NetIPAddress does not return all the expected information, which affects the creation of the HNS network later (and that in turn prevents flanneld from being installed). All of this is because my VM has multiple network interfaces (called Ethernet, "vEthernet (nat)", etc.).

Yet KubeClusterHelper.psm1 assumes $InterfaceName = "Ethernet", so, for example, the code below fails:

function Get-InterfaceIpAddress()
{
    Param (
        [Parameter(Mandatory=$false)] [String] $InterfaceName = "Ethernet"
    )
    return (Get-NetIPAddress -InterfaceAlias "$InterfaceName" -AddressFamily IPv4).IPAddress
}

I propose adding a script parameter, or prompting the user, to choose which interface to use:

(Get-NetIPAddress).InterfaceAlias
Loopback Pseudo-Interface 1
vEthernet (Ethernet) 22
vEthernet (nat)
Loopback Pseudo-Interface 1

With the correct name, the correct IP is picked up:

PS C:\Users\user> function Get-InterfaceIpAddress()
>> {
>>     Param (
>>         [Parameter(Mandatory=$false)] [String] $InterfaceName = "vEthernet (Ethernet) 22"
>>     )
>>     return (Get-NetIPAddress -InterfaceAlias "$InterfaceName" -AddressFamily IPv4).IPAddress
>> }
PS C:\Users\user> Get-InterfaceIpAddress
123.123.123.123
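
The proposal above could be sketched roughly as follows: take the interface name as an explicit parameter and, when no IPv4 address is found for it, stop with a message that lists the aliases that do exist, instead of continuing with an empty ManagementIp. This is only an illustration under stated assumptions; `Resolve-ManagementIp` and its `-Addresses` parameter are hypothetical names and are not part of KubeClusterHelper.psm1.

```powershell
# Hypothetical helper: resolve the node IP for a named interface and fail
# early with the list of available aliases if the name does not match.
function Resolve-ManagementIp {
    param(
        [Parameter(Mandatory = $true)] [string] $InterfaceName,
        # Defaults to the live adapter list; only evaluated when not supplied.
        [object[]] $Addresses = (Get-NetIPAddress -AddressFamily IPv4)
    )
    $match = @($Addresses | Where-Object { $_.InterfaceAlias -eq $InterfaceName })
    if ($match.Count -eq 0) {
        $known = ($Addresses | Select-Object -ExpandProperty InterfaceAlias -Unique) -join ', '
        throw "No IPv4 address found on interface '$InterfaceName'. Available interfaces: $known"
    }
    # If the interface carries several IPv4 addresses, the first one is used.
    return $match[0].IPAddress
}

# On the Windows node, for example:
#   Resolve-ManagementIp -InterfaceName 'vEthernet (Ethernet) 22'
```

Failing at this point would surface the naming mismatch during install, rather than later as the `NodeIp` and `InterfaceIpAddress` binding errors seen in the join output.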

fejta-bot commented 4 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale

benmoss commented 4 years ago

As part of the move to the new kubeadm on Windows approach with 1.18 we are closing issues related to the previous alpha. /close

k8s-ci-robot commented 4 years ago

@benmoss: Closing this issue.

In response to [this](https://github.com/kubernetes-sigs/sig-windows-tools/issues/32#issuecomment-599602489):

>As part of the move to the new kubeadm on Windows approach with 1.18 we are closing issues related to the previous alpha.
>/close

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.