Azure / acs-engine

WE HAVE MOVED: Please join us at Azure/aks-engine!
https://github.com/Azure/aks-engine
MIT License

Node labels don't get applied to Windows nodes #3266

Closed bnookala closed 6 years ago

bnookala commented 6 years ago

Is this a request for help?:

No


Is this an ISSUE or FEATURE REQUEST? (choose one):

Issue


What version of acs-engine?:

acs-engine-v0.18.1-darwin-amd64, but I suspect all.


Orchestrator and version (e.g. Kubernetes, DC/OS, Swarm)

Kubernetes

What happened:

Deployed a hybrid cluster with custom node labels on both the Linux and Windows pools, and noticed that the custom node labels are not applied to the Windows nodes. In fact, the Windows nodes are missing many node labels altogether.

Linux pool:

labels:
      agentpool: linuxpool
      beta.kubernetes.io/arch: amd64
      beta.kubernetes.io/instance-type: Standard_DS2_v2
      beta.kubernetes.io/os: linux
      failure-domain.beta.kubernetes.io/region: australiaeast
      failure-domain.beta.kubernetes.io/zone: "0"
      kubernetes.azure.com/cluster: wow-dev-k8sHack-aae2
      kubernetes.io/hostname: k8s-linuxpool-20427215-0
      kubernetes.io/role: agent
      storageprofile: managed
      storagetier: Premium_LRS

Windows pool:

labels:
      beta.kubernetes.io/arch: amd64
      beta.kubernetes.io/instance-type: Standard_DS2_v2
      beta.kubernetes.io/os: windows
      failure-domain.beta.kubernetes.io/region: australiaeast
      failure-domain.beta.kubernetes.io/zone: "0"
      kubernetes.io/hostname: 20427k8s9010

What you expected to happen:

The custom node labels should be applied to the Windows nodes.

How to reproduce it (as minimally and precisely as possible):

Add custom labels to a Windows agent pool (or don't - it doesn't appear to add them regardless):

    "agentPoolProfiles": [
      {
        "name": "windowsPool",
        "count": 3,
        "vmSize": "Standard_DS2_v2",
        "availabilityProfile": "AvailabilitySet",
        "osType": "Windows",
        "customNodeLabels": {
          "bar": "lulz",
          "baz": "foo"
        }
      }
    ],

Anything else we need to know:

Seems like this is the problematic line: https://github.com/Azure/acs-engine/blob/d9bd63bf0592fbed9dd9b2cf40b9725bc8b9e0c0/parts/k8s/kuberneteswindowssetup.ps1#L47

CecileRobertMichon commented 6 years ago

This looks like a regression, https://github.com/Azure/acs-engine/pull/2770 should have fixed this a while ago. @JamesEarle if you have time could you take a look?

bnookala commented 6 years ago

@CecileRobertMichon @JamesEarle I picked through the setup script and found that if the network plugin is just "azure" (which is the default), it won't add the node labels to the kubelet's arguments, and thus the kubelet is invoked without them. There may also be an issue here regarding other mismatched kubelet arguments between KubeletCommandLine and KubeletArgList, but I haven't confirmed that yet.

The resolution to this issue would be to add the node labels to KubeletCommandLine:

https://github.com/Azure/acs-engine/blob/d9bd63bf0592fbed9dd9b2cf40b9725bc8b9e0c0/parts/k8s/kuberneteswindowssetup.ps1#L312-L314
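For context, the missing piece boils down to joining the `customNodeLabels` map into a single comma-separated `--node-labels` kubelet argument. A minimal shell sketch of that joining step (illustrative only; the real fix belongs in `kuberneteswindowssetup.ps1`, and the label values are the example ones from the API model above):

```shell
# Sketch: turn space-separated key=value pairs (example values) into the
# comma-separated --node-labels argument the kubelet expects.
custom_labels="bar=lulz baz=foo"
node_labels_arg="--node-labels=$(printf '%s' "$custom_labels" | tr ' ' ',')"
echo "$node_labels_arg"
# prints: --node-labels=bar=lulz,baz=foo
```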

I can take this on and open a PR.

PatrickLang commented 6 years ago

From my testing of #3753

$ kubectl describe node 30823k8s9000
Name:               30823k8s9000
Roles:              agent
Labels:             agentpool=windowspool
                    beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=Standard_D2_v3
                    beta.kubernetes.io/os=windows
                    failure-domain.beta.kubernetes.io/region=westus2
                    failure-domain.beta.kubernetes.io/zone=0
                    kubernetes.azure.com/cluster=plang0830rg
                    kubernetes.io/hostname=30823k8s9000
                    kubernetes.io/role=agent
                    storageprofile=managed
                    storagetier=Standard_LRS
Annotations:        node.alpha.kubernetes.io/ttl=0
                    volumes.kubernetes.io/controller-managed-attach-detach=true

PatrickLang commented 6 years ago

Fixed, will be in next release :)