Open jodafm opened 4 years ago
It seems that there's nothing specifically preventing the Cluster type from having Windows nodes. It's just that the NodeGroup type has Linux-specific details hard-coded, most notably the user data for the launch config that is used to join the cluster (it's a hard-coded bash script). It would be really useful if it were possible to override the launch config's user data; I think it would then be possible (though not necessarily trivial) to add a Windows node group.
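To make the request concrete, here is a minimal sketch of what a user data override could look like. It is not the library's API: `NodeGroupUserDataArgs`, `userDataOverride`, `defaultLinuxUserData` and `resolveUserData` are hypothetical names, and the bash script is only a stand-in for the hard-coded one.

```typescript
// Hypothetical sketch: none of these names exist in @pulumi/eks.
type NodeOs = "linux" | "windows";

interface NodeGroupUserDataArgs {
    clusterName: string;
    os: NodeOs;
    // If set, this string is used verbatim instead of the built-in script.
    userDataOverride?: string;
}

// Stand-in for the hard-coded bash script; the real one does more work.
function defaultLinuxUserData(clusterName: string): string {
    return ["#!/bin/bash", `/etc/eks/bootstrap.sh "${clusterName}"`].join("\n");
}

// Prefer the caller's override; fall back to the Linux default otherwise.
function resolveUserData(args: NodeGroupUserDataArgs): string {
    if (args.userDataOverride !== undefined) {
        return args.userDataOverride;
    }
    if (args.os === "windows") {
        // This sketch has no built-in Windows script, so an override is required.
        throw new Error("Windows node groups must supply userDataOverride");
    }
    return defaultLinuxUserData(args.clusterName);
}

// Usage: a Windows node group passes its own PowerShell user data.
const windowsUserData = resolveUserData({
    clusterName: "demo",
    os: "windows",
    userDataOverride: "<powershell>\n# bootstrap commands go here\n</powershell>",
});
```

With a hook like this, the Linux path keeps its current behavior and a Windows node group only has to provide its own script.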
To support Windows node groups we'll have to refactor how the cluster and nodegroup classes function.
Specifically, we'll have to key off a cluster toggle indicating that Windows is being used, so the cluster and nodegroups can be prepped accordingly, as an EKS cluster with Windows support must have both Linux and Windows node groups to function.
Per AWS:
- [...] openssl and jq available on the client machine to run the scripts:
- [...] kubectl, and create a Secret from its contents
- [...] kube-proxy, coredns and the VPC resource controller.
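The constraint mentioned earlier, that a cluster with Windows support needs both Linux and Windows node groups, could be checked up front. The following is only a sketch under that assumption; `ClusterSpec`, `windowsSupport` and `validateClusterSpec` are hypothetical names, not part of the eks package.

```typescript
// Hypothetical types for illustration only.
type OsFamily = "linux" | "windows";

interface NodeGroupSpec {
    name: string;
    os: OsFamily;
}

interface ClusterSpec {
    // The "cluster toggle" indicating that Windows nodes are wanted.
    windowsSupport: boolean;
    nodeGroups: NodeGroupSpec[];
}

// Returns a list of problems; an empty list means the spec is consistent.
function validateClusterSpec(spec: ClusterSpec): string[] {
    const errors: string[] = [];
    const hasLinux = spec.nodeGroups.some((g) => g.os === "linux");
    const hasWindows = spec.nodeGroups.some((g) => g.os === "windows");

    if (hasWindows && !spec.windowsSupport) {
        errors.push("Windows node groups require windowsSupport to be enabled");
    }
    if (spec.windowsSupport && !hasLinux) {
        errors.push("A Windows-enabled cluster also needs a Linux node group");
    }
    if (spec.windowsSupport && !hasWindows) {
        errors.push("windowsSupport is enabled but no Windows node group is defined");
    }
    return errors;
}

// Usage: a mixed cluster passes, a Windows-only cluster does not.
const mixedErrors = validateClusterSpec({
    windowsSupport: true,
    nodeGroups: [
        { name: "linux-ng", os: "linux" },
        { name: "windows-ng", os: "windows" },
    ],
});
const windowsOnlyErrors = validateClusterSpec({
    windowsSupport: true,
    nodeGroups: [{ name: "windows-ng", os: "windows" }],
});
```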
Thanks for the validation and detailed walkthrough @gunniwho, this insight is very helpful!
We've opened up #428 to track overriding the nodegroup userdata script as a start to supporting Windows nodegroups.
This looks accurate. I have been able to do all this using Pulumi. The only thing that I had to do manually was to create the CSR and approve it. I then supplied the resulting key pair as secret config to my Pulumi program. The following is what I did using Pulumi:
- created the required secret for the VPC admission webhook using the key pair config
- created the VPC resource controller and admission webhook by downloading the correct YAML files for my region from AWS, patching them and provisioning them using ConfigFile instances
- created the required cluster role binding
- created roles and instance profiles for both my Linux and Windows node groups
- created the Cluster instance with
  - skipDefaultNodeGroup: true
  - roleMappings: [ { roleArn: linuxInstanceRole.apply((r) => r.arn), username: "system:node:{{EC2PrivateDNSName}}", groups: ["system:bootstrappers", "system:nodes"], }, { roleArn: windowsInstanceRole.apply((r) => r.arn), username: "system:node:{{EC2PrivateDNSName}}", groups: ["system:bootstrappers", "system:nodes", "eks:kube-proxy-windows"], }, ]
- created a node security group using createNodeGroupSecurityGroup from the eks package
- created a new WindowsNodeGroup class that's a copy of the Pulumi-supplied NodeGroup class except
  - it has the required PowerShell user data
  - it requires the AMI to be supplied via args (no lookup)
- created one NodeGroup instance for the Linux nodes and one WindowsNodeGroup instance for the Windows nodes. They are injected each with their own instance profile but the same security group

This is definitely not ideal, but it was required because there was no way for me to override the hard-coded bash script for user data. Had I been able to inject the user data, I would have been able to make this work with Pulumi out of the box. It is not a particularly nice experience, but I think it would be relatively easy to supply a nice API on top of the existing APIs to compose the required things for a mixed OS cluster (e.g. a MixedOsCluster class). The only "not-so-nice" requirement left would be having to create the CSR out of band and supply the key pair as config, since I don't think that Pulumi can approve CSRs (correct me if I'm wrong please). On the other hand, it also removes the requirement for the client machine to have openssl.
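As a rough illustration of the kind of helper a MixedOsCluster-style API might include, the role mappings above could be produced by one small function. This is a sketch, not the eks package's API: `RoleMapping` and `nodeRoleMappings` are hypothetical names, the ARNs are placeholders, and plain strings are used where a real Pulumi program would deal with outputs.

```typescript
// Hypothetical shape mirroring the fields used in the role mappings above.
interface RoleMapping {
    roleArn: string;
    username: string;
    groups: string[];
}

const NODE_USERNAME = "system:node:{{EC2PrivateDNSName}}";
const BASE_NODE_GROUPS = ["system:bootstrappers", "system:nodes"];

// Builds one mapping per OS; only the Windows role gets the extra group.
function nodeRoleMappings(linuxRoleArn: string, windowsRoleArn: string): RoleMapping[] {
    return [
        {
            roleArn: linuxRoleArn,
            username: NODE_USERNAME,
            groups: [...BASE_NODE_GROUPS],
        },
        {
            roleArn: windowsRoleArn,
            username: NODE_USERNAME,
            groups: [...BASE_NODE_GROUPS, "eks:kube-proxy-windows"],
        },
    ];
}

// Usage with placeholder ARNs (not real roles).
const mappings = nodeRoleMappings(
    "arn:aws:iam::123456789012:role/linux-nodes",
    "arn:aws:iam::123456789012:role/windows-nodes",
);
```

Keeping the shared username and base groups in one place makes the difference between the two mappings, the extra eks:kube-proxy-windows group, easy to see.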
@gunniwho Thanks for detailing these steps. I did much the same, but a little differently. I created a Linux cluster via Pulumi and installed the prerequisites as per the AWS documentation (certificates, secrets, etc.). In the next Pulumi apply, I added a new Windows nodegroup, but with different userdata containing a PowerShell script.
But my Windows nodes are not registering with the control plane. My code can be found here: https://github.com/bit-cloner/poke
Would it be possible to share your Pulumi code?
Thank you
Hi, do you plan to support node groups using a "Windows Server" AMI family?