Closed obitech closed 2 years ago
Hi! Sorry for the delay in getting back to you. I recommend making the following changes:
nodeGroups:
  - name: name
    minSize: 0
    maxSize: 1
    instanceType: r6g.large
    iam:
      withAddonPolicies:
        autoScaler: true # adds the tags required for the Cluster Autoscaler to scale the nodegroup(s)
    taints:
      - key: "node.cilium.io/agent-not-ready"
        value: "true"
        effect: "NoSchedule"
      - key: node-role.exaring.net/workload-type # removed the label that was repeating this
        value: mem-intensive
        effect: NoSchedule
      - key: arch
        value: arm64
        effect: NoSchedule
    propagateASGTags: true # propagates taints into ASG tags
    availabilityZones:
      - eu-central-1a
Please let me know if this solved the issue for you :)
I have a PR open to improve the docs around this that should be out soon.
Unfortunately the issue persists 😞 I forgot to mention it's a managed nodegroup. Might that be the issue?
Yes, propagateASGTags has a slightly different behaviour for managed nodegroups (we have an open ticket to unify the behaviour for managed and unmanaged nodegroups). Currently, with propagateASGTags set to true, the labels and taints of a managed nodegroup are not converted to nodegroup tags, so they have to be added manually, like what you were doing before:
managedNodeGroups:
  - name: name
    minSize: 0
    maxSize: 1
    instanceType: r6g.large
    iam:
      withAddonPolicies:
        autoScaler: true # adds the tags required for the Cluster Autoscaler to scale the nodegroup(s)
    taints:
      - key: "node.cilium.io/agent-not-ready"
        value: "true"
        effect: "NoSchedule"
      - key: node-role.exaring.net/workload-type # removed the label that was repeating this
        value: mem-intensive
        effect: NoSchedule
      - key: arch
        value: arm64
        effect: NoSchedule
    tags:
      k8s.io/cluster-autoscaler/node-template/label/role: worker
      k8s.io/cluster-autoscaler/node-template/label/node-role.exaring.net/workload-type: mem-intensive
      k8s.io/cluster-autoscaler/node-template/taint/node.cilium.io/agent-not-ready: true:NoSchedule
      k8s.io/cluster-autoscaler/node-template/taint/node-role.exaring.net/workload-type: mem-intensive:NoSchedule
      k8s.io/cluster-autoscaler/node-template/taint/arch: arm64:NoSchedule
    propagateASGTags: true # propagates the nodegroup tags above into ASG tags
    availabilityZones:
      - eu-central-1a
The important part here is to add propagateASGTags, which propagates the nodegroup tags to the underlying Auto Scaling Group so that the Cluster Autoscaler can pick them up. :)
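Since the node-template tags follow a regular pattern, they can be generated from the labels and taints rather than typed by hand. Here is a minimal Python sketch of that idea. It is not part of eksctl: the helper names node_template_tags and missing_tags are hypothetical, and the key format is assumed from the tags in the config above.

```python
# Prefix the Cluster Autoscaler looks for on ASG tags (taken from the config above).
PREFIX = "k8s.io/cluster-autoscaler/node-template"


def node_template_tags(labels, taints):
    """Build node-template tags from labels and taints (hypothetical helper).

    labels: dict mapping label key -> value
    taints: list of dicts with "key", "value" and "effect" entries
    """
    tags = {}
    for key, value in labels.items():
        tags[f"{PREFIX}/label/{key}"] = str(value)
    for taint in taints:
        # Taint tags encode value and effect as "<value>:<effect>".
        tags[f"{PREFIX}/taint/{taint['key']}"] = f"{taint['value']}:{taint['effect']}"
    return tags


def missing_tags(expected, actual):
    """Return the expected tags that are absent or different in `actual`."""
    return {k: v for k, v in expected.items() if actual.get(k) != v}


if __name__ == "__main__":
    labels = {"role": "worker"}
    taints = [
        {"key": "node.cilium.io/agent-not-ready", "value": "true", "effect": "NoSchedule"},
        {"key": "node-role.exaring.net/workload-type", "value": "mem-intensive", "effect": "NoSchedule"},
        {"key": "arch", "value": "arm64", "effect": "NoSchedule"},
    ]
    for key, value in node_template_tags(labels, taints).items():
        print(f"{key}: {value}")
```

The printed lines can be pasted under tags in the nodegroup definition. missing_tags can then compare them against whatever tags are actually present on the Auto Scaling Group, which may help narrow down why scaling from zero is not working.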
We recently updated the docs to reflect all of this in a better way: https://eksctl.io/usage/autoscaling/
This should hopefully solve the issue, please let us know if it did or not!
That did in fact work! Thank you for your help @nikimanoledaki 😌
I'm having trouble getting the cluster-autoscaler to work with a node group of minSize: 0. I’ve followed the eksctl docs on the topic and set my labels and taints as tags on the nodeGroup definition:
Checking the node group in the AWS console, the tags and taints seem to be correctly set on the node group itself. However, the autoscaler logs state:
Has anyone experienced this before? How can I debug this further?