mikestef9 closed this issue 3 years ago.
When this was raised in #585, #507 was tagged as an existing request for this feature, but I think that was confusion... #507 seems to be about Container Insights correctly monitoring tainted nodes, while what we want here (and in #585) is to support setting the taints on Managed Nodegroups as part of a rollout, e.g. with eksctl.
The comment in #585 had nine thumbs-up, on top of the three currently here.
@TBBle correct, I wanted to open a separate issue to explicitly track tainting node groups through the EKS API
@mikestef9 we would like to see "tainting node groups through the EKS API" progressing; it has gone from 12 👍 to 37 as of now.
It looks like the bootstrap script used by EKS nodes already supports taints. My understanding is that it would be a small feature to implement, because it would only require modifying the user data in the launch template to add extra args, just like it's done for labels currently.
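For reference, a minimal sketch of what that looks like at the bootstrap level (cluster name, label and taint values are placeholders):

#!/bin/bash
# Sketch: the EKS-optimized AMI's bootstrap script forwards extra kubelet flags,
# so a taint can ride along the same way node labels do today.
/etc/eks/bootstrap.sh my-cluster \
  --kubelet-extra-args '--node-labels=role=example --register-with-taints=dedicated=example:NoSchedule'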
We would love to have this!
"When nodes are created dynamically by the Kubernetes autoscaler, they need to be created with the proper taint and label. With EKS, the taint and label can be specified in the Kubernetes kubelet service defined in the UserData section of the AWS autoscaling group LaunchConfiguration."
https://docs.cloudbees.com/docs/cloudbees-ci/latest/cloud-admin-guide/eks-auto-scaling-nodes
@jhcook-ag You can't specify the UserData for Managed Node Groups when you create them.
You can modify the UserData in the Launch Configuration in the AWS console after creation, but then the Managed Node Groups feature will refuse to touch your Launch Configuration again, and you're effectively now using unmanaged Node Groups, although eksctl will still try to use the Managed Node Groups API and fail.
@mhausenblas we really need this 👍
Absolutely would love the idea.
It is a must-have feature!
👍
This is a must-have feature for us as well. We can't use managed node groups because of this. When would you expect this to be released? (just roughly) 👍
Hi @martinoravsky, I believe this feature is available now.
We did it by customizing the userdata on the custom launch template and specifying the taints for the kubelet (using the register-with-taints argument).
Hi @Dudssource ,
are you using custom AMIs? I'm using launch templates with EKS-optimized AMIs, which include UserData that bootstraps the node to the cluster automatically (with --kubelet-extra-args empty). This userdata is not editable for us; we can only add our own UserData as a MIME multipart file, which has no effect on bootstrapping the cluster. I'm curious if you were able to get this to work without custom AMIs.
@martinoravsky, yes, unfortunately we had to use a custom AMI for this to work. But we used the same optimized AMI that EKS uses; we use Terraform, so we used a data source to get the latest AMI for our cluster version. I know that this is possible with CloudFormation and Parameter Store too.
The approach that @Dudssource used here is certainly an option, but we do plan to add taints directly to the EKS API (similar to labels), so that a custom AMI is not required.
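As a rough illustration only (not a committed API shape; names and values are placeholders), a native option could look something like this from the CLI:

# Hypothetical sketch of a first-class taints parameter on the node group API.
# The final parameter names and effect spellings may differ.
aws eks create-nodegroup \
  --cluster-name my-cluster \
  --nodegroup-name tainted-ng \
  --node-role arn:aws:iam::111122223333:role/eksNodeRole \
  --subnets subnet-aaaa subnet-bbbb \
  --labels role=private \
  --taints key=dedicated,value=private,effect=NO_SCHEDULE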
I've found a solution (admittedly quite hackish) to allow setting taints with the official AMIs:
Set the userdata for the Launch Template similar to this:
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="==7561478f-5b81-4e9d-9db6-aec8f463d2ab=="

--==7561478f-5b81-4e9d-9db6-aec8f463d2ab==
Content-Type: text/x-shellscript; charset="us-ascii"

#!/bin/bash
sed -i '/^KUBELET_EXTRA_ARGS=/a KUBELET_EXTRA_ARGS+=" --register-with-taints=foo=bar:NoSchedule"' /etc/eks/bootstrap.sh

--==7561478f-5b81-4e9d-9db6-aec8f463d2ab==--
This script is run before the bootstrap script (which is managed by EKS), patching /etc/eks/bootstrap.sh to inject the necessary --register-with-taints into the KUBELET_EXTRA_ARGS variable.
This solution is not perfect and might break if AWS changes the bootstrap script, but it works for now and can be used until there is proper support for taints.
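If you try this, a quick way to check whether the taint actually registered (node name is a placeholder):

# Sketch: list nodes with their registered taints.
kubectl get nodes -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints'
# or inspect a single node:
kubectl describe node <node-name> | grep -A3 Taints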
@lwimmer That is superbly hacky! Good work.
I'm really surprised this feature is missing, and overall I'm shocked how feature incomplete node groups are.
+1
+1
Thanks @lwimmer , I tried to implement your logic here in the official terraform module: https://github.com/terraform-aws-modules/terraform-aws-eks/pull/1138
Thanks @lwimmer. I implemented it in Terraform based on your solution. This is my full node group with custom taints:
resource "aws_eks_node_group" "some_nodegroup" {
node_group_name = "some_nodegroup"
cluster_name = aws_eks_cluster.eks_cluster.name
node_role_arn = aws_iam_role.eks_nodegroup_role.arn
subnet_ids = aws_subnet.public_subnet.*.id
instance_types = [
...,
]
scaling_config {
desired_size = ...
max_size = ...
min_size = ...
}
launch_template {
id = aws_launch_template.some_launch_template.id
version = aws_launch_template.some_launch_template.latest_version
}
labels = {
type = "some-label"
}
depends_on = [
aws_iam_role_policy_attachment.iam-eks-nodegroup-AmazonEKSWorkerNodePolicy,
aws_iam_role_policy_attachment.iam-eks-nodegroup-AmazonEKS_CNI_Policy,
aws_iam_role_policy_attachment.iam-eks-nodegroup-AmazonEC2ContainerRegistryReadOnly,
]
lifecycle {
ignore_changes = [scaling_config[0].desired_size]
}
}
resource "aws_launch_template" "some_launch_template" {
name = "some_lauch_template"
block_device_mappings {
device_name = "/dev/xvda"
ebs {
volume_size = 20
volume_type = "gp2"
}
}
user_data = base64encode(<<-EOF
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="==7561478f-5b81-4e9d-9db6-aec8f463d2ab=="
--==7561478f-5b81-4e9d-9db6-aec8f463d2ab==
Content-Type: text/x-shellscript; charset="us-ascii"
#!/bin/bash
sed -i '/^KUBELET_EXTRA_ARGS=/a KUBELET_EXTRA_ARGS+=" --register-with-taints=some_taint1=true:NoSchedule,some_taint2=true:NoSchedule"' /etc/eks/bootstrap.sh
--==7561478f-5b81-4e9d-9db6-aec8f463d2ab==--\
EOF
)
tag_specifications {
resource_type = "instance"
tags = {
Name = "..."
}
}
}
For multiple taints:
<key1>=<value1>:<effect1>,<key2>=<value2>:<effect2>
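For example (placeholder keys and values): dedicated=gpu:NoSchedule,spot=true:PreferNoSchedule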
I have found that the preBootstrapCommands parameter for managed node groups is a much easier way to add a taint than using user data.
preBootstrapCommands:
- sed -i '/^KUBELET_EXTRA_ARGS=/a KUBELET_EXTRA_ARGS+=" --register-with-taints=<key1>=<value1>:<effect1>"' /etc/eks/bootstrap.sh
I have found that the preBootstrapCommands parameter for managed node groups is a much easier way to add a taint than using user data
preBootstrapCommands is specific to eksctl; not everyone is using eksctl (see the Terraform solutions here).
eksctl implements preBootstrapCommands by populating user data in the launch template, exactly as is being done in the examples here.
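Roughly speaking (a simplified sketch, not eksctl's exact output), the generated launch template user data looks like the MIME multipart examples above:

MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="BOUNDARY"

--BOUNDARY
Content-Type: text/x-shellscript; charset="us-ascii"

#!/bin/bash
# preBootstrapCommands end up inlined here, ahead of the EKS-managed bootstrap step
sed -i '/^KUBELET_EXTRA_ARGS=/a KUBELET_EXTRA_ARGS+=" --register-with-taints=<key1>=<value1>:<effect1>"' /etc/eks/bootstrap.sh
--BOUNDARY--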
With user-data I ended up doing this. In my case I wanted to switch to systemd:
# Systemd does not support appending environment variables. Add a new variable
sed -i 's/KUBELET_EXTRA_ARGS/KUBELET_EXTRA_ARGS $EXTENDED_KUBELET_ARGS/' /etc/systemd/system/kubelet.service
cat << EOF > /etc/systemd/system/kubelet.service.d/9999-extended-kubelet-args.conf
[Service]
Environment='EXTENDED_KUBELET_ARGS=--cgroup-driver=systemd'
EOF
systemctl daemon-reload
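A quick way to confirm the drop-in took effect (unit and file names as above):

# Sketch: show the unit with its drop-ins, then the kubelet's live command line.
systemctl cat kubelet
pgrep -af kubelet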
Hi @andre-lx,
I have almost the same code as you, but I cannot apply my launch config:
error creating EKS Node Group: InvalidParameterException: Remote access configuration cannot be specified with a launch template.
I tried with and without key_name in my launch_config, but it's still not working. Any idea?
@josephprem
With user-data I ended up doing this . In my case I wanted to switch to systemd
If you edit your comment to put a line with ``` (that's three back-ticks) before and after your user-data, it won't be parsed as markdown. That will also fix GitHub turning the comment line into a top-level heading, i.e. it will look like this, which I imagine is what you intended?
# Systemd does not support appending environment variables. Add a new variable
sed -i 's/KUBELET_EXTRA_ARGS/KUBELET_EXTRA_ARGS $EXTENDED_KUBELET_ARGS/' /etc/systemd/system/kubelet.service
cat << EOF > /etc/systemd/system/kubelet.service.d/9999-extended-kubelet-args.conf
[Service]
Environment='EXTENDED_KUBELET_ARGS=--cgroup-driver=systemd'
EOF
systemctl daemon-reload
I reckon the solutions presented here are not "managed service like". Since I pay 0.10 USD per hour per running cluster, I would expect not to have to mess around with the kubelet config.
As a customer, I would like to add a flag to my node_group so that EKS understands that all new nodes (and current ones) should have taints applied automatically.
As a customer, I would like to use Terraform to provision the taint capability.
Another comment: this issue is the second most upvoted request in the container roadmap. I hope AWS is not spending time on some "useless" features I see on the roadmap.
@EvertonSA I have been after this for about two years now, and I have solved it two ways for my customers:
We have settled on option 1, so you will be able to specify zero as a minimum when creating a node group. For an update on implementation, we have decided to do the work in Cluster Autoscaler itself to pull labels, taints, and extended resources directly from the managed node groups API, rather than tagging the underlying ASG. You can follow the progress here
@mikestef9 as #724 is coming to a conclusion on managed node groups scaling and mentioning taints, will this topic come to a conclusion too?
@lwimmer this stopped working for me since yesterday. Any new group, or new node in a group, that uses a launch template is left in a 'Not ready' state with the network not initialized. Groups without launch templates are fine. Any idea?
@EvertonSA 🥳
Guys,
I found the same implementation in the AWS Terraform workshop: https://github.com/aws-samples/terraform-eks-code/blob/master/extra/nodeg2/user_data.tf
Having said that, I am hoping to simply pass a flag as a Terraform parameter rather than go this roundabout way, and not have to worry that the feature stops working again in the next release, just like the eksctl preBootstrapCommands did.
I don't know how difficult it would be to define this and plug the value in, just like a Helm chart value, for Terraform or eksctl to implement. We, the community, can twist and turn to provide a solution, but overall I think all these reasonable production-ready features should be built in, so that all of us can extend the functionality properly. That has to be answered by AWS and committed to; if it is wiped out again in the next release, why should anyone waste time doing it?
Cheers.
Hi, this has been merged and it seems to still work with the official AMI.
I have just tested with the following configuration:
node_groups = {
  "default-${local.aws_region}a" = {
    ami_type         = "AL2_ARM_64"
    desired_capacity = 1
    max_capacity     = 3
    min_capacity     = 1
    instance_types   = ["t4g.large"]
    subnets          = [dependency.vpc.outputs.private_subnets[0]]
    disk_size        = 20
  }
  "default-${local.aws_region}b" = {
    ami_type         = "AL2_ARM_64"
    desired_capacity = 1
    max_capacity     = 3
    min_capacity     = 1
    instance_types   = ["t4g.large"]
    subnets          = [dependency.vpc.outputs.private_subnets[1]]
    disk_size        = 20
  }
  "default-${local.aws_region}c" = {
    ami_type               = "AL2_ARM_64"
    create_launch_template = true
    desired_capacity       = 1
    max_capacity           = 3
    min_capacity           = 1
    instance_types         = ["t4g.large"]
    subnets                = [dependency.vpc.outputs.private_subnets[2]]
    kubelet_extra_args     = "--node-labels=role=private --register-with-taints=dedicated=private:NoSchedule"
    disk_size              = 20
  }
  "taint-${local.aws_region}c" = {
    create_launch_template = true
    desired_capacity       = 1
    max_capacity           = 3
    min_capacity           = 1
    instance_types         = ["t3a.large"]
    subnets                = [dependency.vpc.outputs.private_subnets[2]]
    kubelet_extra_args     = "--node-labels=role=private --register-with-taints=dedicated=private:NoSchedule"
    disk_size              = 20
  }
}
And it is working as expected. This PR is based on the fix found here.
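To confirm on the cluster side (label and taint values taken from the config above), something like:

# Sketch: the tainted groups' nodes should carry the role=private label
# and the dedicated=private:NoSchedule taint.
kubectl get nodes -l role=private \
  -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints[*].key'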
Hi, this has been merged and it seems to still work with official AMI.
Still not working for me, even with create_launch_template set to true :(
Hi, this has been merged and it seems to still work with official AMI.
Thanks for the PR, I will try it out. Your solution is very elegant. Unfortunately, some of our already running environments are provisioned using raw Terraform resources. I have no clue how much effort it would take to migrate to the terraform-aws-eks module. I might give it a shot on our development environments in the next few weeks.
Although I strongly support your development, I still think taints should be accepted here: https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/eks_node_group. What happens if I open a ticket with Mr. Bezos and he replies that they won't take the incident any further because I'm using a community module (that modifies the kubelet default behavior) instead of the official product API?
I really don't know what the implications of using community modules are, but according to the documentation, only the Business and Enterprise Support plans include "Third-party software support". So #AWS, if I encourage my team to fully migrate to this module, will I have issues with your definition of "Third-party software support"? Does "Third-party software support" include kubelet default behavior modifications?
We eagerly await a response.
I could not find a definition for "Third-party software support" other than:
Third-party software support – Help with Amazon Elastic Compute Cloud (Amazon EC2) instance operating systems and configuration. Also, help with the performance of the most popular third-party software components on AWS. Third-party software support isn't available for customers on Basic or Developer Support plans.
Hello guys,
I have been working on this for a couple of months (with or without Terraform). It will not work no matter how hard you try. The problem is that for EKS managed node groups, AWS plugs its own user data in behind yours: AWS creates a secondary launch template on your behalf, and the user data on the running instance comes from that new launch template.
You can verify this on your EC2 node, then you will know what I am talking about. You can also compare your launch template in the AWS console with the launch template on the running instance.
# ssh [your_eks_node]
$ curl http://169.254.169.254/latest/user-data
You can also manually view the launch template (sorted by most recent date).
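For example, one way to see which launch template the node group's ASG is actually using (cluster and node group names are placeholders):

# Sketch: find the ASG created for the managed node group, then inspect it;
# the LaunchTemplate / MixedInstancesPolicy section shows the template in use.
aws eks describe-nodegroup --cluster-name my-cluster --nodegroup-name my-ng \
  --query 'nodegroup.resources.autoScalingGroups[0].name' --output text
aws autoscaling describe-auto-scaling-groups --auto-scaling-group-names <asg-name>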
You can still do it, but the node group creation ends in "NodeCreationFailure" status after waiting 20 minutes on each try.
Cheers, Cheng Lim
So #AWS, if I encourage my team to fully migrate to this module, will I have issues with your definition of "Third-party software support"?
I honestly do not know about the support, but using a custom launch template is supposed to be supported on AWS; if you have support and are using the official AMI, I do not see why you would lose it. I guess the same thing could apply to people using a custom AMI that AWS has no way to verify: do they also lose support?
@teochenglim Not sure what you are referring to, but providing user data in a managed node group launch template works fine and is merged into the EKS-created launch template.
edit: What I meant to say is yes, there is a known working workaround, but this issue is about support in the AWS API, not a support forum for Terraform etc. I also missed the Coming Soon status of this issue :+1:
~The fact that this can be used as a workaround to add taints by modifying the EKS bootstrap should in my view obviously not be considered a solution. I don't know how this is even missing in EKS when taints have been a Kubernetes feature for a long time. In Azure managed node pools it has pretty much always been supported~
Hi ArchiFleKs,
My 2 cents: if everything needs to be customized, why EKS? We might as well run on-prem Kubernetes.
Yes, custom launch templates are an option that is supported on AWS now, but they have a bug.
And to be fair, people are mixing everything together now. Some are talking about the Terraform module, some about eksctl, some about custom or managed node groups. And you are talking about the official AMI?
Based on my simple troubleshooting, an extra launch template is created and your managed node group points to that. The behaviour is the same whether you use Terraform or do it manually in the AWS console. I have yet to try eksctl, but why should I, since I am no longer using it?
@teochenglim Not sure what you are referring to, but providing user data in managed nodegroup launch template works fine and is merged in the EKS created launch template.
The fact that this can be used as workaround to add taints by modifying eks bootstrap should in my view obviously not be considered solution. I don't know how this is even missing in EKS when taints has been Kubernetes feature since long time ago. In Azure managed node pools it has pretty much always been supported:
az aks nodepool add \
    --resource-group myResourceGroup \
    --cluster-name myAKSCluster \
    --name taintnp \
    --node-count 1 \
    --node-taints sku=gpu:NoSchedule \
I tried it today and it doesn't work for me. Can you show me your working version? This is EKS, why are you showing AKS? BTW we are creating EKS using Terraform; we can't add node groups using eksctl.
Given this has gone from "We're Working On It" to "Coming Soon", presumably it's mostly done, and is being tested/validated/integrated, so "AWS sucks, everyone else has had this forever" isn't really a useful contribution.
Workarounds in the meantime are a useful contribution, I think, but support questions about them do generate a bit of noise in this ticket. Is there a terraform-specific place to debug the terraform-based workaround instead, so this ticket can remain focussed on the Managed Node Groups API for this, and maybe just catalog the workarounds (all using custom launch templates now?).
If custom launch templates aren't working correctly, that's not really a "here" thing either. #585 would be closer, but this isn't really a support forum anyway, so you may not have much luck there.
Hi ArchiFleKs, my 2 cents: if everything needs to be customized, why EKS? We might as well run on-prem Kubernetes.
You still need tools to orchestrate your infrastructure, whether it is managed or not, even if you do it by hand with the AWS console or the AWS CLI, CloudFormation, Terraform or eksctl.
I agree that the AWS EKS managed node group API should expose a native taints option like it does for labels. Exposing the kubelet args lets people customize the kubelet as they wish; this allows power users to do custom configuration even with managed node groups.
Even when using the managed service, you still need to use an AMI (by official I mean this one: https://docs.aws.amazon.com/eks/latest/userguide/eks-optimized-ami.html) or you can build your own.
The behavior when building your own is different from the official one when using user data, as explained here: https://docs.aws.amazon.com/eks/latest/userguide/launch-templates.html#launch-template-user-data. There is a merge step involved with the official AMI that you do not have with a custom AMI (without that merge, the pre-bootstrap user data approach does not apply).
If you can explain your bug in more detail, maybe someone here can help. We are trying to build tools (eksctl or terraform-aws-eks) to abstract this part for the user (just like a managed service does).
Personally I'm using the terraform-aws-eks module; this feature has just been released and is working, at least with the official AMI. I have not tested with a custom AMI.
Let me know if I can help you with this.
Still not working for me, even with create_launch_template set to true :(
Are you using the master version of the module? The latest release with this PR only came out today: https://github.com/terraform-aws-modules/terraform-aws-eks/releases/tag/v15.2.0
Oh, I thought it was included in version 15.1.0, I'll try with version 15.2.0 then, thanks! :)
I honestly do not know about the support, but using a custom launch template is supposed to be supported on AWS; if you have support and are using the official AMI, I do not see why you would lose it.
yes, I had my ticket dropped a few years ago.
Some features took 15 days to go from Coming Soon to Shipped. Other features took months. How long should I wait? Does it make sense to use community Terraform workarounds if we are now on Coming Soon?
@TBBle "so "AWS sucks, everyone else has had this forever" isn't really a useful contribution." I totally disagree. As a product owner, I think this is a REALLY useful contribution to my product.
How long should I wait? Does it make sense to use community Terraform workarounds if we are now on Coming Soon?
That depends on your needs and priorities. If you need a terraform deployment today, then you can't wait, so don't wait. If you are just tracking this as a blocker for migrating to Managed Node Groups, and are happy with self/un-managed Node Groups in the meantime, then waiting is fine. (I'm in the latter boat, but it's not the only "migration-blocking" feature I'm tracking, and really only applies to the "next cluster" I build, since existing clusters work now)
As for the other part, since you stripped the context of my quote, including the important part, I'll requote it
presumably it's mostly done, and is being tested/validated/integrated, so "AWS sucks, everyone else has had this forever" isn't really a useful contribution.
Leaving aside the toxic phrasing of this feedback, "AWS sucks, everyone else has had this forever" tells a Product Owner nothing about a feature which is already in the delivery pipeline. That sort of information is more useful when deciding if and where to prioritise a feature, or if the PO has (for whatever reason) never looked at their competition's offerings.
Once it's at the stage of the pipeline I presumed it to be at, it's very unlikely that someone is going to slap their forehead and say "Oh! We should just ship that, instead of sitting on the ready-to-go feature in order to feast on the tears of our users" (or whatever reaction one expects from such comments).
This is by far the most 👍'd feature request in the Coming Soon bucket (by a multiple of 5 over its next-closest), and I certainly assume that the person/people managing this backlog can count.
Tell us about your request Add support for tainting nodes through managed node groups API
Which service(s) is this request for? EKS
Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard? Managed nodes supports adding Kubernetes labels as part of node group creation. This makes it easy for all nodes in a node group to have consistent labels. However, taints are not supported through the API.
Are you currently working around this issue? Manual kubectl commands after new nodes in node group come up.
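For reference, the manual workaround is a per-node command along these lines (node name and taint are placeholders):

kubectl taint nodes <node-name> dedicated=example:NoSchedule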