bhops opened this issue 4 years ago
@daenney will EKS terraform module be also updated to fix this? https://registry.terraform.io/modules/terraform-aws-modules/eks/aws/latest
@morsik No idea, I don't use that module and am not one of the maintainers. I'd suggest raising that on their issue tracker: https://github.com/terraform-aws-modules/terraform-aws-eks/issues.
Passing the managed node group tags into the launch template's "Instance tags" would automatically apply them to both the EC2 instances and their volumes. If there are challenges doing that, creating a separate "Custom Tags" section in the EKS managed node group configuration page would also be helpful.
Can you please help identify where the "Custom Tags" section exists?
You haven't mentioned what tool you're using, if any, but at the EC2 API level, that was probably referring to `TagSpecifications` in the `RequestLaunchTemplateData` object used with the `CreateLaunchTemplate` and `CreateLaunchTemplateVersion` APIs.
That's what's used for terraform-provider-aws's launch template implementation, and eksctl's managed and unmanaged node group implementations.
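For anyone using the Terraform provider directly, this maps to the `tag_specifications` blocks on `aws_launch_template`. A minimal sketch, with the resource name and tag values purely illustrative:

```hcl
# Tags applied via the launch template reach the resources EKS
# launches from it; instance and volume tags are set separately.
resource "aws_launch_template" "eks_nodes" {
  name_prefix = "eks-nodes-"

  tag_specifications {
    resource_type = "instance"
    tags = {
      Team = "platform"
    }
  }

  tag_specifications {
    resource_type = "volume"
    tags = {
      Team = "platform"
    }
  }
}
```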
The issue with `aws_autoscaling_group_tag` is that the ASGs have to exist before one can loop over them. So one can't write modules that create node groups and tag the auto scaling groups in one go; it will always require a `--target` apply due to a Terraform limitation.
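To illustrate that limitation with a minimal sketch (resource names are hypothetical): the `for_each` keys below derive from attributes that only exist after the node group has been created, so a fresh plan cannot be computed in one pass.

```hcl
# Sketch only: tagging the ASGs behind a managed node group.
# The for_each keys are unknown until aws_eks_node_group.example
# exists, so on a fresh state you must first run:
#   terraform apply -target=aws_eks_node_group.example
resource "aws_autoscaling_group_tag" "example" {
  for_each = toset([
    for asg in flatten(
      [for r in aws_eks_node_group.example.resources : r.autoscaling_groups]
    ) : asg.name
  ])

  autoscaling_group_name = each.value

  tag {
    key                 = "team"
    value               = "platform"
    propagate_at_launch = true
  }
}
```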
The only viable approach is to:
I understand from the thread that option 2 is preferred. But I can't find any PR or issue regarding this in the CA. Is AWS working on this?
@pst I've asked a similar question in https://github.com/aws/containers-roadmap/issues/724.
Why is this issue controversial? Many companies need tags on ASGs for cost and compliance reasons.
Same here. ASG tagging is mandatory in my company for cost and compliance.
I find it useful to use the launch template configuration of the Terraform EKS module. This way I'm able to tag the underlying instances. Tagging the instances is, in my case, more valuable than tagging the ASG itself.
See TF example here.
@newb1e I'll say it again: I don't want to monitor EC2, I want to monitor the AutoscalingGroup itself. https://github.com/aws/containers-roadmap/issues/608#issuecomment-862017125
Is this even still being looked at and worked on? For us ASG tagging is a compliance issue. I know there are workarounds using lambda and such, however our enterprise has a ton of accounts and having this be the default behavior of nodegroups would make our lives so much simpler. Doesn't seem like a big ask.
It's almost two years for just setting a simple tag... which turns out to be critical for compliance, security, scaling node groups...
I don't think it's even been "worked on", it's still in the 'Researching' stage, and probably lost a lot of momentum when it was thought that the value was only for cluster-autoscaler, or for tags on ASGs only to propagate to nodes, both of which are being addressed elsewhere. Or at least that's my impression from https://github.com/aws/containers-roadmap/issues/608#issuecomment-675101463.
Disappointingly, since @mikestef9 asked in that comment a year ago for more ASG-tagging use-cases, we haven't heard any follow-up (or even acknowledgement) on the various use-cases that have been shared here.
@TBBle there are a lot of ASG tagging use cases in this thread, before the @mikestef9 comment, that launch template tagging won't address; there are also a lot of ASG tagging use cases called out in #724 when discussing the tags for CA.
Looking back, the only comment I can see that requested something other than node-tagging via tag propagation, or Cluster Autoscaler tags for scale-from-zero, was https://github.com/aws/containers-roadmap/issues/608#issuecomment-670641308 10 days before @mikestef9's comment, and well after the development work that delivered Managed Node Group Launch Template support would have happened to address the needs called out in this ticket.
A quick skim doesn't show anything in #724 up to that date that isn't specifically about CA scale-from-zero, and if there is such a thing, overlooking it in the context of the rest of the discussion would be pretty easy.
Even the original feature request of this ticket was tagging ASGs for propagation to nodes and volumes.
@TBBle I was sure I'd read a number of requests about compliance policies requiring ASGs to be tagged. Both of these issue threads are so long that it's hard to parse the data out of them, but that's the use case I'd be putting forward.
I'm still of the opinion that something as trivial as cascading tags from MNGs to ASGs should have been implemented as part of the MVP. If there is a better way to handle the same use cases it needs to actually exist and then it can be adopted by virtue and not as the only solution (or no solution).
The threads are long, but the vast majority of activity was after the comment from @mikestef9 that we were talking about, in August 2020. It's about 20 comments in, compared to the fortyish that have come after that. And #724 spent most of its time before August 2020 talking about actually setting minSize to 0 (and directly modifying the ASG to work around this); the tagging discussion only came up there once, in July 2020, until people were referred to there from here by that comment.
Anyway, looking at "should have been" isn't very valuable here. This is containers-roadmap, not containers-if-we-could-turn-back-time. What's interesting is what is going to be done; currently, that's an empty space, and the rest of the discussion just makes it more likely that the actual use-cases will be lost in the noise and nothing will continue to happen. -_-
(If I could turn back time on this ticket, I'd have opened a different ticket for tagging ASGs for their own sake back when I subscribed to this ticket; at the time I didn't realise that the node-group-tag-propagation part of the request was going to be the need that was fulfilled. Similarly I regret not leaving a comment for my at-the-time use-case way back then too)
@TBBle @stevehipwell I created a separate issue to track the request for ASG tagging (for compliance and cost reasons). This is quite important for many organizations to adopt EKS.
A few concrete needs:
- The EBS CSI controller needs the `topology.ebs.csi.aws.com/zone` tag: PVs are created with a `nodeAffinity` that expects the nodes to have `topology.ebs.csi.aws.com/zone`, and when an ASG has 0 instances, Cluster Autoscaler needs the ASG itself to carry those tags.
- Node groups manage ASGs, so node groups should be able to tag ASGs.
- aws-node-termination-handler in Queue mode needs ASGs to be tagged with `Key=aws-node-termination-handler/managed`.

The ability to propagate tags from an MNG to its ASG would make this easier for users.
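For the Cluster Autoscaler scale-from-zero case, the zone label hint can be supplied as a node-template tag on the ASG today. A hedged sketch; the ASG name and zone are illustrative, and since CA reads the tag from the ASG itself, it need not propagate to instances:

```hcl
# Hint for Cluster Autoscaler when the ASG sits at 0 instances:
# tells CA that nodes from this group will carry the CSI zone label.
resource "aws_autoscaling_group_tag" "csi_zone_hint" {
  autoscaling_group_name = "eks-workers-1a" # illustrative

  tag {
    key                 = "k8s.io/cluster-autoscaler/node-template/label/topology.ebs.csi.aws.com/zone"
    value               = "us-east-1a" # illustrative
    propagate_at_launch = false
  }
}
```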
@nkk1 There's a proposal/discussion for Cluster Autoscaler to assume/populate `topology.ebs.csi.aws.com/zone` on the scale-from-zero template using the AWS backend. Since the label is auto-added by the CSI controller, I think that'll be less surprising for the user, and frankly it will probably land sooner than ASG user-defined tagging support. If CA doesn't accept that proposal, then user ASG tagging will be the only way to make scale-from-zero work correctly with Managed Node Groups and the EBS CSI controller, since the effort to migrate away from that label seems to have failed.
@askulkarni2 I think the guidance is to not use NTH with MNGs, @bwagner5 can probably shed a bit more light here.
@stevehipwell @askulkarni2 That is correct. NTH is not needed when using managed node groups for handling terminations. MNG already gracefully handle terminations due to ASG scale-down, spot interruptions, and capacity rebalance.
@bwagner5 fantastic! Thanks for the insight. And thanks @stevehipwell for bringing it to attention.
That's really crucial for several reasons; the following is one of the issues we're experiencing due to this limitation: running dedicated node groups for specific workloads requires a couple of `cluster-autoscaler` tags for properly scaling node groups based on labels and taints. So I really hope there will be a proper solution instead of some workarounds.
This is a really important feature!
Let's say we need to tag the ASG to make Cluster Autoscaler work like a charm, or to track cost, resource ownership, ...
```hcl
resource "aws_autoscaling_group_tag" "tag_cpu_ng" {
  autoscaling_group_name = aws_eks_node_group.cpu_ng.resources[0].autoscaling_groups[0].name

  tag {
    key                 = "k8s.io/cluster-autoscaler/node-template/taint/X"
    value               = "NoSchedule"
    propagate_at_launch = true
  }
}
```
And using CloudFormation?
Managed ASG tagging is now implemented in eksctl with https://github.com/weaveworks/eksctl/pull/5002 and should land in the next release.
I am waiting for this feature to be natively supported by AWS. While eksctl is an excellent solution, it is not suitable for use in the context of Infrastructure as Code; continued support is needed for this to be available in solutions such as CloudFormation and Terraform.
@andre-lx's answer is spot on. Using the `aws_autoscaling_group_tag` resource worked for me, but only for new nodes, so I cycled out my existing nodes one by one and the new nodes were all tagged as they should be. For instance, this is my setup for creating a node group and an `aws_autoscaling_group_tag` that sets the "Name" tag which shows up in EC2.
```hcl
resource "aws_eks_node_group" "nodes_group" {
  cluster_name  = aws_eks_cluster.eks_cluster.name
  node_role_arn = aws_iam_role.eks_assume_role.arn
  subnet_ids    = var.subnet_ids

  ###########
  # Optional
  ami_type        = "AL2_x86_64"
  disk_size       = 60
  instance_types  = ["m6i.xlarge"]
  node_group_name = "worker"
  version         = var.kubenetes_version

  scaling_config {
    desired_size = 2
    max_size     = 4
    min_size     = 1
  }

  update_config {
    max_unavailable = 2
  }

  # Ensure that IAM Role permissions are created before and deleted after EKS Node Group handling.
  # Otherwise, EKS will not be able to properly delete EC2 Instances and Elastic Network Interfaces.
  depends_on = [
    aws_iam_role_policy_attachment.EKS-AmazonEKSWorkerNodePolicy,
    aws_iam_role_policy_attachment.EKS-AmazonEKS_CNI_Policy,
    aws_iam_role_policy_attachment.EKS-AmazonEC2ContainerRegistryReadOnly,
  ]
}

# EKS can't directly set the "Name" tag, so we use the autoscaling_group_tag resource.
resource "aws_autoscaling_group_tag" "nodes_group" {
  for_each = toset(
    [for asg in flatten(
      [for resources in aws_eks_node_group.nodes_group.resources : resources.autoscaling_groups]
    ) : asg.name]
  )

  autoscaling_group_name = each.value

  tag {
    key                 = "Name"
    value               = "eks_node_group"
    propagate_at_launch = true
  }
}
```
In all honesty, Terraform should at least explicitly state in the documentation that the tags argument doesn't work for setting the "Name" tag, as that is a key tag that lots of companies use to organize instances and manage billing. Personally I think not having the tags parameter set the "Name" tag is a bug, but I'd at least settle for better documentation that describes this workaround.
Can you at least update the documentation so other people don't have to waste as much time on this?
Pretty please?
Guys, the resource responsible for EC2 tags is the "Resource Tags" section of the Launch Template. I have the same problem, and I have verified that it's not possible to add tags to the Launch Template automatically generated by Terraform. To set the tags on this resource, we need to provision our own Launch Template.
However, I managed to at least tag EC2 with the "Default tags" from my repository. Here is the code used:
```hcl
data "aws_default_tags" "default_tags" {}

# Add 1 tag to 1 or more node groups
resource "aws_autoscaling_group_tag" "tag_aws_node_termination_handler" {
  for_each = toset([
    aws_eks_node_group.node_group_name1.resources[0].autoscaling_groups[0].name,
    aws_eks_node_group.node_group_name2.resources[0].autoscaling_groups[0].name,
  ])

  autoscaling_group_name = each.value

  tag {
    key                 = "aws-node-termination-handler/managed"
    value               = "PropagateAtLaunch=true"
    propagate_at_launch = true
  }
}

# Add tags for 1 node group
resource "aws_autoscaling_group_tag" "default_tags_asg_high" {
  count = length(keys(data.aws_default_tags.default_tags.tags))

  autoscaling_group_name = aws_eks_node_group.node_group_name.resources[0].autoscaling_groups[0].name

  tag {
    key                 = keys(data.aws_default_tags.default_tags.tags)[count.index]
    value               = values(data.aws_default_tags.default_tags.tags)[count.index]
    propagate_at_launch = true
  }
}
```
I just came here to say this issue has been open for 900 days and is the 3rd most 👍'd in the project.
Just found this again in conversation with our AWS AM. Kind of a dup of #724.
As @SlevinWasAlreadyTaken mentions now available in eksctl thanks to his hard work in https://github.com/weaveworks/eksctl/pull/5002
Some bash to work around it: https://github.com/aws/containers-roadmap/issues/724#issuecomment-1077733733. There is some Terraform in that comment chain too.
Edit: I don't work for AWS, complain to your account manager. I wholeheartedly agree this is ridiculous.
Hey guys.
Here at YData we built a solution to work around this issue until the official release. We created an AWS Lambda that can tag the EKS node groups' ASGs with common tags, or with specific tags per node group, using CloudFormation.
This can be used to add tags for labels and taints (cluster-autoscaler) or any other tags that help with tracking cost, resource ownership, ...
Feel free to test it and give us your review.
https://github.com/ydataai/aws-asg-tags-lambda
You can use it directly in the template as:
```yaml
ASGTagLambdaFunction:
  Type: AWS::Lambda::Function
  Properties:
    Role: !GetAtt ASGTagLambdaExecutionRole.Arn
    PackageType: Image
    Code:
      ImageUri: !Ref EcrImageUri
    Architectures:
      - x86_64
    MemorySize: 1024
    Timeout: 300

ASGTagLambdaInvoke:
  Type: AWS::CloudFormation::CustomResource
  DependsOn: ASGTagLambdaFunction
  Version: "1.0"
  Properties:
    ServiceToken: !GetAtt ASGTagLambdaFunction.Arn
    StackID: !Ref AWS::StackId
    AccountID: !Ref AWS::AccountId
    Region: !Ref AWS::Region
    ClusterName: "the EKS cluster name" # !Ref EKSCluster
    CommonTags:
      - Name: "ENVIRONMENT"
        Value: "dev"
        PropagateAtLaunch: true
    NodePools:
      - Name: "system-nodepool" # !GetAtt YourNodeGroup.NodegroupName
        Tags:
          - Name: 'k8s.io/cluster-autoscaler/node-template/taint/TAINT'
            Value: 'NoSchedule'
            PropagateAtLaunch: true
          - Name: 'k8s.io/cluster-autoscaler/node-template/label/LABEL'
            Value: 'LABEL_VALUE'
            PropagateAtLaunch: true
      - Name: "another-pool"
```
> Here at YData we built a solution to solve this issue till the official release.
Each of us already has such a solution. We don't want to have to maintain one in the future, which is why we are requesting official support.
https://github.com/aws/containers-roadmap/issues/608#issuecomment-862017125
AWS has always had Auto Scaling groups, and has always had resource tags. We just want to take advantage of them; AWS doesn't need to develop anything additional. Why can't the managed service take advantage of these features? Am I making a strange request?
I wanted to share a recent launch from the EKS team that might be of interest to folks following this issue. Earlier this week we released Cluster-level Cost Allocation Tagging:
With this launch, all EC2 instances which join an EKS cluster are automatically tagged with an AWS-generated cost allocation tag [containing the EKS cluster name]. Any EC2 instance used in an EKS cluster will be tagged automatically without any additional action required, regardless of whether they are provisioned using EKS managed node groups, Karpenter, or directly via EC2. This tag can be used to allocate EC2 costs to individual EKS clusters through AWS Billing and Cost Management tools...
While this feature won't help to propagate customer-defined tags down to the EC2 instances in an EKS cluster, for those of you who are looking for better cost allocation across multiple EKS clusters, this feature will reduce the work required.
> While this feature won't help to propagate customer-defined tags down to the EC2 instances in an EKS cluster, for those of you who are looking for better cost allocation across multiple EKS clusters, this feature will reduce the work required.
Thanks for sharing this update, I am sure some will find it useful. I don't want to shoot the messenger, and I know you are just trying to help, but this really is attacking the problem from the wrong end, is it not? :man_shrugging:
EC2 instances that are not a part of a managed node group are already easily taggable with customer-defined tags and managed node groups already had this tag that would be passed through for cost-allocation. From my perspective this change does little to reduce work required if you want to use managed node groups.
I only know this because of having to write code for a previous employer to process this tag to retrieve the customer defined tag values so costs could be allocated in the same way as everything else. Luckily we had well structured cluster names that included the values required. However, it is brittle and the processing broke a few times when cluster name structure was changed for operational reasons (eg to add blue/green support to our automation for EKS upgrades).
Please consider following through on this issue (getting close to 3 years now). We always get told by our account managers, TAMs and SAs to use tags, it would be nice if tagging actually worked for cost allocation for all resources, EKS or otherwise. Thanks.
Can someone please help us understand why this is not getting any traction despite this much attention? It appears to still be in the "researching" phase. Just ran into this when trying to scale from 0 on managed node groups using the Terraform EKS module.
Honestly you should all move to Karpenter.
Another vote for karpenter
> Honestly you should all move to Karpenter.
Honestly that has to be one of the least helpful suggestions I have seen in a while. We are talking about managed node groups, which are not going to magically get tagged if you install Karpenter in the cluster and start launching nodes (you also need running nodes for Karpenter's controller to run on, which is a bit of a chicken-and-egg problem). Karpenter is a nice tool for sure, but it is not a solution to this issue.
> you also need running nodes for Karpenter's controller to run on
Customers can run Karpenter on Fargate (managed compute). This helps eliminate the bootstrapping problem. However, resource tagging is not yet available for Fargate on EKS. If Karpenter is the only thing running on Fargate, this might be acceptable for cost-allocation purposes.
Karpenter has some nice features but is a bit more complex than most people probably need, since it is geared toward clusters with larger workloads that can benefit from more advanced scheduling and continuous resource optimization. One thing I don't like is Karpenter's requirement to know specific node group info versus just using static tags with the cluster name. Thanks for the suggestion, but for those who want to continue using CA, it would be nice to see some tags on the Auto Scaling groups to solve this.
Going off topic here, but curious about this:
> One thing I don't like is Karpenter's requirement to know specific node group info vs just using static tags with the cluster name.

What do you mean? Karpenter doesn't care about anything other than pending pods to be allocated. Or you can go overboard and create very scoped provisioners per team or deployment label.
I might be wrong, but I was just looking at their install documentation and noticed they want `--set aws.defaultInstanceProfile=KarpenterNodeInstanceProfile-${CLUSTER_NAME}` in the Helm chart; I'm guessing maybe there is a way to just use IRSA instead. I also see there is a manifest applied after you run the Helm chart, which is a bit odd, but these are just installation-related issues, and off topic as you said.
@flowinh2o I think `aws.defaultInstanceProfile` is the instance profile for the nodes being created so they can join the cluster.
> Running dedicated node groups for specific workloads requires a couple of cluster-autoscaler tags for properly scaling node groups based on labels and taints.
Exactly our problem also. Details here.
This looks like such a must-have feature, and one that's super simple to implement if node group tags could just be propagated to the ASG...
Any updates on this? Also needed here
Also having a problem with this. I wouldn't expect, with a service called Managed Node Groups, to have to work around tag propagation issues. It seems vendors such as weaveworks have implemented their own workarounds, sadly there is no such workaround in terraform.
This issue is pretty fundamental; would like it fixed please :)
> It seems vendors such as weaveworks have implemented their own workarounds, sadly there is no such workaround in terraform.
@RogerWatkins-Anaplan it's easy enough to do this with Terraform and I think there are links in some of the comments above on how to do it. That said you wouldn't expect to need to do this for a first party vendor solution.
Is anyone working on this? This is a must-have for us to implement the project.
Tell us about your request
It would be great if we could pass tags to the underlying ASGs (and tell the ASGs to propagate tags) that are created from the managed node groups for EKS, so that the underlying instances/volumes are tagged appropriately.

Which service(s) is this request for?
EKS

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
Currently, managed node groups for EKS do not support propagating tags to the ASGs (and therefore the instances) created by the node group. This leads to EC2 instances that are not tagged according to our requirements for tracking cost and resource ownership.
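The gap described in this request can be seen in Terraform terms. A sketch, with all names and values illustrative: the `tags` argument on `aws_eks_node_group` tags only the node group resource itself.

```hcl
# Sketch of today's behavior, not a fix.
resource "aws_eks_node_group" "example" {
  cluster_name    = "my-cluster"                                    # illustrative
  node_group_name = "example"
  node_role_arn   = "arn:aws:iam::123456789012:role/eks-node-role"  # illustrative
  subnet_ids      = ["subnet-abc123"]                               # illustrative

  scaling_config {
    desired_size = 1
    max_size     = 2
    min_size     = 1
  }

  # These tags land on the EKS node group resource only.
  # They are NOT propagated to the underlying ASG, its
  # instances, or their volumes, which is what this issue asks for.
  tags = {
    CostCenter = "1234"
  }
}
```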