aws / karpenter-provider-aws

Karpenter is a Kubernetes Node Autoscaler built for flexibility, performance, and simplicity.
https://karpenter.sh
Apache License 2.0

Dynamic tags support based on resource annotation #2783

Open AlonAvrahami opened 2 years ago

AlonAvrahami commented 2 years ago

Tell us about your request

Today Karpenter supports defining tags only via the Provisioner configuration (spec.tags) or the AWSNodeTemplate configuration (spec.tags). When running many types of application workloads, it would be awesome to be able to set tags on the EC2 instance (and all other relevant resources) dynamically by providing the relevant tags as a resource annotation. When such an annotation is provided, Karpenter would merge its tags with the tags from both the Provisioner and AWSNodeTemplate configurations.
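
To make the requested merge concrete, here is a minimal sketch in Python. It is only an illustration: the function name `merge_tags` and the sample tag values are hypothetical, and the precedence shown (annotation tags override Provisioner tags, which override AWSNodeTemplate tags) is an assumption, since the request does not say which source should win on a conflict.

```python
from typing import Dict, Mapping


def merge_tags(*sources: Mapping[str, str]) -> Dict[str, str]:
    """Merge tag maps left to right; a later source wins on a key conflict."""
    merged: Dict[str, str] = {}
    for source in sources:
        merged.update(source)
    return merged


# Hypothetical inputs, lowest precedence first (the ordering is an assumption).
node_template_tags = {"env": "prod", "cost-center": "1234"}
provisioner_tags = {"env": "staging", "team": "ml-platform"}
annotation_tags = {"application": "app-a"}

tags = merge_tags(node_template_tags, provisioner_tags, annotation_tags)
```

Passing the sources in a different order is enough to change which one wins, so the precedence question stays a one-line decision.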

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?

Instead of defining multiple Provisioners with different tags (per application, for example), it would be awesome to add support for dynamic tags via a resource annotation. With this feature I could use fewer Provisioners and get better visibility and control over resource allocation and usage, mainly for cost analysis (for example, when you run an application and want to analyze how much it costs to run that specific application).

Are you currently working around this issue?

Currently I'm using Terraform to define all of the Karpenter Provisioners, so I'm creating many Provisioners for different applications where the only difference between them is the resource tags.

jonathan-innis commented 2 years ago

Do you have any details on exactly which tags differ per-provisioner, with an example, as well as what you had in mind for how we might enable these tags to be generated dynamically?

AlonAvrahami commented 2 years ago

Do you have any details on exactly which tags differ per-provisioner, with an example, as well as what you had in mind for how we might enable these tags to be generated dynamically?

Yes. For example, if I want to request a new instance in a specific region, I want it to add the relevant region tag to the EC2 instance. A more specific use case could be an ML application that gets a batch of tasks, and I want it to add a tag for this specific ML application in order to calculate how many resources were used to perform this batch of tasks.

Currently I'm trying to use one provisioner that will provision nodes for all types of ML applications. If an application is using a specific node, I would like to add a tag marking that node X is being used by application A, so in the end I can add a tag: Application A = True.

In any case, I'm also trying to install KubeCost, which should help with cost visibility.

jonathan-innis commented 2 years ago

This is interesting. So to distill this, you would basically like to let the pods/deployments/etc. determine what the tags of the node are with some kind of nodeSelector or nodeAffinity, rather than having this determined purely at the Provisioner level.

Maybe something like tags.karpenter.k8s.aws/tag1: value1 modeled as a requirement that converts from a Kubernetes label to an AWS-based tag.
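
A rough sketch of that label-to-tag conversion, in Python for brevity. The prefix is the one floated above and is not an existing Karpenter API; the function name `tags_from_node_selector` and the sample selector are hypothetical.

```python
from typing import Dict, Mapping

# Prefix proposed in this comment; assumed for illustration, not a real Karpenter label domain.
TAG_PREFIX = "tags.karpenter.k8s.aws/"


def tags_from_node_selector(node_selector: Mapping[str, str]) -> Dict[str, str]:
    """Turn prefixed nodeSelector entries into AWS tag key/value pairs."""
    tags: Dict[str, str] = {}
    for key, value in node_selector.items():
        # Skip unrelated labels and a bare prefix with no tag name after it.
        if key.startswith(TAG_PREFIX) and len(key) > len(TAG_PREFIX):
            tags[key[len(TAG_PREFIX):]] = value
    return tags


# Hypothetical pod nodeSelector mixing a tag request with an ordinary label.
selector = {
    "tags.karpenter.k8s.aws/application": "app-a",
    "kubernetes.io/arch": "amd64",
}
requested_tags = tags_from_node_selector(selector)
```

One caveat with this shape: Kubernetes label values are limited to a narrower character set and length than AWS tag values, so not every valid tag could be expressed this way.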

jonathan-innis commented 2 years ago

Currently, the delineation for multi-tenant scenarios is based on Provisioners, meaning that a Provisioner defines a boundary (or a world) of potentially provisioned nodes for some entity (team, cost center, etc.).

Currently I'm trying to use one provisioner that will provision nodes for all types of ML applications

What's the reason you want to distill this all down to a single provisioner?

jonathan-innis commented 1 year ago

A more specific use case could be an ML application that gets a batch of tasks, and I want it to add a tag for this specific ML application in order to calculate how many resources were used to perform this batch of tasks.

If you have a specific team that is responsible for the workloads/nodes, why not section this off through the Provisioner's taints/requirements? That would mean a separate provisioner per team, with the tags controlled at that level rather than at the workload level.

For one, it's not particularly clear how workload-based tagging would work since it would be possible for workloads from one team to share a node with workloads for another team, in which case, the tagging becomes less meaningful.

yaroslav-nakonechnikov commented 1 year ago

I can add a scenario where adding some "dynamic" tags would be useful.

We have a setup with an application that can be clustered. We need to make sure that all running nodes are the same, and a single provisioner solves that perfectly.

But on these nodes we run pods with the application, each of which is part of the cluster. The names look like appname-site1-approle, and there can be several sites, depending on how many replications are needed.

So in our case it would help the operations team a lot to see all nodes with site1 or site2. At the moment, to find that out they need to log in to some control plane and run kubectl commands, which we would like to avoid.

So we thought about adding tags based on some pod fields, to show what is being added to the instance.

Yes, we can achieve this with several provisioners, but in that case we end up with lots of duplicated code.

jonathan-innis commented 1 year ago

I can add a scenario where adding some "dynamic" tags would be useful

The biggest issue with dynamic tags based on applications is that Karpenter doesn't do the binding to nodes, so we can't guarantee that the pods we expect to schedule onto the nodes we launch will actually land there. Effectively, if we tagged at creation in this manner, there is a large chance we would be wrong. On top of this, pods can be pre-empted and moved around the cluster, which would definitely break this tagging behavior.

I'm wondering if a better solution might be a separate "tagging controller" from Karpenter that monitors applications and modifies their tags in EC2 after the instance is launched and pods are bound.
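
To sketch what the core of such a controller might look like, here is a rough reconcile loop in Python. Everything in it is an assumption for illustration: the `Pod` shape, the annotation prefix `example.com/ec2-tag-`, and the injected `apply_tags` callback (which in a real controller might wrap something like an EC2 CreateTags call). It only considers pods that are already bound to a node, and it does not remove stale tags when pods move or are pre-empted.

```python
from typing import Callable, Dict, Iterable, Mapping, NamedTuple, Optional

# Hypothetical annotation prefix; nothing in Karpenter or Kubernetes defines it.
TAG_ANNOTATION_PREFIX = "example.com/ec2-tag-"


class Pod(NamedTuple):
    name: str
    node_name: Optional[str]  # None until the scheduler has bound the pod
    annotations: Dict[str, str]


def desired_tags(pods: Iterable[Pod]) -> Dict[str, Dict[str, str]]:
    """Collect the tags requested by bound pods, grouped by node name."""
    by_node: Dict[str, Dict[str, str]] = {}
    for pod in pods:
        if pod.node_name is None:
            continue  # unbound pods say nothing reliable about any node
        node_tags = by_node.setdefault(pod.node_name, {})
        for key, value in pod.annotations.items():
            if key.startswith(TAG_ANNOTATION_PREFIX):
                node_tags[key[len(TAG_ANNOTATION_PREFIX):]] = value
    return by_node


def reconcile(
    pods: Iterable[Pod],
    current_tags: Mapping[str, Mapping[str, str]],
    apply_tags: Callable[[str, Dict[str, str]], None],
) -> None:
    """Apply only the tags that are missing or different on each node."""
    for node, wanted in desired_tags(pods).items():
        existing = current_tags.get(node, {})
        missing = {k: v for k, v in wanted.items() if existing.get(k) != v}
        if missing:
            apply_tags(node, missing)
```

Because tags are derived from pods that are actually bound, a node shared by two applications simply ends up with both applications' tags, rather than a guess made at launch time.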

gaboc623 commented 1 year ago

Facing a similar situation where having many provisioners for specific tagging is resulting in the controller slowing down. The idea of a "tagging controller" sounds like it would be useful for this.