hendrikhalkow opened this issue 3 years ago
@mikestef9 This proposal applies especially to nodes that are managed outside EKS, sometimes called unmanaged nodes.
By way of a visual example, here's the node-level view of a simple EKS cluster with 3 distinct node groups (db, logs, app), as seen in Lens.
Having the ability to set role labels to some meaningful value at the time the instance is provisioned would dramatically increase the usability of these tools; everything helps to reduce the time to resolution.
@adeatpolte
Here is my workaround; I don't think it's possible to set these in EKS in any other way. It looks like this:
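Roughly, the approach is to apply the label after the node has joined, using a credential that has cluster access rather than the kubelet itself. A minimal sketch, with an illustrative node-group selector and role name:

```bash
# Minimal sketch: label nodes after they have joined, using a credential with
# cluster access (NodeRestriction only blocks the kubelet's own credential).
# The node-group selector and the role name "db_role" are illustrative.
for node in $(kubectl get nodes -l eks.amazonaws.com/nodegroup=db -o name); do
  kubectl label "${node}" node-role.kubernetes.io/db_role="" --overwrite
done
```

The catch, as discussed below, is that this only runs after the node (and any DaemonSet pods on it) already exist.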
Thanks for the workaround suggestion. I'll have to give this a shot and see if it causes me any issues; I was thinking of trying something similar, but at least I have a starting point now if I do.
I definitely think this is kind of needed, and the issue is still open, so maybe we can make some progress. I don't mind if it's implemented securely in some way, of course, but in my case I didn't even get an error and everything kinda broke:
I was able to set the node role labels using the AWS EKS Terraform module for both self-managed and EKS-managed node groups:
self_managed_node_groups = {
  my_self_node_group = {
    bootstrap_extra_args = "--node-labels=node-role.kubernetes.io/db_role=db_role"
  }
}

eks_managed_node_groups = {
  my_eks_node_group = {
    labels = {
      "node-role.kubernetes.io/other_role" = "other_role"
    }
  }
}
And it did show up after applying the configuration, and there weren't initially any issues; however, any nodes that were recreated subsequently were not able to join the cluster, nor was I able to pin anything down in the logs saying why.
I can't say for certain yet whether that's just a result of having it set before nodes join or if it would affect anything if only set after.
My main suspicion was that certain required add-ons I'm using, like CoreDNS or VPC CNI, were no longer able to start on the nodes they should have due to the role label. However, I did try specifying affinity or node selectors on the add-ons that seemingly allowed those options (found by scraping aws eks describe-addon-configuration --addon-name <name> --addon-version <version> --query 'configurationSchema' --output text | jq), and after some fudging to make sure they were happy with the injected config and were seemingly actually following it, I still wasn't able to get nodes to join.
This happened with both self-managed and EKS-managed nodes using the above-mentioned method of configuring the labels.
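For anyone who wants to run the same check, the schema query looks roughly like this once cleaned up; the add-on name and version are placeholders:

```bash
# Inspect an add-on's configuration schema and see whether it exposes
# scheduling-related options. <name> and <version> are placeholders.
aws eks describe-addon-configuration \
  --addon-name <name> \
  --addon-version <version> \
  --query 'configurationSchema' \
  --output text | jq . | grep -E 'affinity|nodeSelector|tolerations'
```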
It took me a long time to figure out what the issue even was, as after initially applying the config I could see the roles and everything was fine. I then went on to upgrade EKS to 1.27 (with my self-managed nodes staying on Ubuntu's EKS image at 1.26, as that was still the latest), updated some Terraform module versions, made a few nodes a smaller type (which actually ended up having those nodes, which are the ones that host CoreDNS, hit their pod limit very early due to the IP limits of the smaller node and not having prefixes enabled), plus a few other 'everyday' changes I didn't think would really affect things. Then, bam, everything was broken. Because the roles seemed to work initially, I blamed everything else, and even after I figured it out I ran into the pod limit/IP limit issue and also had issues with nodes joining, which almost convinced me the label had nothing to do with my issues either. But I went back and double-checked the label, and nodes still wouldn't join even with affinity and node selectors set for the add-ons.
I'm glad I'm not alone in this frustration; I understand the potential security issue, but considering everything shows roles in node lists, having them all blank seems to defeat the purpose of even showing the roles. I eventually gave up for the time being, as essentially it was more an aesthetic issue than a functional one: I can still make things go where I want with a label like node.kubernetes.io/role=myrole instead, but then it doesn't show in node lists without extra work like telling kubectl to show certain labels.
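That extra work is roughly the following; the label key is just the one from my example:

```bash
# Show a custom (non node-role.kubernetes.io) label as its own column,
# since kubectl's ROLES column won't pick it up.
kubectl get nodes -L node.kubernetes.io/role
```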
Have you had any particular issues with your 'fix' yet, like nodes not joining? I'm not sure if you're using any of the AWS EKS add-ons, nor am I sure they're specifically the culprit in my situation either. I was even thinking myself of having a workaround like this that sets the node labels after they are created and joined, but I want to be sure I won't have issues otherwise.
Even then, it still isn't great that I can't use the node-role.kubernetes.io label on spin-up/deploy, because besides the aesthetics, I wanted to have certain add-ons deploy to specific nodes based on their role. Not all the EKS add-ons even have the ability to set affinity, nodeSelectors, or tolerations, but some do, and it seems to be something that's come along recently. Some add-ons need to have pods running on every node, like the VPC CNI and the node side of the EBS CSI driver (although the controllers for the EBS CSI driver don't), and that too means any of those add-ons will need some method of allowing that configuration: even with a workaround like this, where the node role is set after the nodes join (and thus likely also after the add-on's pods are deployed), if those add-ons' pods/daemonsets/services/etc. are restarted, upgraded, or scaled, they will likely break.
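For the add-ons that do expose those options, passing the scheduling config looks roughly like this; the cluster name, label key, and value are illustrative, and the accepted keys depend on each add-on's configuration schema:

```bash
# Pin an add-on's pods to nodes with a given label, for add-ons whose
# configuration schema exposes nodeSelector. Cluster name, label key,
# and value here are illustrative.
aws eks update-addon \
  --cluster-name my-cluster \
  --addon-name coredns \
  --configuration-values '{"nodeSelector": {"node.kubernetes.io/role": "myrole"}}'
```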
Thus far, I've been lucky with the order in which Terraform does everything, so that even without any specifiers my nodes always went where I wanted them to, but I'm sure that won't be the case when more node groups are added.
Checking now out of curiosity for the latest versions as of this post:
aws-ebs-csi-driver shows affinity, nodeSelector and tolerations for both its controller and its node pods.
coredns shows affinity, nodeSelector and tolerations in its config.
vpc-cni shows affinity and tolerations in its config.
adot shows affinity, nodeSelector and tolerations in its config.
aws-guardduty-agent has nothing, no config at all, even unrelated, period.
kube-proxy has nothing of the sort, though I suppose it could alternatively be deployed with a Helm chart, which would likely require extra configuration.
vpc-cni only had the extra config values added in the most recent version as well.
So, if there are no other issues, then once kube-proxy supports this as well, perhaps it'll be a reasonable workaround.
> any nodes that were recreated subsequently were not able to join the cluster nor was I able to pin anything down in the logs saying why.
I ran into this exact issue and ended up not using node roles with my EKS managed node groups. I didn't understand the reason why setting these role labels would break things.
Just in case anyone's planning to try this, node-role.kubernetes.io/db_role and node-role.kubernetes.io/other_role aren't official Kubernetes labels. You can use a private label key like node-role (no /) without needing to register it with anyone, or you can use your own domain name, e.g. example.net/my-node-role.
The biggest problem is that these unofficial labels are now kind of a convention for multiple tools, including kubectl. So it's convenient to rely on it, instead of using private labels that most of the visualization tools won't recognize by default.
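Concretely, kubectl builds its ROLES column from the suffix of any node-role.kubernetes.io/<role> label, which is why the convention is so sticky; for example:

```bash
# kubectl derives the ROLES column from node-role.kubernetes.io/<role> labels.
# The node name and role are illustrative.
kubectl label node ip-10-0-1-23.ec2.internal node-role.kubernetes.io/db="" --overwrite
kubectl get nodes   # the node now shows "db" under ROLES
```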
I understand there are security concerns around allowing Kubelet to set those labels, and the upstream community has decided to park it. For those willing to understand more about it, I'd recommend: https://github.com/kubernetes/kubernetes/issues/75457.
kubectl can't warn when you use an unregistered label, because we can't hard-code the list, and because kubectl has a high bar around never emitting a warning that could ever be wrong.

Underscores in the key for a label are unusual. If you're making a tool that interacts with labels, you can warn your users if they try to use a node role label that doesn't look right.
Also, for clarity:
- node-role.kubernetes.io/control-plane is a valid label. The allowed values are "true" and "false".
- node-role.kubernetes.io/<anything else> isn't registered.

The label is only used to identify control plane nodes. node-role.kubernetes.io/master is also registered.

@sftim I agree 100% with you, and my point was to share a perspective of why so many people ask for this. ☺️
Idea
Since Kubernetes 1.16, the Kubelet is no longer allowed to set node role labels by itself, for security reasons. However, with Cluster Autoscaler or with a maximum EC2 instance lifetime, it is not feasible to label every node by hand. There is a discussion going on about how to auto-label nodes based on tags, but I don't think this is the right approach, because it would re-introduce the security issue that motivated the change in the first place.
To apply the node role label both automatically and securely, the node must provide identity evidence – its IAM identity – to a cluster component, which would then perform the labelling. So here are a few ideas for how this could be built:
- Create a DaemonSet that retrieves the node's IAM role, plus a custom resource to specify which node labels should be applied to that IAM role. Based on that information, the operator can perform the automatic labelling (a rough sketch is below).
- Or build that directly into the EKS control plane with a new IAM action like eks:LabelNode, which would be performed when the Kubelet starts up. As there is already an authentication mechanism in place when the node joins the cluster, it could be integrated there, too. Instead of creating a custom resource, the aws-auth config map could be extended, too.

Looking forward to typing kubectl get nodes and quickly identifying my node roles again :-)

❗ Update: This feature request applies especially to nodes that are managed outside EKS, sometimes called unmanaged nodes.