elmiko opened 7 months ago
Looks like there are a couple other issues that were asking for similar functionality here: https://github.com/aws/karpenter-provider-aws/issues/1941 and https://github.com/kubernetes-sigs/karpenter/issues/660
What's the security implication of allowing users to set this custom label? From a little sleuthing, it doesn't seem to be any different than the `node-restriction.kubernetes.io` label that we already support today.
As Karpenter is a Kubernetes project, I'm wary of even indirectly encouraging use of unregistered labels.
If `node-role.kubernetes.io/worker` should become an official label, let's get that registered first - it's not a particularly difficult process. However, registration first, then adoption.
“OpenShift uses it” does not count as grounds to add support into Kubernetes code.
OpenShift folks, if you want a label to mark nodes as OpenShift workers, you can make one, eg openshift.example/worker: true
security implication
Nodes either can or can't set arbitrary labels on themselves, depending on what controls you've set. So long as we leave the kubelet to set that label, we're fine. If Karpenter starts setting a label on a node, it could provide a way to bypass restrictions on what labels nodes can have (although, letting someone write to your NodePools is already bad practice).
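To make the restriction being discussed concrete, here is a rough sketch of the kind of check in question - this is not Karpenter's actual code, and the function name and allow-list are illustrative (the real set of well-known labels is much larger): reject any label under the `kubernetes.io` / `k8s.io` namespaces that is not registered, while leaving privately-prefixed labels alone.

```go
package main

import (
	"fmt"
	"strings"
)

// wellKnownNodeLabels is an illustrative allow-list; the real registry of
// well-known labels lives in upstream Kubernetes and Karpenter.
var wellKnownNodeLabels = map[string]bool{
	"kubernetes.io/hostname":      true,
	"kubernetes.io/arch":          true,
	"kubernetes.io/os":            true,
	"topology.kubernetes.io/zone": true,
}

// isRestrictedLabel sketches the check: a label key is restricted if it sits
// in the kubernetes.io or k8s.io namespace but is not a registered
// well-known label. Privately-prefixed keys pass through.
func isRestrictedLabel(key string) bool {
	if wellKnownNodeLabels[key] {
		return false
	}
	domain := key
	if i := strings.Index(key, "/"); i >= 0 {
		domain = key[:i]
	}
	return domain == "kubernetes.io" || domain == "k8s.io" ||
		strings.HasSuffix(domain, ".kubernetes.io") ||
		strings.HasSuffix(domain, ".k8s.io")
}

func main() {
	for _, k := range []string{
		"kubernetes.io/os",               // well-known: allowed
		"node-role.kubernetes.io/worker", // kubernetes.io namespace, unregistered: rejected
		"example.com/worker",             // private prefix: allowed
	} {
		fmt.Printf("%-34s restricted=%v\n", k, isRestrictedLabel(k))
	}
}
```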
As Karpenter is a Kubernetes project, I'm wary of even indirectly encouraging use of unregistered labels.
this makes me wonder if the kubelet should be producing warnings/errors when a user attempts to use an unregistered label under a reserved prefix?
karpenter ~is~ may be preventing something that the kubelet allows.
If node-role.kubernetes.io/worker should become an official label, let's get that registered first - it's not a particularly difficult process. However, registration first, then adoption.
total agreement from me, but this is historical as well. openshift has been using the label for a long time, i only mention this to help add context around the effort to change the value.
“OpenShift uses it” does not count as grounds to add support into Kubernetes code.
agreed, but let's take openshift out of the equation. as a user, i would like karpenter to create nodes with unregistered labels from the kubernetes.io prefix, i can do this today on the kubelet command line and i would like karpenter to do the same for me.
i think if this sounds like an unreasonable request at the karpenter level, then we should consider making the kubelet produce warnings when this happens because we are now asking karpenter to gate activity that the kubelet allows.
OpenShift folks, if you want a label to mark nodes as OpenShift workers, you can make one, eg openshift.example/worker: true
no objection from me, but i think this is orthogonal to the true issue here. i only mentioned openshift to help illuminate the use case.
So long as we leave the kubelet to set that label, we're fine. If Karpenter starts setting a label on a node, it could provide a way to bypass restrictions on what labels nodes can have (although, letting someone write to your NodePools is already bad practice).
i think this question is important, if karpenter is adding the label after node creation, then i totally agree with it preventing unregistered labels from being added. but if karpenter is passing the labels down to the kubelet configuration, then i think we should reconsider.
thanks for the thoughtful discussion @sftim and @jonathan-innis
one additional thought on this: even if karpenter cannot create nodes with the unregistered labels, it would be nice to be able to have it know when a node will be created with that label so that it can properly predict which nodes should be created.
essentially, i am fine with karpenter not placing the unregistered label on the node, but i should be able to tell karpenter "hey, if you see these pods, put them on nodes with this label, and oh btw these node classes will have that label".
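For illustration, a NodePool along these lines is the shape of configuration being asked about - note this is a sketch, and the `internal-node-role` label is a hypothetical privately-prefixed stand-in, since `node-role.kubernetes.io/worker` under `spec.template.metadata.labels` is exactly what Karpenter currently rejects:

```yaml
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: worker
spec:
  template:
    metadata:
      labels:
        # hypothetical stand-in; the request in this issue is to allow
        # node-role.kubernetes.io/worker here instead
        internal-node-role: worker
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
```

Labels declared here flow into Karpenter's scheduling simulation, which is why having the label known to the NodePool (rather than applied out-of-band) matters for predicting which nodes to create.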
We have code in Kubernetes to prevent the node setting disallowed labels. You can use Karpenter to bypass that; it's your cluster.
However, should we encourage it? I say we should not. Upstream Kubernetes and in-project code should abide by our architecture standards, even if individual end users intend to disregard them.
TL;DR: offering an in-project security footgun feels like an obviously bad choice.
However, should we encourage it? I say we should not.
agreed
TL;DR: offering an in-project security footgun feels like an obviously bad choice.
just to be clear, do you consider the footgun to be: karpenter adding disallowed labels after the node is created, karpenter adding disallowed labels to the command line flag list for kubelet, or both?
reading this a little deeper, https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#noderestriction
i have a question about this interaction with the admission controllers, if a user has disabled the admission controller for node restriction, will that change be reflected in karpenter's restriction as well?
it seems like we would want karpenter to reflect what the admission controllers are doing, so if an update to a node object gets rejected wouldn't we want to surface that error to the user?
I suppose it's an outright footgun if a compromised node can, in any way, set labels on itself it shouldn't be able to, or cause those to be set.
I'm also opposed to tacitly encouraging people to use Kubernetes-namespaced labels that Kubernetes doesn't recognize. We shouldn't staff work that encourages doing more of the wrong thing.
i have a question about this interaction with the admission controllers, if a user has disabled the admission controller for node restriction, will that change be reflected in karpenter's restriction as well?
No; in general, Karpenter cannot know what admission controllers are in use.
No; in general, Karpenter cannot know what admission controllers are in use.
perhaps i phrased that poorly, i'll try to restate.
imagine a user has the noderestriction admission controller disabled. the user configures karpenter to make changes that would otherwise be prevented by the admission controller. karpenter refuses to allow the changes because it has duplicated the default rules for the noderestriction admission controller. should karpenter be the component responsible for rejecting the configuration?
i'm wondering if karpenter should be returning the errors from updating nodes/pods instead of interposing itself for admission.
We shouldn't staff work that encourages doing more of the wrong thing.
agreed, and this makes a lot of sense to me.
should karpenter be the component responsible for rejecting the configuration?
I think so, because it's much easier to maintain an important security guarantee this way. It sounds like you think Karpenter is an admission controller; it's not.
It sounds like you think Karpenter is an admission controller; it's not.
i don't think i do, but i am trying to reconcile why the rules for the unregistered labels are encoded/validated in karpenter instead of deferring to the apiserver to return an error. (perhaps i am not understanding this flow properly).
i don't think i do, but i am trying to reconcile why the rules for the unregistered labels are encoded/validated in karpenter instead of deferring to the apiserver to return an error. (perhaps i am not understanding this flow properly).
The API server's restrictions (from the node restriction admission controller) apply exclusively to label changes made by nodes. Karpenter is not a node, so those restrictions wouldn't be enforced.
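As a concrete illustration of that scope (a command-line sketch, assuming a cluster with the NodeRestriction admission plugin enabled; the node name is made up):

```shell
# Allowed: the request is made with an admin credential, not a node identity
kubectl label node worker-1 node-role.kubernetes.io/worker=""

# Rejected by NodeRestriction: the equivalent change made with the node's own
# credential, e.g. the kubelet self-labeling via
# --node-labels=node-role.kubernetes.io/worker="" at registration time
```

The same asymmetry is why a non-node controller can set the label while the kubelet cannot.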
exclusively to label changes made by nodes
that's what i was missing, thank you!
/remove-label needs-triage
@engedaam: The label(s) `/remove-label needs-triage` cannot be applied. These labels are supported: api-review, tide/merge-method-merge, tide/merge-method-rebase, tide/merge-method-squash, team/katacoda, refactor. Is this label configured under `labels -> additional_labels` or `labels -> restricted_labels` in `plugin.yaml`?
/triage accepted
happy to see this considered for inclusion, but given the conversation between @sftim and myself, i'm not sure this is a good idea to carry forward. it seems like a possible point for a security escalation.
I think we should ask the OpenShift folks to consider not using the wrong label, and leave it at that. OpenShift is entirely welcome to use OpenShift labels for node role, it just shouldn't bypass K8s processes for registering labels.
For people using Karpenter who want to work around this, they could use a different label (eg `internal-node-role: worker`) and a contrib controller that copies `internal-node-role` to whatever other label they want set. But we shouldn't make that contrib code part of Kubernetes either, because it'd be furthering a mistake.
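The copying logic such a contrib controller would need is tiny. A minimal sketch of just the relabeling rule - the label names are hypothetical, and the surrounding watch/patch machinery against the API server is omitted:

```go
package main

import "fmt"

const (
	sourceLabel = "internal-node-role"    // label users would set via Karpenter
	targetLabel = "example.com/node-role" // hypothetical label the controller mirrors it to
)

// mirrorNodeRole returns the label patch a relabeling controller would apply:
// if the source label is present and the target differs, copy the value over.
// An empty map means no patch is needed.
func mirrorNodeRole(labels map[string]string) map[string]string {
	patch := map[string]string{}
	if v, ok := labels[sourceLabel]; ok && labels[targetLabel] != v {
		patch[targetLabel] = v
	}
	return patch
}

func main() {
	node := map[string]string{sourceLabel: "worker"}
	fmt.Println(mirrorNodeRole(node)) // map[example.com/node-role:worker]
}
```

A real controller would wrap this in a Node informer and issue a strategic-merge patch whenever the returned map is non-empty.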
So given that Karpenter already applies Taints to Node objects upon creation, I can't see why it would be any different to allow Karpenter to apply labels (including `node-role.kubernetes.io/<role>`).
Kubernetes enforces the rules here; a node autoscaler, as a client of the Kubernetes API, isn't able by itself to bypass the security rule.
Kubernetes enforces the rules here; a node autoscaler, as a client of the Kubernetes API, isn't able by itself to bypass the security rule.
I don't believe that is true at all. For example, the Spot.io Ocean Cluster Controller absolutely labels our nodes with the `node-role` label for us on startup. The only rule here is that the node itself is unable to apply that label.
The label node-role is not reserved by Kubernetes (and as a cluster-private label, never would be).
There is no restriction on labelling nodes with private labels, only on adding registered public labels that belong to Kubernetes.
Should node autoscalers try to set unofficial labels in the Kubernetes namespace? It's possible to code up, but I (still) don't recommend it; in fact, I actively oppose making that change to Karpenter. See https://github.com/kubernetes-sigs/karpenter/issues/1046#issuecomment-2237753906
Should node autoscalers try to set unofficial labels in the Kubernetes namespace? It's possible to code up, but I (still) don't recommend it; in fact, I actively oppose making that change to Karpenter
I should fix my comment - I want to use the `kubernetes.io/role=xxx` label - specifically defined here https://github.com/kubernetes/kubernetes/blob/v1.30.3/staging/src/k8s.io/kubectl/pkg/describe/interface.go#L33-L34 as:

```go
// NodeLabelRole specifies the role of a node
NodeLabelRole = "kubernetes.io/role"
```
Can you elaborate on why you oppose Karpenter managing this label? I can't find anything in the Kubernetes code or documentation that indicates we shouldn't set this label.
I think we should ask the OpenShift folks to consider not using the wrong label, and leave it at that. OpenShift is entirely welcome to use OpenShift labels for node role, it just shouldn't bypass K8s processes for registering labels.
i think this is sound advice and something i have advocated for internally at red hat. i only brought up openshift in my issue as it pertains to my specific problem here. i am not attempting to suggest that we need to change kubernetes core behaviors based on my needs, i only added it to help with context.
i appreciate the extended discussions from @sftim and the explanation for the security-related part of this issue.
my confusion is more around karpenter attempting to block everything in the `kubernetes.io`-prefixed labels; i think this is similar to what @diranged is also pointing out. on this point, i'm fine to accept the community's decision on what karpenter should or should not allow, i was mostly looking to have an open discussion about it.
Maybe this is a missing detail: Karpenter creates nodes that have labels by, AIUI, creating nodes that will run a kubelet that then self-labels.
We could enhance Karpenter to also use a different mechanism, setting labels directly on Nodes based on the owning NodeClaim. I don't see a good case for adding complexity here, but a case could be made.
creating nodes that will run a kubelet that then self-labels
@sftim,
So that is true - if you are using an amiFamily like `Bottlerocket` where Karpenter creates/merges the userData... however, we actually were running in `Custom` mode, where Karpenter is not mutating or managing the userData at all, and it still sets the labels on the nodes. Karpenter is (I think) pre-creating the `Node` object and populating the labels that way.. which would allow it to set the `kubernetes.io/role` label just fine.
Description
What problem are you trying to solve?
As an openshift user, i require new nodes to take the label `node-role.kubernetes.io/worker`. Although this label uses the `kubernetes.io` prefix, it is not an official well-known label. Currently, karpenter does not allow NodePools to be created when a label with the `kubernetes.io` prefix is used on a non-well-known label for new node creation, or when checking predicates.

How important is this feature to you?
This is fairly important for openshift as the label is used frequently. It is possible to work around this, but for long term usage on openshift a permanent solution would be better.
This issue was discussed at the Feb. 26 2024 community meeting.