eksctl-io / eksctl

The official CLI for Amazon EKS
https://eksctl.io

node-role labeling fails #4007

Open hryamzik opened 3 years ago

hryamzik commented 3 years ago

I'm trying to apply node labels using the following configuration:

managedNodeGroups:
  - name: test
    instanceType: m5.large
    minSize: 1
    maxSize: 2
    desiredCapacity: 2
    labels:
      role: test
      node-role.kubernetes.io/test: ok

According to #580 and #582, using the node-role subdomain should be the way to apply labels; however, I still get a reserved-prefix error:

AWS::EKS::Nodegroup/ManagedNodeGroup: CREATE_FAILED – "Label cannot contain reserved prefixes [kubernetes.io/, k8s.io/, eks.amazonaws.com/] (Service: AmazonEKS; Status Code: 400; Error Code: InvalidParameterException; Request ID: *** Proxy: null)"

Here's eksctl info:

eksctl version: 0.57.0
kubectl version: v1.21.2
OS: darwin
Callisto13 commented 3 years ago

Thanks for asking @hryamzik .

I cannot find that code in eksctl anymore, so it looks like it was removed. I will look into why, and if it happened by accident I will put it back 👍

AnthonyPoschen commented 3 years ago

same issue also exists with

eksctl version: 0.56.0
kubectl version: v1.21.2
OS: darwin
AnthonyPoschen commented 3 years ago

Probably has something to do with https://github.com/weaveworks/eksctl/blob/69c1a78f12017f47bcb247f388e9f9442f7da11c/pkg/apis/eksctl.io/v1alpha5/validation.go#L477

which eventually drills down to allowing https://github.com/kubernetes/kubernetes/blob/ea0764452222146c47ec826977f49d7001b0ea8c/staging/src/k8s.io/api/core/v1/well_known_labels.go#L51

Not sure how this now allows setting the role.
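
For illustration, here is a rough sketch of the kind of allow-list check the linked validation code appears to perform; the names below are made up for the example and are not eksctl's actual identifiers:

// Hypothetical sketch only: an eksctl-style check that rejects labels under
// reserved prefixes but carves out well-known kubernetes.io exceptions such
// as node-role.kubernetes.io/*, along the lines of well_known_labels.go.
package main

import (
    "fmt"
    "strings"
)

var reservedPrefixes = []string{"kubernetes.io/", "k8s.io/", "eks.amazonaws.com/"}

// isWellKnownException models the idea that a few kubernetes.io keys are
// explicitly allowed even though the domain is otherwise reserved.
func isWellKnownException(key string) bool {
    return strings.HasPrefix(key, "node-role.kubernetes.io/") ||
        key == "kubernetes.io/hostname"
}

// validateLabelKey passes node-role.kubernetes.io/test on the client side,
// even though the EKS Managed Nodegroups API later rejects the prefix.
func validateLabelKey(key string) bool {
    for _, prefix := range reservedPrefixes {
        if strings.Contains(key, prefix) && !isWellKnownException(key) {
            return false
        }
    }
    return true
}

func main() {
    fmt.Println(validateLabelKey("node-role.kubernetes.io/test")) // true: allowed client-side
    fmt.Println(validateLabelKey("eks.amazonaws.com/nodegroup"))  // false: reserved
}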

Callisto13 commented 3 years ago

I was not able to dig into this further today, and I am away from now until the week after next. Free for anyone in the team or community to pick up 👍

gohmc commented 3 years ago

While eksctl allows node-role.kubernetes.io, the CloudFormation AWS::EKS::Nodegroup resource (ManagedNodeGroup) rejects it. @TBBle has explained the issue well here.

TBBle commented 3 years ago

Indeed, this is the same core issue as #2363, but using Managed Node Groups, so the error is different.

Hence, the solution is blocked on development work elsewhere. Managed Node Groups might be covered by the same thing that solves #2363, e.g. https://github.com/kubernetes/cloud-provider-aws/issues/110, but it's also possible that AWS implements a Managed Node Groups-specific solution that resolves this problem without helping #2363.

Interestingly, there's no feature request ticket open on https://github.com/aws/containers-roadmap/ for this. The ticket I linked to before was actually for setting the node-role label to the node group name, rather than the label specified in the config, which is a more-specific use-case than what we have here.

Edit: I opened a feature request upstream for EKS Managed Node Groups to support this: https://github.com/aws/containers-roadmap/issues/1451

cPu1 commented 3 years ago

> Probably has something to do with https://github.com/weaveworks/eksctl/blob/69c1a78f12017f47bcb247f388e9f9442f7da11c/pkg/apis/eksctl.io/v1alpha5/validation.go#L477
>
> which eventually drills down to allowing https://github.com/kubernetes/kubernetes/blob/ea0764452222146c47ec826977f49d7001b0ea8c/staging/src/k8s.io/api/core/v1/well_known_labels.go#L51
>
> Not sure how this now allows setting the role.

As @gohmc and @TBBle have pointed out, the error is from the Managed Nodegroups API itself and not from a validation performed in eksctl. eksctl is blocked on the upstream API adding support for reserved prefixes in labels.
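
To see that independent of eksctl, here is a minimal sketch that calls CreateNodegroup directly via aws-sdk-go-v2; the cluster name, node role ARN, and subnet ID are placeholders, and the expectation (not verified here) is the same InvalidParameterException, because the check lives in the EKS API:

// Minimal sketch, not production code: create a managed nodegroup directly
// through the EKS API with a node-role.kubernetes.io label. All identifiers
// below are placeholders; the API is expected to reject the reserved prefix
// regardless of which client submits the request.
package main

import (
    "context"
    "errors"
    "log"

    "github.com/aws/aws-sdk-go-v2/aws"
    "github.com/aws/aws-sdk-go-v2/config"
    "github.com/aws/aws-sdk-go-v2/service/eks"
    "github.com/aws/aws-sdk-go-v2/service/eks/types"
)

func main() {
    ctx := context.Background()
    cfg, err := config.LoadDefaultConfig(ctx)
    if err != nil {
        log.Fatal(err)
    }
    client := eks.NewFromConfig(cfg)

    _, err = client.CreateNodegroup(ctx, &eks.CreateNodegroupInput{
        ClusterName:   aws.String("my-cluster"),                               // placeholder
        NodegroupName: aws.String("test"),
        NodeRole:      aws.String("arn:aws:iam::123456789012:role/node-role"), // placeholder
        Subnets:       []string{"subnet-0123456789abcdef0"},                   // placeholder
        Labels: map[string]string{
            "role":                         "test",
            "node-role.kubernetes.io/test": "ok", // reserved prefix, rejected by the API
        },
    })

    var ipe *types.InvalidParameterException
    if errors.As(err, &ipe) {
        log.Printf("EKS API rejected the request: %v", err)
        return
    }
    if err != nil {
        log.Fatal(err)
    }
}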

Callisto13 commented 3 years ago

dupe: https://github.com/weaveworks/eksctl/issues/1373

TBBle commented 3 years ago

I'd say this isn't a dupe, as that ticket was looking at unmanaged nodegroups (i.e. closer to #2363), and its proposed solution approach solves a different problem. For both this ticket and #2363, the use case is setting the node role, and I don't think that would be considered "solved" if it only applied to nodes created while eksctl was creating the nodegroup: nodes created later, e.g. by the Cluster Autoscaler, would still have an empty role.

Skarlso commented 3 years ago

This issue is in a blocked state until AWS provides us with a reasonable solution for keeping labels across autoscaling. Maybe through Karpenter.

TBBle commented 3 years ago

I expect that Karpenter, being a slot-in replacement for Cluster Autoscaler, will sadly have the same timing problem as a 'node label controller'.

Skarlso commented 3 years ago

It's not entirely slot-in. It has some extra features and a different algorithm. Are you sure it couldn't have the ability to reapply existing labels to nodes?

TBBle commented 2 years ago

It could, but it would be doing it at the same time as any other controller (except an Admission Controller), i.e. after the Node object has been persisted, racing to apply the labels before the kube-scheduler or any other controller makes decisions based on them.

So it can do it, but so could Cluster Autoscaler, and so could a dedicated Controller. They all have the same chance of success, AFAIK. (Unless somehow Karpenter is creating the Node resource and then kubelet only updates it, but I didn't see any suggestion of that in its docs, and I'm not sure if k8s is robust against that anyway...)

The core issue is that in the Node admission workflow, the Node object is created by the kubelet, and after passing the Node Restriction Admission Controller, persisted and available to the scheduler. It's possible the Cloud Provider Manager can get involved and add labels to the Node before it's marked 'Ready', I'm not sure. I didn't see a place for it last time I looked, but opened https://github.com/kubernetes/cloud-provider-aws/issues/110 in case someone else could see what I couldn't.

Annoyingly, even a mutating Admission Controller (my other thought) might not work, because the Node Restriction Admission Controller would run after it, and might reject the labels even though kubelet itself didn't add them; I'm not sure if it can distinguish labels added by kubelet in its request from labels added to that request by a mutating Admission Controller.

Skarlso commented 2 years ago

@TBBle thanks for the explanation!

Reading about how Kubernetes handles nodes and how AWS does its thing, I have no idea how this could ever be resolved other than by a controller that continuously checks and updates labels, which we sadly won't do. :/

TBBle commented 2 years ago

If all you're trying to do is apply the node-role label as something other than master, and you're using it only for informative output in kubectl get Nodes (and not for nodeSelector or metrics tagging, for example), then a controller that reacts to new nodes and applies that label would be fine, because then it doesn't materially matter if someone sees the node in the kubectl output for the few seconds before the label has been applied.
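
For that narrow use case, here is a minimal sketch of such a controller, assuming client-go, in-cluster credentials, and that the nodes carry an eks.amazonaws.com/nodegroup label to derive the role name from; none of this is an official eksctl or AWS component:

// Illustrative sketch only: watch for Nodes and patch on a node-role label
// derived from the nodegroup label. Runs with its own service account, so it
// is not subject to the NodeRestriction admission plugin the way kubelet is.
package main

import (
    "context"
    "encoding/json"
    "fmt"
    "log"

    corev1 "k8s.io/api/core/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/types"
    "k8s.io/client-go/informers"
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/rest"
    "k8s.io/client-go/tools/cache"
)

func main() {
    cfg, err := rest.InClusterConfig() // assumes the controller runs in the cluster
    if err != nil {
        log.Fatal(err)
    }
    client := kubernetes.NewForConfigOrDie(cfg)

    factory := informers.NewSharedInformerFactory(client, 0)
    nodeInformer := factory.Core().V1().Nodes().Informer()

    nodeInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
        AddFunc: func(obj interface{}) {
            node := obj.(*corev1.Node)
            group, ok := node.Labels["eks.amazonaws.com/nodegroup"] // assumed source label
            if !ok {
                return
            }
            roleLabel := fmt.Sprintf("node-role.kubernetes.io/%s", group)
            if _, already := node.Labels[roleLabel]; already {
                return
            }
            // Patch only the label; nodes that show up later (e.g. via the
            // Cluster Autoscaler) trigger the same Add event and get labelled too.
            patch, _ := json.Marshal(map[string]interface{}{
                "metadata": map[string]interface{}{
                    "labels": map[string]string{roleLabel: ""},
                },
            })
            _, err := client.CoreV1().Nodes().Patch(
                context.TODO(), node.Name, types.StrategicMergePatchType, patch, metav1.PatchOptions{},
            )
            if err != nil {
                log.Printf("labelling node %s: %v", node.Name, err)
            }
        },
    })

    stop := make(chan struct{})
    factory.Start(stop)
    factory.WaitForCacheSync(stop)
    select {} // run until the process is killed
}

As noted above, this only helps when a brief window without the label is acceptable; anything relying on the label for scheduling decisions can still race the patch.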

Otherwise, it's best to use your own label for such behaviour, which is what I've done in the past: used a mycompany.com/node-role label (and taint) to group nodes from multiple node groups under one of a few categories, e.g., backend, publicIP (for publicly reachable NodePort or HostPort Pods) and locust (to isolate the load-testing Pods from the load being tested).

I honestly think if they taught kubectl to also recognise a different label for the Node Role field, and rejected attempts to set that value to master, it'd be sufficiently secure for the use-case, and would resolve all this friction in the node-management community.

github-actions[bot] commented 1 year ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] commented 1 year ago

This issue was closed because it has been stalled for 5 days with no activity.

enver commented 1 year ago

Check out a simple workaround here: https://hub.docker.com/repository/docker/enver/label-nodes/general