aws / karpenter-provider-aws

Karpenter is a Kubernetes Node Autoscaler built for flexibility, performance, and simplicity.
https://karpenter.sh
Apache License 2.0

Karpenter support for Redhat Openshift 4.12.x #6024

Open sandeepkp1175 opened 2 months ago

sandeepkp1175 commented 2 months ago

Description

What problem are you trying to solve? We are trying to adopt Karpenter for Red Hat OpenShift version 4.12.27. The backend master and worker nodes are EC2 instances. While doing so, we are getting the error below:

{"level":"FATAL","time":"2024-04-11T15:27:17.626Z","logger":"controller","message":"unable to detect the cluster endpoint, failed to resolve cluster endpoint, AccessDeniedException: User: arn:aws:sts:::assumed-role//i- is not authorized to perform: eks:DescribeCluster on resource: arn:aws:eks:us-east-1::cluster/","commit":"17dd42b"}

How important is this feature to you?

We run a variety of workloads in the cluster, and for each type of workload we currently have to specify the machine type. By adopting Karpenter we would let it choose the node type during autoscaling based on the workloads. It would also reduce the turnaround time for pods awaiting scheduling by allocating enough compute.

jigisha620 commented 2 months ago

If CLUSTER_ENDPOINT is not set, we try to discover it by making an eks:DescribeCluster call. You can set the value explicitly so that it points to your API server endpoint.
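
As a minimal sketch, this can be set as an environment variable on the controller container in the Karpenter Deployment; the cluster name and endpoint URL below are placeholders, not values from this issue:

```yaml
# Sketch of the controller container env in the Karpenter Deployment
# (kube-system namespace assumed). Replace both values with your own.
env:
  - name: CLUSTER_NAME
    value: "my-cluster"                                 # placeholder
  - name: CLUSTER_ENDPOINT
    value: "https://api.my-cluster.example.com:6443"    # placeholder API server URL
```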

sandeepkp1175 commented 2 months ago

Hi @jigisha620 Thanks for figuring out the issue. I added CLUSTER_ENDPOINT as an environment variable. After adding the value, the Karpenter pods throw the error below when they try to come up. It looks like the application is looking for an AWS SQS queue.

```
panic: fetching queue url, AWS.SimpleQueueService.NonExistentQueue: The specified queue does not exist.

goroutine 1 [running]:
github.com/samber/lo.must({0x2656020, 0xc0005e90e0}, {0x0, 0x0, 0x0})
	github.com/samber/lo@v1.39.0/errors.go:53 +0x1e9
github.com/samber/lo.Must...
	github.com/samber/lo@v1.39.0/errors.go:65
github.com/aws/karpenter-provider-aws/pkg/controllers.NewControllers({0x317b8f8, 0xc000770e10}, 0x7fa366877228?, {0x317f728, 0x48ef8e0}, {0x318b680?, 0xc000b56750}, {0x314e820?, 0xc0007ca990?}, 0xc0009e2f90, ...)
	github.com/aws/karpenter-provider-aws/pkg/controllers/controllers.go:60 +0x525
main.main()
	github.com/aws/karpenter-provider-aws/cmd/controller/main.go:55 +0x63e
```

jigisha620 commented 2 months ago

You may want to set the value for INTERRUPTION_QUEUE. You can find more details here. An important thing to note: Karpenter watches an SQS queue that receives critical events from AWS services which may affect your nodes. Karpenter requires that an SQS queue be provisioned and that EventBridge rules and targets be added that forward interruption events from AWS services to the SQS queue. If you haven't created one already, you can follow the steps in the getting-started guide. You may not need everything that is created as part of the CloudFormation stack, but you can follow the steps mentioned there to create the queue.
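
As an illustrative sketch only (the getting-started CloudFormation stack is the supported path), the queue and one of the EventBridge rules could be created with the AWS CLI roughly like this; the queue and rule names are placeholders, and the queue's access policy must also allow events.amazonaws.com to send messages:

```bash
# Create the interruption queue (name is a placeholder).
aws sqs create-queue --queue-name karpenter-interruption-queue

# Forward Spot interruption warnings to the queue. The getting-started stack
# also adds rules for rebalance recommendations, scheduled changes, and
# instance state-change events.
aws events put-rule \
  --name karpenter-spot-interruption \
  --event-pattern '{"source":["aws.ec2"],"detail-type":["EC2 Spot Instance Interruption Warning"]}'

aws events put-targets \
  --rule karpenter-spot-interruption \
  --targets "Id=1,Arn=arn:aws:sqs:us-east-1:<account-id>:karpenter-interruption-queue"
```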

sandeepkp1175 commented 2 months ago

Thank you @jigisha620 for the response. I'll give it a try and let you know.

sandeepkp1175 commented 2 months ago

Hello @jigisha620

We still see the same error:

panic: fetching queue url, AWS.SimpleQueueService.NonExistentQueue: The specified queue does not exist. Please see the full error below. We added INTERRUPTION_QUEUE to the environment variables of the Karpenter deployment, but still end up with the non-existent queue error. We granted sqs:* access to the master and worker node roles, and this is specified in the SQS policy. Could you please advise whether any other permissions are required? The OpenShift cluster is not on ROSA; the backend is EC2 nodes, and the control plane components are administrator managed.

```
goroutine 1 [running]:
github.com/samber/lo.must({0x2656020, 0xc0001be9a0}, {0x0, 0x0, 0x0})
	github.com/samber/lo@v1.39.0/errors.go:53 +0x1e9
github.com/samber/lo.Must...
	github.com/samber/lo@v1.39.0/errors.go:65
github.com/aws/karpenter-provider-aws/pkg/controllers.NewControllers({0x317b8f8, 0xc000712000}, 0x7fae3d8b3228?, {0x317f728, 0x48ef8e0}, {0x318b680?, 0xc00081cab0}, {0x314e820?, 0xc000010ff0?}, 0xc0004b6c40, ...)
	github.com/aws/karpenter-provider-aws/pkg/controllers/controllers.go:60 +0x525
main.main()
	github.com/aws/karpenter-provider-aws/cmd/controller/main.go:55 +0x63e
```

jigisha620 commented 2 months ago

Can you validate that the queue exists and that it is in the correct region? Can you share the output of `kubectl describe pod <karpenter_pod>`?

sandeepkp1175 commented 2 months ago

Hi @jigisha620

Please see the describe pod output below:

```
Name:                 karpenter-bd58b9c96-pt5d6
Namespace:            kube-system
Priority:             2000000000
Priority Class Name:  system-cluster-critical
Service Account:      karpenter
Node:                 node-ip.ec2.internal/node-ip
Start Time:           Mon, 15 Apr 2024 14:43:18 -0500
Labels:               app.kubernetes.io/instance=karpenter
                      app.kubernetes.io/name=karpenter
                      pod-template-hash=bd58b9c96
Annotations:          cni.projectcalico.org/containerID: e27277da87db35e00056b649ef81c2b615ed18e4382f43b554e4a1b90d526434
                      cni.projectcalico.org/podIP:
                      cni.projectcalico.org/podIPs:
                      k8s.v1.cni.cncf.io/network-status: [{ "name": "k8s-pod-network", "interface": "eth0", "ips": [ "pod-ip" ], "mac": "2a:ef:d9:b9:a3:xx", "default": true, "dns": {} }]
                      k8s.v1.cni.cncf.io/networks-status: [{ "name": "k8s-pod-network", "interface": "eth0", "ips": [ "pod-ip" ], "mac": "2a:ef:d9:b9:a3:xx", "default": true, "dns": {} }]
Status:               Running
IP:                   pod-ip
IPs:
  IP:  pod-ip
Controlled By:  ReplicaSet/karpenter-bd58b9c96
Containers:
  controller:
    Container ID:  cri-o://cb82be53be9e78246548971e1c111acc7bf44869535312c5b8c77759bd5bcbba
    Image:         public.ecr.aws/karpenter/controller:0.35.4@sha256:27a73db80b78e523370bcca77418f6d2136eea10a99fc87d02d2df059fcf5fb7
    Image ID:      public.ecr.aws/karpenter/controller@sha256:27a73db80b78e523370bcca77418f6d2136eea10a99fc87d02d2df059fcf5fb7
    Ports:         8000/TCP, 8081/TCP
    Host Ports:    0/TCP, 0/TCP
    State:         Waiting
      Reason:      CrashLoopBackOff
    Last State:    Terminated
      Reason:      Error
      Exit Code:   2
      Started:     Tue, 16 Apr 2024 10:12:55 -0500
      Finished:    Tue, 16 Apr 2024 10:13:08 -0500
    Ready:         False
    Restart Count: 224
    Limits:
      cpu:     1
      memory:  1Gi
    Requests:
      cpu:     1
      memory:  1Gi
    Liveness:   http-get http://:http/healthz delay=30s timeout=30s period=10s #success=1 #failure=3
    Readiness:  http-get http://:http/readyz delay=5s timeout=30s period=10s #success=1 #failure=3
    Environment:
      KUBERNETES_MIN_VERSION:      1.19.0-0
      KARPENTER_SERVICE:           karpenter
      LOG_LEVEL:                   info
      METRICS_PORT:                8000
      HEALTH_PROBE_PORT:           8081
      SYSTEM_NAMESPACE:            kube-system (v1:metadata.namespace)
      MEMORY_LIMIT:                1073741824 (limits.memory)
      FEATURE_GATES:               Drift=true,SpotToSpotConsolidation=false
      BATCH_MAX_DURATION:          10s
      BATCH_IDLE_DURATION:         1s
      ASSUME_ROLE_DURATION:        15m
      CLUSTER_NAME:
      VM_MEMORY_OVERHEAD_PERCENT:  0.075
      INTERRUPTION_QUEUE:          arn:aws:sqs:us-east-1::
      RESERVED_ENIS:               0
      CLUSTER_ENDPOINT:
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-22vv2 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  kube-api-access-22vv2:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:
    DownwardAPI:             true
    ConfigMapName:           openshift-service-ca.crt
    ConfigMapOptional:
QoS Class:       Guaranteed
Node-Selectors:  kubernetes.io/os=linux
Tolerations:     CriticalAddonsOnly op=Exists
                 node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Topology Spread Constraints:  topology.kubernetes.io/zone:ScheduleAnyway when max skew 1 is exceeded for selector app.kubernetes.io/instance=karpenter,app.kubernetes.io/name=karpenter
Events:
  Type     Reason     Age                    From     Message
  Warning  Unhealthy  3h57m                  kubelet  Readiness probe failed: Get "http://pod-ip:8081/readyz": read tcp node-ip:44196->pod-ip:8081: read: connection reset by peer
  Warning  Unhealthy  82m                    kubelet  Readiness probe failed: Get "http://pod-ip:8081/readyz": read tcp node-ip:36590->pod-ip:8081: read: connection reset by peer
  Normal   Pulled     77m (x211 over 19h)    kubelet  Container image "public.ecr.aws/karpenter/controller:0.35.4@sha256:27a73db80b78e523370bcca77418f6d2136eea10a99fc87d02d2df059fcf5fb7" already present on machine
  Warning  BackOff    2m8s (x5295 over 19h)  kubelet  Back-off restarting failed container
```

jonathan-innis commented 2 months ago

@sandeepkp1175 From this discussion, it's unclear to me whether you want to enable the interruption queue or not. If you do, then you need to create the SQS queue and set the INTERRUPTION_QUEUE environment variable to be equal to the queue name. If you don't want to enable it, then you need to unset this value when deploying Karpenter.

sandeepkp1175 commented 2 months ago

@jonathan-innis I have already created the queue and passed its name in the environment variable. Even after creating the queue, the Karpenter pod error logs still show that the queue is not being detected. @jigisha620 asked me to provide the pod details, which I've provided in my response above. Attaching it again.

describe_output.txt

sandeepkp1175 commented 2 months ago

@jigisha620 / @jonathan-innis please let me know if you got a chance to look at the issue.

jonathan-innis commented 2 months ago

From looking at your configuration above, it looks like the interruption queue you are specifying is a full ARN, but Karpenter expects just the name of the interruption queue for this setting. I'll admit that this isn't clear in https://karpenter.sh/docs/reference/settings/#:~:text=health%20(default%20%3D%208081)-,INTERRUPTION_QUEUE,-%2D%2Dinterruption%2Dqueue, so regardless, we should improve the description so that it's clearer in our reference docs.
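
In other words, a sketch of the two forms (the queue name below is a placeholder):

```yaml
# Correct: pass just the queue name.
- name: INTERRUPTION_QUEUE
  value: "karpenter-interruption-queue"        # placeholder name

# Incorrect: passing the full ARN leads to the "NonExistentQueue" panic above.
# - name: INTERRUPTION_QUEUE
#   value: "arn:aws:sqs:us-east-1:<account-id>:karpenter-interruption-queue"
```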

sandeepkp1175 commented 2 months ago

Hi @jonathan-innis @jigisha620

We are able to bring up the Karpenter pods by providing only the queue name, instead of the ARN, in the INTERRUPTION_QUEUE environment variable. However, the Karpenter application now fails when calling pricing:GetProducts. Please see the error below.

{"level":"INFO","time":"2024-04-18T20:25:20.422Z","logger":"controller","message":"Starting workers","commit":"17dd42b","controller":"state.nodeclaim","controllerGroup":"karpenter.sh","controllerKind":"NodeClaim","worker count":10} {"level":"INFO","time":"2024-04-18T20:25:20.423Z","logger":"controller","message":"Starting workers","commit":"17dd42b","controller":"nodeclaim.lifecycle","controllerGroup":"karpenter.sh","controllerKind":"NodeClaim","worker count":1000} {"level":"INFO","time":"2024-04-18T20:25:20.427Z","logger":"controller","message":"Starting workers","commit":"17dd42b","controller":"nodeclaim.tagging","controllerGroup":"karpenter.sh","controllerKind":"NodeClaim","worker count":1} {"level":"INFO","time":"2024-04-18T20:25:20.433Z","logger":"controller","message":"Starting workers","commit":"17dd42b","controller":"nodeclaim.consistency","controllerGroup":"karpenter.sh","controllerKind":"NodeClaim","worker count":10} {"level":"INFO","time":"2024-04-18T20:25:20.433Z","logger":"controller","message":"Starting workers","commit":"17dd42b","controller":"nodeclaim.termination","controllerGroup":"karpenter.sh","controllerKind":"NodeClaim","worker count":100} {"level":"INFO","time":"2024-04-18T20:25:20.434Z","logger":"controller","message":"Starting workers","commit":"17dd42b","controller":"nodeclaim.disruption","controllerGroup":"karpenter.sh","controllerKind":"NodeClaim","worker count":10} {"level":"INFO","time":"2024-04-18T20:25:20.434Z","logger":"controller","message":"Starting workers","commit":"17dd42b","controller":"nodepool.counter","controllerGroup":"karpenter.sh","controllerKind":"NodePool","worker count":10} {"level":"INFO","time":"2024-04-18T20:25:20.445Z","logger":"controller","message":"Starting workers","commit":"17dd42b","controller":"nodeclass","controllerGroup":"karpenter.k8s.aws","controllerKind":"EC2NodeClass","worker count":10} {"level":"INFO","time":"2024-04-18T20:25:20.458Z","logger":"controller","message":"Starting workers","commit":"17dd42b","controller":"nodepool.hash","controllerGroup":"karpenter.sh","controllerKind":"NodePool","worker count":10} {"level":"ERROR","time":"2024-04-18T20:25:21.094Z","logger":"controller.pricing","message":"retreiving on-demand pricing data, AccessDeniedException: User: arn:aws:sts::acct-id:assumed-role/worker-iam-role/instance-id is not authorized to perform: pricing:GetProducts because no identity-based policy allows the pricing:GetProducts action; AccessDeniedException: User: arn:aws:sts::acct-id:assumed-role/worker-iam-role/instance-id is not authorized to perform: pricing:GetProducts because no identity-based policy allows the pricing:GetProducts action","commit":"17dd42b"}

Our workers did not have access to this action before. After attaching a policy containing the required action, we now see the error below in the Karpenter pod.

{"level":"INFO","time":"2024-04-19T18:59:48.901Z","logger":"controller","message":"Starting workers","commit":"17dd42b","controller":"state.nodepool","controllerGroup":"karpenter.sh","controllerKind":"NodePool","worker count":10} {"level":"INFO","time":"2024-04-19T18:59:48.901Z","logger":"controller","message":"Starting workers","commit":"17dd42b","controller":"state.daemonset","controllerGroup":"apps","controllerKind":"DaemonSet","worker count":10} {"level":"ERROR","time":"2024-04-19T18:59:49.466Z","logger":"controller.pricing","message":"retreiving on-demand pricing data, AccessDeniedException: User: arn:aws:sts::aws-account-id:assumed-role/worker-iam-role/instance-id is not authorized to perform: pricing:GetProducts because no service control policy allows the pricing:GetProducts action; AccessDeniedException: User: arn:aws:sts::aws-account-id:assumed-role/worker-iam-role/instance-id is not authorized to perform: pricing:GetProducts because no service control policy allows the pricing:GetProducts action","commit":"17dd42b"}

We are planning to modify the SCP. Could you please let us know if there are any other permissions we need to grant to get it running successfully? Thank you for the guidance.

jonathan-innis commented 2 months ago

@sandeepkp1175 Hard to know, since permissions tend to be a trial-and-error game, assuming you have configured your role consistent with what's in the "Getting Started" guide. Try getting the SCP unblocked and come back if you are still having permission issues getting Karpenter up and running.

sandeepkp1175 commented 2 months ago

Thank you @jonathan-innis. I've requested the SCP unblock and will let you know if I hit another blocker.

github-actions[bot] commented 1 month ago

This issue has been inactive for 14 days. StaleBot will close this stale issue after 14 more days of inactivity.

illyaMs commented 1 month ago

Can we please consider reopening this one?

At my current company we'd really appreciate OpenShift support for Karpenter; we feel the combination of these two great projects could be fruitful and may address some of our crucial infrastructure needs.

mihaigalos commented 1 month ago

I'm also for reopening this issue.

OpenShift has offered support for kind: "ClusterAutoscaler" since at least v4.9.

Would it not be easier for Karpenter to dynamically generate that resource and apply it via the apiserver?
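
For context, a minimal sketch of the OpenShift resource being referred to; as far as I understand it is cluster-scoped and expected to be named default, and the limit value below is a placeholder:

```yaml
apiVersion: autoscaling.openshift.io/v1
kind: ClusterAutoscaler
metadata:
  name: default            # OpenShift expects a single instance named "default"
spec:
  resourceLimits:
    maxNodesTotal: 20       # placeholder limit
```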

sandeepkp1175 commented 1 month ago

@jigisha620 @jonathan-innis

We are seeing the error below while trying to run Karpenter in the OpenShift cluster. Could you please advise? Also, please help with reopening the GitHub issue.

{"level":"ERROR","time":"2024-05-28T20:49:19.963Z","logger":"controller.disruption","message":"listing instance types for default, resolving node class, EC2NodeClass.karpenter.k8s.aws \"default\" not found","commit":"17dd42b"} {"level":"ERROR","time":"2024-05-28T20:49:25.665Z","logger":"controller.provisioner","message":"skipping, unable to resolve instance types, resolving node class, EC2NodeClass.karpenter.k8s.aws \"default\" not found","commit":"17dd42b","nodepool":"default"} {"level":"INFO","time":"2024-05-28T20:49:25.812Z","logger":"controller.provisioner","message":"found provisionable pod(s)","commit":"17dd42b","pods":"podname/model-d4dnm","duration":"162.449739ms"} {"level":"ERROR","time":"2024-05-28T20:49:25.812Z","logger":"controller.provisioner","message":"Could not schedule pod, all available instance types exceed limits for nodepool: \"default\"","commit":"17dd42b","pod":"podname/model-d4dnm"} {"level":"ERROR","time":"2024-05-28T20:49:27.384Z","logger":"controller.provisioner","message":"skipping, unable to resolve instance types, resolving node class, EC2NodeClass.karpenter.k8s.aws \"default\" not found","commit":"17dd42b","nodepool":"default"}

sandeepkp1175 commented 1 month ago

Also, we want to know what will happen to the existing workloads running on the nodes provisioned by the OpenShift autoscaler.

engedaam commented 1 month ago

@sandeepkp1175 Do you have an EC2NodeClass defined in the cluster?
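
The error above means the NodePool references an EC2NodeClass named "default" that the controller cannot resolve. For comparison, a minimal EC2NodeClass sketch might look like this; the AMI family, role, and discovery tag values are placeholders, not values from this cluster:

```yaml
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default                        # must match the NodePool's nodeClassRef.name
spec:
  amiFamily: AL2                       # placeholder AMI family
  role: KarpenterNodeRole-my-cluster   # placeholder instance role
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: my-cluster   # placeholder discovery tag
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: my-cluster   # placeholder discovery tag
```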

sandeepkp1175 commented 1 month ago

Hello @engedaam

We have the following EC2NodeClass defined in the cluster:

```yaml
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  annotations:
    karpenter.k8s.aws/ec2nodeclass-hash: '12705821474257842030'
    karpenter.k8s.aws/ec2nodeclass-hash-version: v1
  creationTimestamp: '2024-05-29T21:02:26Z'
  finalizers:
```

sandeepkp1175 commented 1 month ago

The NodePool YAML is below. Also, kindly let us know what will happen to the existing nodes once Karpenter starts. Will they continue to run?

```yaml
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  annotations:
    karpenter.sh/nodepool-hash: '10536667676019604368'
    karpenter.sh/nodepool-hash-version: v1
```
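
For the node-class resolution error above, the relevant part of a NodePool is the nodeClassRef, which must name an existing EC2NodeClass; a minimal sketch is below, with the limit value as a placeholder only:

```yaml
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      nodeClassRef:
        apiVersion: karpenter.k8s.aws/v1beta1
        kind: EC2NodeClass
        name: default        # must match an existing EC2NodeClass
  limits:
    cpu: 100                 # placeholder; "exceed limits" errors relate to this section
```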

engedaam commented 1 month ago

@sandeepkp1175 Are you able to see a status section for the EC2NodeClass? It does not seem to be included.

sandeepkp1175 commented 1 month ago

Yes, we are able to see the status section. Please see below:

```yaml
status:
  amis:
```

sandeepkp1175 commented 3 weeks ago

Hi @jonathan-innis @jigisha620 @engedaam

Any thoughts on the above issue?

sandeepkp1175 commented 2 weeks ago

@jonathan-innis @jigisha620 Any findings on this issue?

github-actions[bot] commented 4 days ago

This issue has been inactive for 14 days. StaleBot will close this stale issue after 14 more days of inactivity.

sandeepkp1175 commented 4 days ago

Can we remove the stale lifecycle tag?