[Fargate] [request]: Fargate CPU Scheduling issues

rjclarke7 commented 2 years ago

Community Note

Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
If you are interested in working on this issue or have submitted a pull request, please leave a comment

Tell us about your request

When deployments are scheduled exclusively on EKS Fargate, pods sometimes fail to be created. This seems to happen because the Fargate scheduler sizes its capacity from the 'request' parameter instead of the 'limit'; the failure reproduces whenever the limit is larger than the request.

Can Fargate be configured such that it allocates enough capacity for the 'limit' parameter of deployments instead of the 'request' when one is present?

Which service(s) is this request for?

Fargate with EKS

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?

Error: failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error setting cgroup config for procHooks process: failed to write "30000": write /sys/fs/cgroup/cpu,cpuacct/kubepods/burstable/podfc67fe40-da5c-4dbc-8989-7be9ac4b5d55/operator/cpu.cfs_quota_us: invalid argument: unknown

The above error appears in the logs (with variations in the value that fails to be written and in the pod identifier) when attempting to schedule pods on EKS Fargate. The issue is related to CPU requests versus limits and occurs whenever the 'limit' is higher than the 'request', as is the case in many packaged Kubernetes deployments. Setting the request equal to the limit is a viable workaround, but it is a manual step that should not be necessary.
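For anyone trying to read the error: the value that runc fails to write to cpu.cfs_quota_us is the container's CPU limit expressed as a CFS quota in microseconds per scheduling period. The sketch below shows the conversion, assuming the default 100 ms (100000 µs) CFS period; the function names are made up for illustration and are not part of any Kubernetes or runc API.

```python
# Assumption: the kubelet is using the default CFS period of 100 ms.
CFS_PERIOD_US = 100_000


def quota_to_millicores(quota_us: int, period_us: int = CFS_PERIOD_US) -> int:
    """Convert a cpu.cfs_quota_us value into a CPU limit in millicores."""
    return quota_us * 1000 // period_us


def millicores_to_quota(millicores: int, period_us: int = CFS_PERIOD_US) -> int:
    """Convert a CPU limit in millicores into a cpu.cfs_quota_us value."""
    return millicores * period_us // 1000


# The "30000" in the error above corresponds to a 300m CPU limit.
print(quota_to_millicores(30000))  # 300
```

Under that assumption, the "30000" in the log means the failing container had a CPU limit of 300m, which helps identify which container's resources block to adjust.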

Are you currently working around this issue?

When the CPU and memory 'request' parameters in the Kubernetes deployment are set to match the 'limit' parameters, the pods can be scheduled and run. AWS Support confirmed that this is the expected solution.
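A minimal sketch of that workaround, for cases where the manifest is already loaded as a Python dict (for example from a rendered chart). The helper name and the example values are hypothetical; only the container name "operator" comes from the error path above. It copies each container's CPU and memory limit into the matching request and leaves containers without limits untouched.

```python
import copy


def align_requests_with_limits(deployment: dict) -> dict:
    """Return a copy of a Deployment manifest with requests set equal to limits."""
    patched = copy.deepcopy(deployment)  # do not mutate the caller's manifest
    pod_spec = patched["spec"]["template"]["spec"]
    for container in pod_spec.get("containers", []):
        resources = container.get("resources") or {}
        limits = resources.get("limits") or {}
        if not limits:
            continue  # nothing to align for this container
        requests = resources.setdefault("requests", {})
        for name in ("cpu", "memory"):
            if name in limits:
                requests[name] = limits[name]
    return patched


# Hypothetical manifest fragment: the request is lower than the limit.
example = {
    "spec": {"template": {"spec": {"containers": [
        {
            "name": "operator",
            "resources": {
                "requests": {"cpu": "100m", "memory": "128Mi"},
                "limits": {"cpu": "300m", "memory": "256Mi"},
            },
        }
    ]}}}
}

patched = align_requests_with_limits(example)
print(patched["spec"]["template"]["spec"]["containers"][0]["resources"]["requests"])
```

This sketch only looks at spec.template.spec.containers; init containers, if present, would need the same treatment.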

Additional context

This behaviour was seen on multiple, unrelated EKS clusters.

rjclarke7 commented 2 years ago

I have updated this request based on feedback from AWS Support, who confirmed that the expected solution is to manually set the CPU 'request' to match the 'limit'. Many prepackaged Kubernetes solutions set a CPU request lower than the limit, so this feels like an unnecessary manual step. It could be avoided if Fargate scheduling respected the CPU 'limit' when one is present, instead of the 'request'.

rjclarke7 commented 2 years ago

Additional feedback: this happens because a pod's QoS class must be Guaranteed when it is scheduled on Fargate. Guaranteed has several requirements, including that each container's CPU limit must equal its CPU request. https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod/
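To make the QoS rules from that page concrete, here is a rough sketch of how the class is derived from the containers' resources. It is a simplification written for this thread, not the kubelet's implementation: the function name is hypothetical, and quantities are compared as strings, so "1" and "1000m" would wrongly count as different.

```python
def qos_class(containers: list[dict]) -> str:
    """Classify a pod as Guaranteed, Burstable, or BestEffort (simplified)."""
    any_set = False      # does any container set any CPU/memory value?
    guaranteed = True    # does every container have limit == request for both?
    for container in containers:
        resources = container.get("resources") or {}
        limits = resources.get("limits") or {}
        requests = resources.get("requests") or {}
        for name in ("cpu", "memory"):
            limit = limits.get(name)
            # When only a limit is given, the request defaults to the limit.
            request = requests.get(name, limit)
            if limit is not None or request is not None:
                any_set = True
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


# Hypothetical values: request below limit, as in many packaged deployments.
burstable = [{"resources": {"requests": {"cpu": "100m", "memory": "256Mi"},
                            "limits": {"cpu": "300m", "memory": "256Mi"}}}]
print(qos_class(burstable))  # Burstable
```

This matches the error in the original report: the cgroup path contains kubepods/burstable, which is where the kubelet places pods classified as Burstable.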

I'm still looking into why pods are required to be Guaranteed QoS on Fargate. I imagine the 1-1 pod/node mapping is the key factor, but I don't understand why.

aws / containers-roadmap

[Fargate] [request]: Fargate CPU Scheduling issues #1864
