Closed nebm-aws closed 1 year ago
We're aware of this limitation. As you point out, reserved CPUs are intended for very specific circumstances, and kubelet/system cgroups are a more appropriate default. The bootstrap script doesn't currently make any attempt to prevent incompatible kubelet options, but feel free to open a PR.
@cartermckinnon I have made the changes to bootstrap.sh in my fork. You can see the differences here: https://github.com/awslabs/amazon-eks-ami/compare/master...raghs-aws:amazon-eks-ami:master
Shall I submit a PR to the main branch for approval?
Change doesn't look unreasonable to me, feel free to open a PR
Thanks @cartermckinnon. Created the PR: https://github.com/awslabs/amazon-eks-ami/pull/1405
Resolved by #1405.
What happened: Kubelet fails to start when the --reserved-cpus argument is passed, because it conflicts with systemReservedCgroup and kubeReservedCgroup, which are always set in the kubelet configuration file.
What you expected to happen: Allow use of the --reserved-cpus argument, since that option is required for telco NFV workloads. Per the official Kubernetes documentation:
This option is specifically designed for Telco/NFV use cases where uncontrolled interrupts/timers may impact the workload performance. you can use this option to define the explicit cpuset for the system/kubernetes daemons as well as the interrupts/timers, so the rest CPUs on the system can be used exclusively for workloads, with less impact from uncontrolled interrupts/timers.
How to reproduce it (as minimally and precisely as possible): Launch a self-managed EKS worker node group using the latest EKS AMI (post Jan 2023), which sets systemReservedCgroup and kubeReservedCgroup in the kubelet configuration file by default. Add the --reserved-cpus flag to BootstrapArguments for the node group. Kubelet will fail to start.
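For anyone who wants to check a node's settings before launching, here is a minimal sketch of the incompatibility described above. It is not the kubelet's own validation code. The function names and the example cgroup values ("/system", "/runtime") are assumptions for illustration only.

```python
import shlex

# Kubelet config keys reported in this issue as incompatible with --reserved-cpus.
CONFLICTING_KEYS = ("systemReservedCgroup", "kubeReservedCgroup")


def has_reserved_cpus(extra_args: str) -> bool:
    """Return True if the extra kubelet arguments include --reserved-cpus."""
    return any(
        token == "--reserved-cpus" or token.startswith("--reserved-cpus=")
        for token in shlex.split(extra_args)
    )


def find_conflicts(kubelet_config: dict, extra_args: str) -> list:
    """List the config keys that clash with --reserved-cpus, if it is set."""
    if not has_reserved_cpus(extra_args):
        return []
    return [key for key in CONFLICTING_KEYS if key in kubelet_config]


if __name__ == "__main__":
    # Hypothetical config resembling what the AMI writes by default.
    config = {
        "kind": "KubeletConfiguration",
        "systemReservedCgroup": "/system",
        "kubeReservedCgroup": "/runtime",
    }
    print(find_conflicts(config, "--reserved-cpus=0-1"))
```

A non-empty result means the kubelet would be started with both settings at once, which is the failing combination reported here.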
Anything else we need to know?: These are the lines in bootstrap.sh that prevent --reserved-cpus from being used: https://github.com/awslabs/amazon-eks-ami/blob/v20230127/files/bootstrap.sh#L504 https://github.com/awslabs/amazon-eks-ami/blob/v20230127/files/bootstrap.sh#L505
And this is the change that added them: https://github.com/awslabs/amazon-eks-ami/pull/1051/files
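One possible way to avoid the conflict is to drop the two cgroup settings whenever --reserved-cpus is requested. The sketch below illustrates that idea only. It is not necessarily how bootstrap.sh or any PR handles it, and the function name and sample values are hypothetical.

```python
import json
import shlex


def strip_reserved_cgroups(config_json: str, extra_args: str) -> str:
    """Remove the cgroup keys from the kubelet config if --reserved-cpus is set."""
    config = json.loads(config_json)
    # Match both "--reserved-cpus=0-1" and "--reserved-cpus 0-1" forms.
    uses_reserved_cpus = any(
        token.split("=", 1)[0] == "--reserved-cpus"
        for token in shlex.split(extra_args)
    )
    if uses_reserved_cpus:
        # pop() with a default tolerates configs that never had these keys.
        config.pop("systemReservedCgroup", None)
        config.pop("kubeReservedCgroup", None)
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # Hypothetical config contents; the real file has many more fields.
    original = json.dumps({
        "kind": "KubeletConfiguration",
        "systemReservedCgroup": "/system",
        "kubeReservedCgroup": "/runtime",
    })
    print(strip_reserved_cgroups(original, "--reserved-cpus=0-1"))
```

When the flag is absent, the configuration is returned with the default cgroup settings intact, so nodes that do not use reserved CPUs keep the existing behavior.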
Environment:
EKS Platform version (use aws eks describe-cluster --name <name> --query cluster.platformVersion): Any version using containerd rather than Docker as the runtime.