Closed: sameerjain1995 closed this 3 weeks ago
I'm seeing the same symptoms using a nearly identical setup.
One thing I noticed in the system logs on the node is that cluster networking comes up noticeably more slowly on the cluster with SELinux enabled.
Interesting. Do you see the same behavior on other distros, like the EKS Ubuntu AMI?
I think this is still happening. We recently moved our images to the CIS-hardened ones (which are based on the eks-optimized AL2 image): https://aws.amazon.com/marketplace/pp/prodview-kfjezhuetoa3e
Our cloud-init times went from 20s on eks-optimized to 300s on the CIS image. We even tried warm pools with the EBS volume pre-initialized, but that only brought it down to 250s. One thing we noticed is that the eks-optimized images have SELinux disabled while the CIS images have it enabled.
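For anyone comparing the two images, the SELinux mode can be confirmed directly on a node. This is a small diagnostic sketch (the `command -v` guard is only there so the snippet also runs on machines without the SELinux userland):

```shell
# Print the runtime SELinux mode: Enforcing, Permissive, or Disabled.
if command -v getenforce >/dev/null 2>&1; then
  getenforce
  # Persistent setting the image booted with:
  grep '^SELINUX=' /etc/selinux/config 2>/dev/null || true
else
  echo "SELinux tools not installed"
fi
```

On the eks-optimized AL2 image this should report Disabled; on the CIS image, Enforcing.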
Any ideas on how to speed up startup? Most of the time is lost in the /etc/eks/bootstrap.sh script, specifically at the end, where it creates files, symlinks units, etc. (Only pasting the last part of the logs, since the first part took less than a minute and this part took four.)
2024-04-26T01:10:00+0000 [private-dns-name] INFO: retrieved PrivateDnsName: ip-10-30-52-31.ec2.internal
+ echo ip-10-30-52-31.ec2.internal
+ exit 0
‘/etc/eks/containerd/containerd-config.toml’ -> ‘/etc/containerd/config.toml’
‘/etc/eks/containerd/sandbox-image.service’ -> ‘/etc/systemd/system/sandbox-image.service’
Created symlink from /etc/systemd/system/multi-user.target.wants/containerd.service to /usr/lib/systemd/system/containerd.service.
Created symlink from /etc/systemd/system/multi-user.target.wants/sandbox-image.service to /etc/systemd/system/sandbox-image.service.
‘/etc/eks/containerd/kubelet-containerd.service’ -> ‘/etc/systemd/system/kubelet.service’
Created symlink from /etc/systemd/system/multi-user.target.wants/kubelet.service to /etc/systemd/system/kubelet.service.
2024-04-26T01:14:10+0000 [eks-bootstrap] INFO: complete!
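To see where cloud-init itself is spending the time, its built-in analyzer can rank boot stages and modules by duration. A sketch, assuming a reasonably recent cloud-init on the node (the fallback branch is only so the snippet runs elsewhere too):

```shell
# Rank cloud-init config modules by wall-clock time, slowest first.
# Reads boot records from /var/log/cloud-init.log.
if command -v cloud-init >/dev/null 2>&1; then
  cloud-init analyze blame 2>/dev/null || echo "no cloud-init boot records found"
else
  echo "cloud-init not available"
fi
```

`cloud-init analyze show` gives the same data broken down per boot stage, which helps confirm whether the delay really sits inside the user-data/bootstrap step.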
I think this should be improved by https://github.com/awslabs/amazon-eks-ami/pull/1773. There's some kind of issue with sudo during cloud-init when SELinux is enabled that causes sudo to take >20 seconds per invocation.
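The per-invocation overhead is easy to quantify on an affected node. A minimal sketch (time_ms is a hypothetical helper written for this snippet, not part of bootstrap.sh):

```shell
# Hypothetical helper: wall-clock time of a command in milliseconds.
time_ms() {
  local start end
  start=$(date +%s%N)              # GNU date, nanosecond resolution
  "$@" >/dev/null 2>&1 || true     # ignore the command's own failure
  end=$(date +%s%N)
  echo $(( (end - start) / 1000000 ))
}

# Three no-op sudo calls; on an affected SELinux-enforcing node each
# was reported to take >20s, which multiplies across bootstrap's calls.
for i in 1 2 3; do
  echo "sudo invocation $i: $(time_ms sudo -n true) ms"
done
```

Comparing the numbers between an enforcing and a disabled node should make the regression obvious.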
Going to close this; #1773 seemed to do the trick 👍
Environment: production
EKS platform version (aws eks describe-cluster --name <name> --query cluster.platformVersion): 1.24
Kubernetes version (aws eks describe-cluster --name <name> --query cluster.version): 1.24
Kernel version (uname -a): 5.10.165-143.735.amzn2.x86_64

When we enable SELinux on our custom AMI, the worker node takes 6-7 minutes to reach the Ready state and have pods scheduled on it, but with SELinux disabled the node is ready in 1-2 minutes. Could you please let me know why SELinux is causing this delay and how we can reduce it?
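As a diagnostic (not a fix, since it defeats the point of enabling SELinux), you can confirm SELinux is the cause by putting the same node into permissive mode and re-timing startup. A quick runtime toggle, assuming root and the SELinux userland:

```shell
# Switch SELinux to permissive for this boot only: policy violations are
# logged to the audit log but not enforced. Revert with `setenforce 1`.
if command -v setenforce >/dev/null 2>&1; then
  setenforce 0 2>/dev/null && getenforce \
    || echo "could not change mode (not root, or SELinux disabled)"
else
  echo "SELinux tools not installed"
fi

# Any AVC denials hit during bootstrap show up here (assumes auditd):
grep -i 'avc: *denied' /var/log/audit/audit.log 2>/dev/null | tail -n 5 || true
```

If startup is fast in permissive mode and the audit log shows denials (or, per the discussion above, sudo is the slow path), that narrows the problem to the policy rather than to bootstrap.sh itself.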