awslabs / amazon-eks-ami

Packer configuration for building a custom EKS AMI
https://awslabs.github.io/amazon-eks-ami/
MIT No Attribution

SELinux causing high boot time for eks worker node #1394

Closed sameerjain1995 closed 3 weeks ago

sameerjain1995 commented 1 year ago

Environment: production

nafhn commented 1 year ago

I'm seeing the same symptoms using a nearly identical setup.

Something I noticed looking through system logs on the node is that cluster networking seems to come up quite slowly for the cluster with SELinux enabled.

cartermckinnon commented 1 year ago

Interesting. Do you see the same behavior on other distros, like the EKS Ubuntu AMI?

wvidana commented 4 months ago

I think this is still happening. We recently moved our images to the CIS hardened ones (which are based off the eks-optimized image on AL2) https://aws.amazon.com/marketplace/pp/prodview-kfjezhuetoa3e

Our cloud-init times went from 20s on eks-optimized to 300s on the CIS image. We even tried warm-pools initializing the EBS volume, but it only went down to 250s. One of the things that we noticed is that the eks-optimized images have SELinux disabled and the CIS images have it enabled.

Any ideas on how to speed up startup? Most of the time is lost in the /etc/eks/bootstrap.sh script, specifically at the end of the script, where it is creating files and symlinks. (Only pasting the last part of the logs, since the first part took less than 1 minute and this part took 4.)

2024-04-26T01:10:00+0000 [private-dns-name] INFO: retrieved PrivateDnsName: ip-10-30-52-31.ec2.internal
+ echo ip-10-30-52-31.ec2.internal
+ exit 0
‘/etc/eks/containerd/containerd-config.toml’ -> ‘/etc/containerd/config.toml’
‘/etc/eks/containerd/sandbox-image.service’ -> ‘/etc/systemd/system/sandbox-image.service’
Created symlink from /etc/systemd/system/multi-user.target.wants/containerd.service to /usr/lib/systemd/system/containerd.service.
Created symlink from /etc/systemd/system/multi-user.target.wants/sandbox-image.service to /etc/systemd/system/sandbox-image.service.
‘/etc/eks/containerd/kubelet-containerd.service’ -> ‘/etc/systemd/system/kubelet.service’
Created symlink from /etc/systemd/system/multi-user.target.wants/kubelet.service to /etc/systemd/system/kubelet.service.
2024-04-26T01:14:10+0000 [eks-bootstrap] INFO: complete!
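
To put a number on the stall in the excerpt above, here is a minimal Python sketch that finds the largest gap between consecutive timestamped lines. The helper names (`parse_timestamps`, `largest_gap`) are hypothetical and not part of the AMI tooling, and it assumes each timestamped line starts with the `YYYY-MM-DDTHH:MM:SS+ZZZZ` format shown in the log.

```python
import re
from datetime import datetime

# Assumption: timestamped lines begin with e.g. 2024-04-26T01:10:00+0000
TS = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}[+-]\d{4})\s")


def parse_timestamps(lines):
    """Return the leading timestamps of all lines that have one."""
    stamps = []
    for line in lines:
        m = TS.match(line)
        if m:
            stamps.append(datetime.strptime(m.group(1), "%Y-%m-%dT%H:%M:%S%z"))
    return stamps


def largest_gap(lines):
    """Largest gap in seconds between consecutive timestamped lines."""
    stamps = parse_timestamps(lines)
    gaps = [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]
    return max(gaps) if gaps else 0.0


# The two timestamped lines from the excerpt, plus one untimestamped line.
LOG = [
    "2024-04-26T01:10:00+0000 [private-dns-name] INFO: retrieved PrivateDnsName: ip-10-30-52-31.ec2.internal",
    "+ exit 0",
    "2024-04-26T01:14:10+0000 [eks-bootstrap] INFO: complete!",
]

if __name__ == "__main__":
    print(largest_gap(LOG))  # 250.0 seconds, i.e. 4 min 10 s
```

Only two lines in this excerpt carry timestamps, so this bounds the slow section but does not say which individual step inside it is slow.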

cartermckinnon commented 4 months ago

I think this should be improved by https://github.com/awslabs/amazon-eks-ami/pull/1773. There's some kind of issue with sudo during cloud-init when SELinux is enabled that causes sudo to take >20 seconds per invocation.
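
For anyone wanting to check whether they're hitting the same per-invocation delay, a rough sketch of timing a command from Python follows. `time_invocation` is a hypothetical helper, not something shipped in the AMI, and the demo times a no-op Python process so that it runs anywhere. On an affected node you would pass something like `["sudo", "-n", "true"]` instead, ideally from a script that runs during cloud-init, since that is where the slowdown was reported.

```python
import subprocess
import sys
import time


def time_invocation(cmd, runs=3):
    """Run cmd `runs` times and return the wall-clock duration of each run in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=False: we only care about how long the call takes, not its exit code
        subprocess.run(cmd, check=False, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


if __name__ == "__main__":
    # Stand-in command so the sketch runs without sudo.
    # Assumption: on a node, replace with ["sudo", "-n", "true"] (-n = non-interactive).
    for i, seconds in enumerate(time_invocation([sys.executable, "-c", "pass"]), start=1):
        print(f"run {i}: {seconds:.2f}s")
```

If each run takes tens of seconds rather than milliseconds, that would be consistent with the sudo issue described above.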

cartermckinnon commented 3 weeks ago

Going to close this, #1773 seemed to do the trick 👍