aws / amazon-ecs-ami

Packer recipes for building the official ECS-optimized Amazon Linux AMIs
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-optimized_AMI.html
Apache License 2.0

Consider matching EKS AMI's sysctl config #251

Open isker opened 4 months ago

isker commented 4 months ago

Summary

ECS AMIs do not configure some kernel sysctl parameters that they probably ought to.

Description

When migrating a workload from EKS to ECS on EC2, I was vexed by an obscure memory allocation error in NodeJS that was ultimately caused by the kernel's default sysctl value for vm.max_map_count being too low. The default value is 65530.

The EKS configuration is a good reference for these: https://github.com/awslabs/amazon-eks-ami/blob/b15c2b75eb95dfd4db18b446a9dcd923ca23a861/templates/al2023/runtime/rootfs/etc/sysctl.conf

It configures vm.max_map_count=524288, which we were unknowingly relying upon. We had to spend a lot of time and effort diagnosing the problem, and we are now setting the value in our ASGs' launch templates.
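For anyone who needs the same workaround, here is a minimal sketch of how the launch-template user data could be generated. It assumes the instance runs user data as root at boot and that `sysctl --system` picks up drop-ins from `/etc/sysctl.d`. The drop-in file name and the function names are hypothetical, chosen for illustration; this is not what ECS or EKS ships.

```python
import base64

# Value taken from the EKS sysctl.conf referenced in this issue.
MAX_MAP_COUNT = 524288

# Hypothetical drop-in path; any *.conf name under /etc/sysctl.d should work.
CONF_PATH = "/etc/sysctl.d/99-max-map-count.conf"


def render_user_data(max_map_count: int = MAX_MAP_COUNT) -> str:
    """Return a shell script that persists and applies vm.max_map_count."""
    lines = [
        "#!/bin/bash",
        # Persist the setting so it survives reboots.
        f"echo 'vm.max_map_count={max_map_count}' > {CONF_PATH}",
        # Reload all sysctl configuration files, including the new drop-in.
        "sysctl --system",
        "",
    ]
    return "\n".join(lines)


def encode_for_launch_template(script: str) -> str:
    """Base64-encode the script, as launch template user data expects."""
    return base64.b64encode(script.encode("utf-8")).decode("ascii")


if __name__ == "__main__":
    print(render_user_data())
```

The encoded string would go into the launch template's user data field. If the template already carries other user data, the two generated lines can be appended to that script instead.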

The default kernel value dates from at least 19 years ago (more or less; they fudged it down a tiny bit later). It is probably not well-tuned for modern hardware.

A quick search for vm.max_map_count shows that NodeJS is not the only software that can be foiled by this value. This is probably why it is configured in EKS in the first place.

Here's where EKS bumped it up: https://github.com/awslabs/amazon-eks-ami/pull/589.

isker commented 4 months ago

Kernel documentation for this parameter: https://www.kernel.org/doc/html/latest/admin-guide/sysctl/vm.html?highlight=max_map_count#max-map-count
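To check what a given host is actually running with, the value can be read from `/proc/sys/vm/max_map_count`. A rough sketch follows; the function names are made up for illustration, and the 524288 threshold is simply the EKS value mentioned in the issue description, not an official recommendation.

```python
from pathlib import Path

# Value configured by the EKS AMI, used here only as a comparison point.
EKS_MAX_MAP_COUNT = 524288

# Standard procfs location for this sysctl on Linux.
PROC_PATH = Path("/proc/sys/vm/max_map_count")


def parse_sysctl_value(text: str) -> int:
    """Parse the contents of a single-integer sysctl file."""
    return int(text.strip())


def below_eks_value(current: int, threshold: int = EKS_MAX_MAP_COUNT) -> bool:
    """Return True when the current limit is lower than the EKS setting."""
    return current < threshold


if __name__ == "__main__":
    if PROC_PATH.exists():
        current = parse_sysctl_value(PROC_PATH.read_text())
        print(f"vm.max_map_count = {current}")
        if below_eks_value(current):
            print(f"lower than the EKS value of {EKS_MAX_MAP_COUNT}")
    else:
        print(f"{PROC_PATH} not found; is this a Linux host?")
```

Running this inside a container should report the host's value, since as far as I know this sysctl is not namespaced.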

I ultimately found out about it by reading the malloc manpage: https://www.man7.org/linux/man-pages/man3/malloc.3.html