Closed by visit1985 2 months ago
Thanks for your patience. We are continuing to track this investigation as part of https://github.com/aws/aws-app-mesh-roadmap/issues/489
Re-opening this issue since the fix hasn't been released yet. As an update, we experienced delays in our release and are currently working on a new release which will include this fix. Will share an update as soon as we have one.
Summary
We have workloads running on EKS Fargate with an aws-appmesh-envoy sidecar injected by the AWS App Mesh Controller. The appnet agent process (PID 1) has a nofile soft limit of 65535, while the forked envoy process has a nofile soft limit of only 1024.
This imposes a limit of roughly 480 TCP connections at most, since a file handle is created for each ingress/egress. Reaching the limit causes the envoy process to crash and be restarted by the appnet agent (#181), which causes an outage.
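For anyone who wants to confirm the mismatch on their own pods, here is a minimal sketch that reads the nofile limits from `/proc/<pid>/limits` and does the rough arithmetic behind the ~480 figure. The helper names are hypothetical. The default of 2 descriptors per connection follows the ingress/egress description above, and the 64 reserved descriptors are an assumption chosen only to match the observed number, not a measured value.

```python
from pathlib import Path
from typing import Optional, Tuple


def parse_nofile_limits(limits_text: str) -> Tuple[Optional[int], Optional[int]]:
    """Return (soft, hard) from the "Max open files" row of /proc/<pid>/limits.

    A value of None means the kernel reports "unlimited".
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Row layout: "Max open files <soft> <hard> files"
            fields = line.split()
            soft, hard = fields[3], fields[4]
            return tuple(None if v == "unlimited" else int(v) for v in (soft, hard))
    raise ValueError("no 'Max open files' row found")


def read_nofile_limits(pid="self") -> Tuple[Optional[int], Optional[int]]:
    """Read the nofile limits of a process (Linux only, needs a readable /proc)."""
    return parse_nofile_limits(Path(f"/proc/{pid}/limits").read_text())


def estimate_max_connections(soft_limit: int,
                             fds_per_connection: int = 2,
                             reserved_fds: int = 64) -> int:
    """Rough upper bound on proxied connections under a given soft limit.

    fds_per_connection=2 assumes one descriptor for ingress and one for egress.
    reserved_fds=64 is a hypothetical allowance for listeners, logs, etc.
    """
    return max(0, (soft_limit - reserved_fds) // fds_per_connection)


if __name__ == "__main__":
    try:
        soft, hard = read_nofile_limits("self")
        print(f"nofile soft={soft} hard={hard}")
        if soft is not None:
            print(f"estimated max connections: {estimate_max_connections(soft)}")
    except FileNotFoundError:
        print("/proc/self/limits not available on this system")
```

Calling `read_nofile_limits(1)` and `read_nofile_limits(<envoy pid>)` from inside the sidecar container should show the two different soft limits, assuming `/proc` is readable there.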
Steps to Reproduce
Please refer to support case 170713370901828 for this.
Are you currently working around this issue?
We are unable to work around this issue, because the appnet agent seems to be closed source.