aws / containers-roadmap

This is the public roadmap for AWS container services (ECS, ECR, Fargate, and EKS).
https://aws.amazon.com/about-aws/whats-new/containers/

[EKS] [request]: Cloudwatch Observability addon's default tolerations are too broad #2462

Open voidlily opened 3 weeks ago

voidlily commented 3 weeks ago


Tell us about your request

The default tolerations for the EKS CloudWatch Observability addon are too broad:

  tolerations:
  - operator: Exists

I can understand these tolerations for the DaemonSet pods (the CloudWatch agent and Fluent Bit), but the same toleration also applies to the controller-manager pod, which is managed by a Deployment with a replica count of 1.

Which service(s) is this request for?

EKS

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?

Because the controller-manager pod tolerates every taint, including the node.kubernetes.io/unschedulable taint that a cordoned node carries, the scheduler can sometimes place the replacement pod back on the very node that is being drained.
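
To illustrate why this happens, here is a rough Python sketch of how a toleration is matched against a taint. It is a simplified model, not the scheduler's actual code, and the names `Taint`, `Toleration`, `tolerates`, and `can_schedule` are hypothetical. It assumes the node being drained has been cordoned and therefore carries the `node.kubernetes.io/unschedulable:NoSchedule` taint. The narrower toleration shown is only an example for comparison, not a proposed default.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Taint:
    key: str
    effect: str
    value: str = ""


@dataclass(frozen=True)
class Toleration:
    key: str = ""          # empty key (with Exists) matches every taint key
    operator: str = "Equal"
    value: str = ""
    effect: str = ""       # empty effect matches every taint effect


def tolerates(tol: Toleration, taint: Taint) -> bool:
    """Simplified model of Kubernetes toleration matching."""
    if tol.effect and tol.effect != taint.effect:
        return False
    if tol.key and tol.key != taint.key:
        return False
    if tol.operator == "Exists":
        return True
    if tol.operator == "Equal":
        return tol.value == taint.value
    return False


def can_schedule(tolerations: Iterable[Toleration], taints: Iterable[Taint]) -> bool:
    """A pod fits only if every hard taint on the node is tolerated."""
    tolerations = list(tolerations)
    return all(
        any(tolerates(tol, taint) for tol in tolerations)
        for taint in taints
        if taint.effect in ("NoSchedule", "NoExecute")
    )


# Assumed taint on a cordoned (draining) node.
cordoned = [Taint(key="node.kubernetes.io/unschedulable", effect="NoSchedule")]

# The addon's default: "- operator: Exists" with no key and no effect.
broad = [Toleration(operator="Exists")]

# A narrower, purely illustrative toleration.
narrow = [Toleration(key="node.kubernetes.io/not-ready",
                     operator="Exists", effect="NoExecute")]

print(can_schedule(broad, cordoned))   # True: pod may land on the draining node
print(can_schedule(narrow, cordoned))  # False: pod is kept off the draining node
```

With the catch-all toleration the cordon taint is no obstacle, so a Deployment with a single replica can be rescheduled onto the node it was just evicted from.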

Are you currently working around this issue?

Currently working around this by overriding the annotations, or by simply waiting until the node is hard-terminated.
