trm109 opened this issue 3 weeks ago
/summarize awslabs amazon-eks-ami 1145
Here's a summary of the GitHub issue reply chain:
Issue: Kubernetes nodes on EKS clusters become unresponsive and lose readiness when applications consume too much memory, causing extended downtime. This doesn't occur on GKE clusters.
Key points:
Proposed solutions and workarounds:
Conclusion: While there's no official fix yet, AWS acknowledges the need to revise kubeReserved values. Users are advised to implement workarounds such as increasing reserved memory, properly configuring pod resources, and considering custom node logic. The issue remains open, with ongoing discussions about the best approach to resolve it permanently.
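One workaround mentioned above is increasing the memory the kubelet reserves for system daemons so the node stays responsive under memory pressure. A minimal sketch of that idea, assuming a self-managed node whose user data calls the AMI's /etc/eks/bootstrap.sh (the cluster name and the cpu/memory/eviction figures are placeholders, not values taken from the issue):

#!/bin/bash
# Illustrative user data for a self-managed EKS node (placeholder values).
# Reserves headroom for system daemons and sets a hard eviction threshold
# so the kubelet evicts pods before the node itself runs out of memory.
/etc/eks/bootstrap.sh my-cluster \
  --kubelet-extra-args '--kube-reserved=cpu=100m,memory=1Gi --eviction-hard=memory.available<300Mi'

The design trade-off is that reserving more memory shrinks what pods can schedule, but it shifts pressure handling to the kubelet's eviction path instead of the kernel OOM killer, which is what leaves the node unresponsive in the scenario described above.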
/summarize awslabs amazon-eks-ami 990
Here's a summary of the GitHub issue discussion:
Issue: Nodes occasionally fail to boot and get stuck in NotReady state.
Key details:
Likely root cause:
Potential factors:
Proposed solutions:
Next steps:
The issue remains open pending further investigation and fixes.
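Not part of the summary above, but as a practical aside: when a node is stuck in NotReady, the usual first step is to inspect its conditions and the kubelet logs. A short sketch (the node name is a placeholder):

# List nodes and their readiness status.
kubectl get nodes
# Inspect the Conditions block and recent events of the stuck node
# (ip-10-0-0-1.ec2.internal is a placeholder name).
kubectl describe node ip-10-0-0-1.ec2.internal
# On the node itself, check whether the kubelet came up cleanly.
journalctl -u kubelet --no-pager | tail -n 50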
/summarize abc=1 xyz 120
/summarize +owner testA +repo testB +issue 10
/summarize awslabs aws-shell 1
What happened:
What you expected to happen:
How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?:
Environment:
EKS Platform version (use aws eks describe-cluster --name <name> --query cluster.platformVersion):
Kubernetes version (use aws eks describe-cluster --name <name> --query cluster.version):
Kernel (e.g. uname -a):
Release information (run cat /etc/eks/release on a node):