Open Jasper-Ben opened 11 months ago
These exceptions should cause an uncordon on the affected nodes:
https://github.com/deinstapel/eks-rolling-update/blob/master/eksrollup/lib/k8s.py#L195-L198
@martin31821 please look into it. Thx :slightly_smiling_face:
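To illustrate the requested behaviour, here is a minimal sketch of an "uncordon on failure" wrapper. It is not the code from `k8s.py`. The function name `drain_or_uncordon` and the injected `drain` and `uncordon` callables are hypothetical, chosen only so the example is self-contained.

```python
from typing import Callable


def drain_or_uncordon(
    node: str,
    drain: Callable[[str], None],
    uncordon: Callable[[str], None],
) -> None:
    """Drain `node`; if draining raises, uncordon the node and re-raise.

    `drain` and `uncordon` are hypothetical callables supplied by the caller;
    they stand in for whatever the tool actually uses to talk to the cluster.
    """
    try:
        drain(node)
    except Exception:
        # The node may already be cordoned at this point, so make it
        # schedulable again before propagating the original error.
        try:
            uncordon(node)
        except Exception as err:
            # Best effort: report the rollback failure, keep the original error.
            print(f"failed to uncordon {node}: {err}")
        raise
```

Against a real cluster, `uncordon` could for example shell out to `kubectl uncordon <node>` or patch the node with `{"spec": {"unschedulable": False}}` through the Kubernetes Python client. Which of these fits best depends on how the tool already drains nodes.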
When an eks-rolling-update job fails, the previous cluster state is not restored automatically, and manual intervention is required:
Most notably, auto-scaling is left scaled to 0. This is a problem because our workloads (especially CI) depend heavily on functioning auto-scaling.