Closed mohabusama closed 6 years ago
This is just another case of the cluster autoscaler not prioritizing which nodes to terminate.
If the cluster autoscaler preferred, when scaling down, nodes without StatefulSets or nodes whose removal would not impact pod disruption budgets, this could easily be prevented.
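To make the suggested preference concrete, here is a minimal sketch of such an ordering. It is not how the cluster autoscaler actually picks nodes; the `Pod` and `Node` types, the field names, and the ranking rule are all hypothetical and only illustrate the idea of draining "cheap" nodes first.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Pod:
    name: str
    owner_kind: str  # e.g. "Deployment" or "StatefulSet" (hypothetical field)
    # Disruptions the pod's PDB currently allows; None means no PDB covers it.
    pdb_allowed_disruptions: Optional[int] = None


@dataclass
class Node:
    name: str
    pods: List[Pod] = field(default_factory=list)


def scale_down_penalty(node: Node) -> Tuple[int, int, int]:
    """Lower tuples sort first, i.e. are preferred for termination."""
    # Pods whose PDB allows no further disruption right now.
    pdb_blocked = sum(1 for p in node.pods if p.pdb_allowed_disruptions == 0)
    # Pods owned by a StatefulSet.
    stateful = sum(1 for p in node.pods if p.owner_kind == "StatefulSet")
    return (pdb_blocked, stateful, len(node.pods))


def rank_scale_down_candidates(nodes: List[Node]) -> List[Node]:
    """Return nodes ordered from most to least preferred for scale-down."""
    return sorted(nodes, key=scale_down_penalty)


# Hypothetical cluster: node and pod names are made up for illustration.
nodes = [
    Node("node-a", [Pod("redis-0", "StatefulSet", pdb_allowed_disruptions=0)]),
    Node("node-b", [Pod("web-1", "Deployment"), Pod("web-2", "Deployment")]),
    Node("node-c", [Pod("db-0", "StatefulSet")]),
]
ranked_names = [n.name for n in rank_scale_down_candidates(nodes)]
```

With this ordering, a node hosting only Deployment pods is drained before a node hosting a StatefulSet pod, and a node with a pod whose disruption budget is exhausted comes last.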
IMHO this is working as designed in Kubernetes: Pods can be terminated at any time. Systems without an operator that takes ownership of failover to a replica (similar to https://github.com/zalando-incubator/postgres-operator) are a bug in themselves if they cannot run with more than 2 replicas and need a single write master. Therefore I am closing this, because it has to be solved by the Redis application owner.
Since `zmon-redis` is running as a single pod, there is a chance that the pod gets rescheduled (e.g. by autoscaling), which in turn leads to some unexpected behavior:

We have various alternatives afaik: