Closed DavidNix closed 10 months ago
You know what, sometimes I think it's the ScheduledVolumeSnapshot taking down the pod. If there's been a problem for a while, the ScheduledVolumeSnapshot stays pending. As soon as the minimum number of pods is ready, it quickly deletes one to take the snapshot.
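To make the suspected interaction concrete, here is a minimal sketch of how two independent checks could each approve a pod deletion from the same view of the cluster. This is not the operator's code. `PodState`, `snapshot_would_delete`, `rollout_would_delete`, and the `min_ready` and `max_unavailable` parameters are all hypothetical names, and the rollout check is only an assumption about what a competing controller might do.

```python
from dataclasses import dataclass


@dataclass
class PodState:
    ready: int      # pods currently considered ready
    min_ready: int  # hypothetical threshold the snapshot logic waits for


def snapshot_would_delete(state: PodState) -> bool:
    """Suspected snapshot behavior: delete a pod once enough pods are ready."""
    return state.ready >= state.min_ready


def rollout_would_delete(state: PodState, total: int, max_unavailable: int) -> bool:
    """Hypothetical rollout check: delete while unavailable pods are under budget."""
    unavailable = total - state.ready
    return unavailable < max_unavailable


def deletions_in_same_pass(state: PodState, total: int, max_unavailable: int) -> int:
    # Both checks read the same state and neither accounts for the other's
    # pending deletion, so both can approve a deletion in the same pass.
    return int(snapshot_would_delete(state)) + int(
        rollout_would_delete(state, total, max_unavailable)
    )
```

With 3 of 3 pods ready, `min_ready=2`, and `max_unavailable=1`, both checks pass and two pods would be deleted in one pass, which matches the symptom. If that is really what happens, one possible fix is for both paths to count in-flight deletions against a single shared budget.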
It's gotten better, but I still see instances where more than one pod is deleted at once when only one should be deleted at a time.
I think this happens more on sentries where we've disabled readiness probes, but I've seen it once on a deployment where readiness probes were active.
I have yet to find a way to duplicate the issue reliably.