At some point a bug was introduced: once the operator had decided to delete a node, an error while moving replicas off that node made it likely that the operator would delete the pod anyway, without retrying the replica move.
I've restructured the logic that fetches cluster state information, so that it's very clear where the information comes from when these choices are made.
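
To illustrate the intended behavior, here is a minimal sketch in Python. It is not the operator's actual code: `Action`, `NodeState`, `reconcile_node_deletion`, and the `fetch_node_state` / `move_replicas` callbacks are hypothetical names chosen for this example. The idea is that the pod is deleted only when freshly fetched state shows no replicas left on the node, and a failed move leads to a retry rather than a deletion.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Action(Enum):
    """Hypothetical outcomes of one reconcile pass."""
    REQUEUE = "requeue"        # run the reconcile again later
    DELETE_POD = "delete_pod"  # the node is empty, so the pod can go


@dataclass
class NodeState:
    """Hypothetical snapshot of one node, fetched fresh on every pass."""
    name: str
    replica_count: int


def reconcile_node_deletion(
    node_name: str,
    fetch_node_state: Callable[[str], NodeState],
    move_replicas: Callable[[str], None],
) -> Action:
    # Single, explicit source of truth: the state is fetched here and
    # nowhere else, so the decision below never uses stale information.
    state = fetch_node_state(node_name)

    if state.replica_count == 0:
        return Action.DELETE_POD

    try:
        move_replicas(node_name)
    except Exception:
        # The move failed: do NOT delete the pod. Requeue so that the
        # next pass retries the move.
        return Action.REQUEUE

    # The move was requested; confirm with fresh state on the next pass
    # before deleting anything.
    return Action.REQUEUE
```

With this shape, a node whose replica move keeps failing is requeued indefinitely, and the pod is removed only on a pass where the fetched state reports zero replicas.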