Open philipgough opened 1 year ago
Just to add, I don't think this is really going to cause any problems. It is more of a reminder to myself to take a look at the reconciliation loop and ensure we are handling errors correctly. The reason for the Pod annotation is to ensure that the receivers pick up the latest changes to the hashring as promptly as possible. A conflict should therefore not concern us, since we can assume that what was intended has already happened, albeit by another mechanism.
My observation is that this happens when Pods are in a transitional state: the Pod changes state during c.waitForPod, so the annotation fails. It also only happens when the StatefulSet replica count changes (not when a Pod terminates or restarts), as the controller does not respond to terminating Pods.
You are correct that the targeted Pods receive the correct hashring (by virtue of the fact that they are starting), so long as it is a scale-up situation.
Rolled out 02aec09ce44b2f26ec9364469c2c6396f58702eb and configured the bool flag --annotate-pods-on-change.
I see the following log