observatorium / thanos-receive-controller

Kubernetes controller to automatically configure Thanos receive hashrings
Apache License 2.0

Issues updating Pod annotation during reconciliation #112

Open philipgough opened 1 year ago

philipgough commented 1 year ago

Rolled out 02aec09ce44b2f26ec9364469c2c6396f58702eb and set the boolean flag --annotate-pods-on-change. I see the following log:

level=error caller=main.go:744 ts=2023-03-30T14:55:18.653695011Z msg="failed to update pod" err="Operation cannot be fulfilled on pods \"observatorium-thanos-receive-default-0\": the object has been modified; please apply your changes to the latest version and try again"
philipgough commented 1 year ago

Just to add: I don't think this is really going to cause any problems. It is more of a reminder to myself to take a look at the reconciliation loop and make sure we are handling errors correctly. The reason for annotating the Pod is to ensure that the receivers pick up the latest changes to the hashring as promptly as possible. A conflict should therefore not concern us, since we can assume that what was intended has already happened, albeit by another mechanism.

tekicode commented 1 year ago

My observation is that this happens when Pods are in a transitional state: the Pod changes state during c.waitForPod, so the annotation update fails. It also only happens when the StatefulSet replica count changes (not when a Pod terminates or restarts), as the controller does not respond to terminating Pods.

You are correct that the targeted Pods receive the correct hashring (by virtue of the fact that they are starting), so long as it is a scale-up situation.