observatorium / thanos-receive-controller

Kubernetes controller to automatically configure Thanos receive hashrings
Apache License 2.0

--allow-dynamic-scaling does not respond to pod disruptions #123

Open tekicode opened 12 months ago

tekicode commented 12 months ago

In the readme about --allow-dynamic-scaling:

By default, the controller does not react to voluntary/involuntary disruptions to receiver replicas in the StatefulSet. This flag allows the user to enable this behavior. When enabled, the controller will react to voluntary/involuntary disruptions to receiver replicas in the StatefulSet. When a Pod is marked for termination, the controller will remove it from the hashring and the replica essentially becomes a "router" for the hashring. When a Pod is deleted, the controller will remove it from the hashring. When a Pod becomes unready, the controller will remove it from the hashring. This behaviour can be considered for use alongside the Ketama hashing algorithm.

The two highlighted lines are incorrect: the controller does not have a podInformer subscribed to updates from the pods associated with the hashring.

As such, the --allow-dynamic-scaling flag only responds to changes in the StatefulSet's replica count. Those changes are only observed when the StatefulSet object itself is updated, which is independent of the health of the individual pods.

I've explored adding a podInformer, updating the configmapInformer, and reworking the logic around how pods are chosen while keeping backwards compatibility.
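For context, here is a minimal sketch of what a pod informer could look like with client-go. The function name, label selector, and the enqueueSync callback are illustrative assumptions, not existing controller code:

```go
package controller

import (
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
)

// watchReceivePods subscribes to pod events for the receive replicas and calls
// enqueueSync whenever a pod is added, updated (e.g. its readiness flips), or
// deleted, so the existing sync loop can regenerate the hashring ConfigMap.
func watchReceivePods(client kubernetes.Interface, namespace string, stopCh <-chan struct{}, enqueueSync func()) {
	factory := informers.NewSharedInformerFactoryWithOptions(
		client,
		5*time.Minute, // resync period; value is illustrative
		informers.WithNamespace(namespace),
		informers.WithTweakListOptions(func(o *metav1.ListOptions) {
			// Assumed label selector; the real controller would derive this
			// from the StatefulSets it already watches.
			o.LabelSelector = "app.kubernetes.io/name=thanos-receive"
		}),
	)

	podInformer := factory.Core().V1().Pods().Informer()
	podInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc:    func(obj interface{}) { enqueueSync() },
		UpdateFunc: func(oldObj, newObj interface{}) { enqueueSync() },
		DeleteFunc: func(obj interface{}) { enqueueSync() },
	})

	factory.Start(stopCh)
	factory.WaitForCacheSync(stopCh)
}
```

Enqueueing on every pod event and letting the sync loop decide membership keeps the informer simple and matches the usual controller pattern, though filtering updates to readiness/termination changes would cut down on redundant syncs.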

That said, I've seen a lot of previous discussion and issues about this and related problems. Is this seen as a problem? (It is to me.) If so, what opinions do others have on how the controller should behave in this situation?

christopherzli commented 8 months ago

I am also looking into this and wonder if there has been any follow-up?

chit786 commented 3 months ago

+1 on this feature. We also encountered a situation where a receiver waiting on WAL replay was still not moved out of the hashring, leading to writes failing with:

 msg="failed to handle request" err="get appender: TSDB not ready" 

I wonder if the pod's restartPolicy could be utilised to mark it as "Terminating", and then the controller could pick it up from there?
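For what it's worth, a pod's "Terminating" state is visible on the API object as a non-nil metadata.deletionTimestamp (restartPolicy does not affect that signal), so a pod-aware controller could combine it with the Ready condition to cover both the terminating and the WAL-replay (unready) cases. A rough sketch, where shouldBeInHashring is a hypothetical helper rather than anything the controller exposes today:

```go
package controller

import (
	corev1 "k8s.io/api/core/v1"
)

// shouldBeInHashring is a hypothetical membership check: keep a pod in the
// hashring only while it is not terminating and reports Ready. A replica still
// replaying its WAL (readiness probe failing with "TSDB not ready") would be
// excluded until it becomes Ready again.
func shouldBeInHashring(pod *corev1.Pod) bool {
	// A non-nil DeletionTimestamp is how "Terminating" appears on the API object.
	if pod.DeletionTimestamp != nil {
		return false
	}
	for _, cond := range pod.Status.Conditions {
		if cond.Type == corev1.PodReady {
			return cond.Status == corev1.ConditionTrue
		}
	}
	return false
}
```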