observatorium / thanos-receive-controller

Kubernetes controller to automatically configure Thanos receive hashrings
Apache License 2.0

Support updating the hashring during scaledown/disruptions #107

Closed philipgough closed 1 year ago

philipgough commented 1 year ago

This change allows the user, behind a flag, to have the hashring reflect the real-world set of replicas that currently exist in an operable state.
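To make the idea concrete, here is a minimal sketch of deriving hashring endpoints from only those replicas that are currently operable. It is not the controller's actual code: the `Pod` type, the `readyEndpoints` helper, the `name.service:port` endpoint format, and the service name and port used in `main` are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// Pod is a hypothetical, simplified stand-in for a Kubernetes pod.
// Ready mirrors the idea of a replica being in an operable state.
type Pod struct {
	Name  string
	Ready bool
}

// readyEndpoints returns one endpoint per ready pod and skips the rest.
// The "name.service:port" format is an assumption for this sketch.
// The result is sorted so the generated hashring is stable across runs.
func readyEndpoints(pods []Pod, service string, port int) []string {
	endpoints := make([]string, 0, len(pods))
	for _, p := range pods {
		if !p.Ready {
			continue // not operable right now: leave it out of the hashring
		}
		endpoints = append(endpoints, fmt.Sprintf("%s.%s:%d", p.Name, service, port))
	}
	sort.Strings(endpoints)
	return endpoints
}

func main() {
	pods := []Pod{
		{Name: "thanos-receive-0", Ready: true},
		{Name: "thanos-receive-1", Ready: false}, // disrupted or being scaled down
		{Name: "thanos-receive-2", Ready: true},
	}
	// Service name and port are illustrative values only.
	fmt.Println(readyEndpoints(pods, "thanos-receive", 10901))
}
```

Note that the sketch looks only at each replica's current state, so a scale down and a disruption produce the same hashring.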

A previous comment warned about the consequences of adjusting the hashring during scale down events. However, that view only makes sense under the assumption that the disruption is temporary or unintended. We believe there are some benefits to supporting this behaviour:

philipgough commented 1 year ago

Thanks @matej-g - I'll fix up based on your suggestions, but I also think that disruptions (not distributions, thanks autocomplete) are a known and well-documented concept in Kubernetes. See https://kubernetes.io/docs/concepts/workloads/pods/disruptions/ for example. PodDisruptionBudgets are then named accordingly, because they set the budget for such disruptions.

So I think in the end we have three things to reason about:

  1. Scale ups
  2. Voluntary/Involuntary disruptions
  3. Scale downs (we could, for simplicity, reason about this as a voluntary disruption)

That is why I named the flag as I did, because it supports the removal of replicas regardless of cause.

matej-g commented 1 year ago

Thanks for the explanation @PhilipGough, I stand corrected 🙇, this is the first time I'm learning about this. It still feels as though we're really operating on pod status rather than on disruptions (we don't know what is going on with the pod(s), or whether this is a voluntary or involuntary disruption, since we only know the pod's status). But since the concept does exist and is well understood, my point is even less important.