grafana / mimir

Grafana Mimir provides horizontally scalable, highly available, multi-tenant, long-term storage for Prometheus.
https://grafana.com/oss/mimir/
GNU Affero General Public License v3.0

Ingester zonal disruptions #9908

Open nullren opened 5 hours ago

nullren commented 5 hours ago

Is your feature request related to a problem? Please describe.

When deploying Mimir to K8s, Pod Disruption Budgets (PDBs) are created for some pod types (distributors, ingesters, etc.). However, they tend to be too restrictive: I think they allow only 1 disruption at a time.

Because metrics are replicated across zones, disrupting more pods at once should be safe as long as they are all in the same zone, but there isn't a clear way to define a PDB that allows for that.

Describe the solution you'd like

It would be nice if there was some way to have a "high level PDB" where zones can be disrupted. A "zone" would be "healthy" or "up" if all pods in that zone are healthy/up. So, a disrupted zone would be one where at least 1 pod is unhealthy.

What that might enable is something like a "ZDB" (zone disruption budget) with a rule that a majority of zones must be available/undisrupted. This would allow you to disrupt a single zone (e.g., all pods in that zone). It would speed up draining k8s nodes, since you can safely disrupt 1/3 of the total pods, which is really important/helpful when running many pods.

This might be accomplished via some sort of controller/operator.
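
To make the rule concrete, here is a rough sketch of the check such a controller might apply before allowing an eviction. It is only an illustration of the idea described above, not an existing Mimir, rollout-operator, or Kubernetes API; `Pod`, `disrupted_zones`, and `can_evict` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pod:
    # Hypothetical, minimal view of a pod: its name, zone, and health.
    name: str
    zone: str
    healthy: bool


def disrupted_zones(pods):
    """Return the zones that contain at least one unhealthy pod."""
    return {p.zone for p in pods if not p.healthy}


def can_evict(pods, target_name, max_disrupted_zones=None):
    """Return True if evicting `target_name` keeps enough zones undisrupted.

    By default a strict majority of zones must remain fully healthy,
    so with 3 zones at most 1 zone may be disrupted at a time.
    """
    zones = {p.zone for p in pods}
    if max_disrupted_zones is None:
        max_disrupted_zones = (len(zones) - 1) // 2
    target = next((p for p in pods if p.name == target_name), None)
    if target is None:
        raise KeyError(target_name)
    # Evicting the target disrupts its zone (if it was not already disrupted).
    after = disrupted_zones(pods) | {target.zone}
    return len(after) <= max_disrupted_zones
```

With this check, any number of pods in an already-disrupted zone can be evicted, while an eviction that would disrupt a second zone is refused.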

For example, we have a cluster with 420 ingester pods. With a PDB that allows only 1 pod to be disrupted, we can drain at most 1 k8s node at a time, when this could be done much more quickly (and safely).
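
As a back-of-the-envelope illustration of the difference, assuming the 420 ingesters are spread evenly across 3 zones (the zone count and the even spread are assumptions, inferred from the "1/3 total pods" figure above):

```python
import math


def pods_per_zone(total_pods, zone_count):
    """Upper bound on pods in one zone, assuming an even spread."""
    return math.ceil(total_pods / zone_count)


TOTAL_INGESTERS = 420
ZONE_COUNT = 3  # assumption: three zones, evenly populated

# Per-pod PDB: only one ingester may be disrupted at any time.
per_pod_budget = 1

# Zone-level budget: every ingester in a single zone may be disrupted at once.
zone_budget = pods_per_zone(TOTAL_INGESTERS, ZONE_COUNT)

print(per_pod_budget, zone_budget)  # 1 140
```

Under those assumptions, a zone-level budget would allow up to 140 ingesters to be disrupted at once instead of 1.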

Describe alternatives you've considered

This might be something we'll have to create ourselves because (ironically) it's very disruptive.

nullren commented 5 hours ago

conceptually this could definitely be something that exists in kubernetes directly because the pattern of "allowing zonal disruptions" is not unique to mimir. eg, an elasticsearch cluster that has documents replicated across "zones" would benefit from this same controller...

nullren commented 4 hours ago

perhaps this is something the https://github.com/grafana/rollout-operator could manage?