prometheus-community / helm-charts

Prometheus community Helm charts
Apache License 2.0

[kube-prometheus-stack] Scaling down alertmanager or prometheus statefulsets doesn't stop the services #4214

Open jvergara-runbuggy opened 9 months ago

jvergara-runbuggy commented 9 months ago

Describe the bug (a clear and concise description of what the bug is).

I tried to scale the alertmanager and prometheus statefulsets down to zero to stop the services, but the pods are terminated and new ones are created.

What's your helm version?

version.BuildInfo{Version:"v3.6.3", GitCommit:"d506314abfb5d21419df8c7e7e68012379db2354", GitTreeState:"clean", GoVersion:"go1.16.5"}

What's your kubectl version?

Client Version: version.Info{Major:"1", Minor:"23+", GitVersion:"v1.23.17-eks-0a21954"

Which chart?

kube-prometheus-stack

What's the chart version?

17.2.2

What happened?

Pods keep being recreated when I scale both alertmanager and prometheus down to zero.

What you expected to happen?

Pods should stay terminated until I decide to scale the replicas back up to 1.

How to reproduce it?

kubectl scale --replicas=0 sts prometheus-prom-kube-prometheus-stack-prometheus -n

Enter the changed values of values.yaml?

No response

Enter the command that you execute and failing/misfunctioning.

kubectl scale --replicas=0 sts prometheus-prom-kube-prometheus-stack-prometheus -n

Anything else we need to know?

No response

zeritti commented 8 months ago

I tried to scale the alertmanager and prometheus statefulsets down to zero to stop the services, but the pods are terminated and new ones are created.

Both statefulsets are created and managed by the operator based on the corresponding prometheus and alertmanager CRs. The operator always reverts any change to the statefulsets that deviates from what the CRs specify.
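One way to see this relationship on a live cluster is sketched below. It assumes the operator records the owning CR in the statefulset's owner references, and it uses `monitoring` as a placeholder namespace because the namespace is cut off in the report. The function names are hypothetical helpers, not part of any tool.

```shell
# Placeholder namespace: the report does not say which one is used.
NAMESPACE="${NAMESPACE:-monitoring}"
# Statefulset name taken from the report.
STS="prometheus-prom-kube-prometheus-stack-prometheus"

# Print the kind of the object that owns the statefulset
# (expected to be the Prometheus CR if the assumption above holds).
show_owner() {
  kubectl get statefulset "$STS" -n "$NAMESPACE" \
    -o jsonpath='{.metadata.ownerReferences[*].kind}{"\n"}'
}

# List the custom resources the operator reconciles the statefulsets from.
list_crs() {
  kubectl get prometheus,alertmanager -n "$NAMESPACE"
}
```

Calling `show_owner` and then `list_crs` should make it visible that the replica count lives in the CRs, not in the statefulsets.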

If you wish to remove the pods, you can upgrade your release whilst setting alertmanager.alertmanagerSpec.replicas and prometheus.prometheusSpec.replicas to 0.
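A minimal sketch of such an upgrade follows. The release name `prom` is inferred from the statefulset name in the report, `monitoring` is a placeholder namespace, and `prometheus-community` is assumed to be the local alias of the chart repository; adjust all three to your setup. `set_replicas` is a hypothetical helper.

```shell
# Assumptions: release name inferred from the statefulset name,
# namespace is a placeholder, chart version taken from the report.
RELEASE="${RELEASE:-prom}"
NAMESPACE="${NAMESPACE:-monitoring}"
CHART_VERSION="${CHART_VERSION:-17.2.2}"

# Upgrade the release, keeping existing values and overriding only
# the replica counts of alertmanager and prometheus.
set_replicas() {
  helm upgrade "$RELEASE" prometheus-community/kube-prometheus-stack \
    --namespace "$NAMESPACE" \
    --version "$CHART_VERSION" \
    --reuse-values \
    --set alertmanager.alertmanagerSpec.replicas="$1" \
    --set prometheus.prometheusSpec.replicas="$1"
}
```

`set_replicas 0` would then remove the pods, and `set_replicas 1` would bring them back. Because the change goes through the CRs, the operator has nothing to revert.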

You can control the statefulsets directly, but only after pausing the CRs by upgrading your release with prometheus.prometheusSpec.paused and alertmanager.alertmanagerSpec.paused set to true. The CRDs describe the paused field as follows:

If set to true all actions on the underlying managed objects are not going to be performed, except for delete actions.
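The pause-then-scale approach could look roughly like this. As before, the release name `prom`, the namespace `monitoring`, and the repository alias `prometheus-community` are assumptions, and `set_paused` and `scale_sts` are hypothetical helpers.

```shell
# Assumptions: release name inferred from the statefulset name,
# namespace is a placeholder, chart version taken from the report.
RELEASE="${RELEASE:-prom}"
NAMESPACE="${NAMESPACE:-monitoring}"
CHART_VERSION="${CHART_VERSION:-17.2.2}"

# Pause (true) or resume (false) operator reconciliation for both CRs.
set_paused() {
  helm upgrade "$RELEASE" prometheus-community/kube-prometheus-stack \
    --namespace "$NAMESPACE" \
    --version "$CHART_VERSION" \
    --reuse-values \
    --set prometheus.prometheusSpec.paused="$1" \
    --set alertmanager.alertmanagerSpec.paused="$1"
}

# Scale a statefulset by hand; only sticks while the CRs are paused.
scale_sts() {
  kubectl scale statefulset "$1" --replicas="$2" -n "$NAMESPACE"
}
```

A typical sequence would be `set_paused true` followed by `scale_sts prometheus-prom-kube-prometheus-stack-prometheus 0`. Once the CRs are resumed with `set_paused false`, the operator should bring the statefulsets back in line with the replica counts in the CRs.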

Ref. CRD prometheus, alertmanager