parkedwards opened 2 years ago

Hello there - if we opt to self-deploy Alertmanager with GMP, is there a way to disable the automatically created alertmanager deployment / service?

https://cloud.google.com/stackdriver/docs/managed-prometheus/rules-managed#self-deployed_alertmanager
Hi @parkedwards, by default the managed Alertmanager doesn't do anything or interfere with self-deployed Alertmanagers. So simply by not configuring it, it is essentially "disabled". If you want to go a step further, you could try scaling the MAM statefulset to 0 replicas, but this may only work if you are running GMP unmanaged.
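For illustration, a minimal sketch of that scale-down, assuming the managed Alertmanager StatefulSet keeps its default name `alertmanager` in the `gmp-system` namespace (the same names the delete patches further down in this thread use):

```sh
# Scale the managed Alertmanager down to zero replicas.
# Note: the operator may reconcile this back to 1 replica,
# so this is only expected to stick when running GMP unmanaged.
kubectl scale statefulset alertmanager --namespace gmp-system --replicas=0
```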
For our reference, could you share why you need to disable the managed Alertmanager? Thanks
@damemi makes sense, we'll leave the MAM unconfigured
> For our reference, could you share why you need to disable the managed Alertmanager? Thanks
Sure thing, so we're opting to self-deploy our Alertmanager instances, but otherwise still use the managed GMP components (collectors, rule-evaluator, etc.). Ideally, we wouldn't be running any other Alertmanager Deployment or StatefulSet (e.g. the managed ones), just to conserve resource usage and reduce confusion for anyone else on the team
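For context, pointing the managed rule-evaluator at a self-deployed Alertmanager is done through the OperatorConfig, per the docs linked in the issue description. A minimal sketch, assuming your self-deployed Alertmanager Service is named `alertmanager` on port 9093 in a `monitoring` namespace (the same values the kustomization below patches in):

```yaml
# The OperatorConfig lives in the gmp-public namespace and is named "config".
apiVersion: monitoring.googleapis.com/v1
kind: OperatorConfig
metadata:
  namespace: gmp-public
  name: config
rules:
  alerting:
    alertmanagers:
      # Assumed: a self-deployed Alertmanager Service named "alertmanager"
      # listening on port 9093 in the "monitoring" namespace.
      - name: alertmanager
        namespace: monitoring
        port: 9093
```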
It would be nice to be able to disable managed AlertManager to avoid confusion.
We just upgraded to v0.5.0 and spent half an hour figuring out a way to disable the managed alertmanager.
Why?

We have to deploy GMP via the manifests; the addon installation isn't flexible enough for our needs (we need Istio sidecars, and we don't allow manual configuration through kubectl: everything should be code).

How?

Since there isn't a Helm chart for GMP, we have to use kustomize. It's fairly straightforward to configure GMP to suit your needs via kustomizations:
```yaml
# ./kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - https://raw.githubusercontent.com/GoogleCloudPlatform/prometheus-engine/v0.5.0/manifests/setup.yaml
  - https://raw.githubusercontent.com/GoogleCloudPlatform/prometheus-engine/v0.5.0/manifests/operator.yaml
patchesStrategicMerge:
  - delete-google-managed-alertmanager.yaml
patches:
  - target:
      name: config
      kind: OperatorConfig
    patch: |-
      # Connect GMP with our self-managed alertmanager
      - op: add
        path: /rules
        value:
          alerting:
            alertmanagers:
              - name: alertmanager
                namespace: monitoring
                port: 9093
  - target:
      name: gmp-system
      kind: Namespace
    patch: |-
      # Add Istio sidecars to the GMP so Kiali graphs make sense
      - op: add
        path: /metadata/labels
        value:
          istio-injection: enabled
```

```yaml
# ./delete-google-managed-alertmanager.yaml
$patch: delete
apiVersion: v1
kind: Service
metadata:
  namespace: gmp-system
  name: alertmanager
---
$patch: delete
apiVersion: v1
kind: Secret
metadata:
  namespace: gmp-system
  name: alertmanager
---
$patch: delete
apiVersion: apps/v1
kind: StatefulSet
metadata:
  namespace: gmp-system
  name: alertmanager
```
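(For completeness: with those two files in one directory, applying is presumably just the standard kustomize flow.)

```sh
# Render and apply the kustomization from the directory containing both files.
kubectl apply -k .
```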
Nice, and your workaround makes sense! Also, you are welcome to use a newer version of the operator (e.g. the `gke.gcr.io/prometheus-engine/operator:v0.6.3-gke.0` image and the `v0.6.3-rc.0` tag).

We will discuss with the team whether there is an easier way of disabling AM for your use cases.
Hi @bwplotka, do we have any updates on this topic?
No discussion yet, sorry for the lag.
From my understanding this feature only applies to managed GMP.
One way of solving it is a new field, e.g. `OperatorConfig.ManagedAlertmanagerSpec.Disabled = true`, that would change the Alertmanager replicas from 1 to 0.
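Purely as an illustration, the proposed knob might look something like this (the `disabled` field is hypothetical and does not exist in the OperatorConfig API today):

```yaml
# Hypothetical sketch of the proposal above: "disabled" is not a real field.
apiVersion: monitoring.googleapis.com/v1
kind: OperatorConfig
metadata:
  namespace: gmp-public
  name: config
managedAlertmanager:
  disabled: true  # would scale the managed Alertmanager from 1 replica to 0
```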
The additional work on our side is to fix alerting for this case (for managed GMP we have solid SLOs).

Before we prioritise this, we would love to understand the "confusion" argument, since it sounds like it's the only reason for this feature. Does the confusion come from a pod named "alertmanager" running in a system namespace when listing pods (it's filtered out by default, though), or is there some other source of confusion?
Note: @bernot-dev is working on this feature (automatic disabling of AM and rule-eval if no configuration is used for those) 🤗
Solution implemented in #691. Rule-evaluator and alertmanager will scale to zero when there are no Rules set up in the cluster.
Hi team, thanks for your work, but I have to say that the implemented solution doesn't make much sense to me. The purpose of the issue, if I'm not mistaken, is to use a self-deployed Alertmanager, so there will be rules. With this solution, we will have the same problem. In my opinion, the solution should simply be an explicit way to enable or disable the managed Alertmanager, independently of the number of rules.
To be more specific, it scales the GMP rule-evaluator Deployment and Alertmanager StatefulSet to zero if none of these custom resources exist (see the sketch after this list for a minimal example of one):

- `monitoring.googleapis.com/ClusterRules`
- `monitoring.googleapis.com/GlobalRules`
- `monitoring.googleapis.com/Rules`
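As an aside, a minimal sketch of one such resource (the `v1` API version, namespace, name, and rule below are assumed for illustration) whose mere existence would keep both pods scaled up:

```yaml
# A minimal Rules resource; its existence keeps the GMP rule-evaluator
# and Alertmanager scaled up after #691. Names here are arbitrary examples.
apiVersion: monitoring.googleapis.com/v1
kind: Rules
metadata:
  namespace: monitoring
  name: example-rules
spec:
  groups:
    - name: example
      interval: 30s
      rules:
        - alert: AlwaysFiring
          expr: vector(1)
```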
The primary goal of #691 was saving resources when the user does not need those pods running. If a user wants their own self-deployed Alertmanager, the GMP Alertmanager should not interfere unless they are also using our specific custom resources.
Well, to @robmonct's point, which mirrors our use case: we're using all of the Managed Prometheus components (rule-evaluator, collector, etc.), which includes usage of the Rule CRDs. We just want to use our own Alertmanager instance
+1 for a proper solution. The managed Alertmanager no longer fits our needs, as we require an `AlertmanagerConfig` equivalent for our different application teams (but also due to #685). Nevertheless, we want to keep using the remaining GMP parts like the collectors, rule-evaluator, etc.

The managed Alertmanager is just eating cluster resources. While it's not a lot, I'd prefer not to have useless pods in our clusters.
Hey @m3adow - I'll re-open this issue so we can discuss as a team how we want to address and prioritize this.