Closed Jcardoso96 closed 1 year ago
Hello @Jcardoso96 👋 I believe you have to set Alertmanager.spec.alertmanagerConfiguration.name to match the name of your AlertmanagerConfig. That might be what you're missing, if I'm not mistaken.
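For reference, a minimal sketch of what that setting could look like on the Alertmanager resource. The resource names and namespace here are illustrative placeholders, and the sketch assumes the AlertmanagerConfig lives in the same namespace as the Alertmanager:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Alertmanager
metadata:
  name: example-alertmanager      # placeholder name
  namespace: monitoring           # placeholder namespace
spec:
  # Points the operator at one specific AlertmanagerConfig by name,
  # instead of selecting configs by label.
  alertmanagerConfiguration:
    name: config-example          # placeholder: name of your AlertmanagerConfig
```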
Hi @JoaoBraveCoding, I forgot to mention it, but in the values.yaml used to deploy the prometheus operator I am setting alertmanagerConfigSelector.matchLabels to the label defined in the AlertmanagerConfig.
prometheus-operator:
  alertmanager:
    verticalPodAutoscaler:
      enabled: false
      updatePolicy:
        updateMode: "Off"
    alertmanagerSpec:
      alertmanagerConfigSelector:
        matchLabels:
          alertmanagerConfig: teamlabel
      storage:
        volumeClaimTemplate:
          spec:
            accessModes: ["ReadWriteOnce"]
            resources:
              requests:
                storage: 10Gi
Doing it with the name directly unfortunately would not work in our use case because we need to merge different configs when we are installing prometheus. The idea is to have different teams each with their own prometheus instance, where they have a base config common across all teams but they can also set up different configs specific to their team. Thus we need to merge the base config and the configs related to a team (using the appropriate label) for their prometheus instance.
I see, then if I'm not mistaken you also have to set alertmanagerConfigNamespaceSelector to {} to select all the namespaces. If it's nil (which is the default if you don't configure it), the operator will only look for AlertmanagerConfig in the Alertmanager's own namespace, not in the other namespaces.
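As a rough sketch, the selector could be added next to the existing label selector in the values.yaml shown earlier. This assumes the chart passes alertmanagerSpec fields through to the Alertmanager resource unchanged:

```yaml
prometheus-operator:
  alertmanager:
    alertmanagerSpec:
      # Existing label selector from the values.yaml above.
      alertmanagerConfigSelector:
        matchLabels:
          alertmanagerConfig: teamlabel
      # An empty selector matches every namespace.
      alertmanagerConfigNamespaceSelector: {}
```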
Did not seem to work unfortunately : (
I might be wrong but I think the operator is able to find the AlertmanagerConfig. This is because if I change the alertmanagerConfig label to something different than the alertmanagerConfigSelector in Alertmanager then I don't have issues with the configuration.
It's only when the labels match and the operator finds the configuration that I then get the error.
Could I possibly have some yaml error in that configuration I posted or does it look okay?
Sorry, @Jcardoso96, I misunderstood the problem 🤦 If we look at this log line
level=warn ts=2023-02-08T23:13:41.482175554Z caller=operator.go:1003 component=alertmanageroperator msg="skipping alertmanagerconfig" error="unable to get secret \"\": resource name may not be empty" alertmanagerconfig=monitoring/config-example namespace=monitoring alertmanager=prometheus-operator-alertmanager
it seems that you don't provide a secret name under routingKey and, if I'm not mistaken, you need to do it, otherwise the operator would not be able to fetch that key.
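To illustrate, here is a hedged sketch of an AlertmanagerConfig whose routingKey references a Secret by both name and key. The original manifest is not shown in this thread, so the receiver name, the Secret name, and the key name are hypothetical. Only the config name, namespace, and label are taken from the log line and values.yaml above. The sketch assumes a PagerDuty receiver and a Secret in the same namespace as the AlertmanagerConfig:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: pagerduty-routing-key     # hypothetical Secret name
  namespace: monitoring
type: Opaque
stringData:
  routingKey: "<your-integration-key>"   # placeholder value
---
apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: config-example
  namespace: monitoring
  labels:
    alertmanagerConfig: teamlabel
spec:
  route:
    receiver: pagerduty           # hypothetical receiver name
  receivers:
    - name: pagerduty
      pagerdutyConfigs:
        - routingKey:
            # Both fields are needed; an empty name produces the
            # "resource name may not be empty" error from the log.
            name: pagerduty-routing-key
            key: routingKey
```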
That seemed to fix it, thank you João!
What did you do?
I wanted to start using AlertmanagerConfig and the alertmanagerConfigSelector/matchLabels option in Alertmanager to set up the configuration of the latter.
I created an AlertmanagerConfig resource and expected Prometheus Operator to merge this configuration with the default one. However, this does not happen, and the Prometheus Operator logs state that no secret was found, as if the operator were trying to use a secret to get the Alertmanager configuration.
I explored two possible explanations while debugging: either the operator was still trying to deploy a secret and ignoring the AlertmanagerConfig CRD, or the config itself was invalid, so the operator ignored it and tried to deploy a default configuration from a non-existent secret.
However I can't seem to make it work. Has anybody else had a similar issue?
Environment
image: quay.io/prometheus-operator/prometheus-operator:v0.62.0
Kubernetes cluster kind:
kind cluster kind v0.17.0 go1.19.3 darwin/arm64
Manifests: