prometheus / alertmanager

Prometheus Alertmanager
https://prometheus.io
Apache License 2.0

Failed to resolve alertmanager-main-1.alertmanager-operated:9094 #3327

Open bruse-peng opened 1 year ago

bruse-peng commented 1 year ago

When I install the Prometheus Operator, Alertmanager logs this error: msg=refresh result=failure addr=alertmanager-main-1.alertmanager-operated:9094 err="1 error occurred:\n\t* Failed to resolve alertmanager-main-1.alertmanager-operated:9094: lookup alertmanager-main-1.alertmanager-operated on 10.57.252.89:53: no such host\n\n"

What did you do?

What did you expect to see?

What did you see instead? Under which circumstances?

Environment

xxxx

apiVersion: monitoring.coreos.com/v1
kind: Alertmanager
metadata:
  labels:
    alertmanager: main
  name: main
  namespace: monitoring
spec:
  image: quay.io/prometheus/alertmanager:v0.22.0
  replicas: 3
  securityContext:
    fsGroup: 2000
    runAsNonRoot: false
    runAsUser: 1000
  serviceAccountName: alertmanager-main
  version: v0.22.0
  configSecret:
  alertmanagerConfigSelector:
    matchLabels:
      alertmanagerConfig: sams-qa


* Prometheus configuration file:

xxxxxx


* Logs:

level=info ts=2023-04-14T08:25:22.826Z caller=coordinator.go:113 component=configuration msg="Loading configuration file" file=/etc/alertmanager/config_out/alertmanager.env.yaml
level=info ts=2023-04-14T08:25:22.826Z caller=coordinator.go:126 component=configuration msg="Completed loading of configuration file" file=/etc/alertmanager/config_out/alertmanager.env.yaml
level=info ts=2023-04-14T08:25:22.835Z caller=main.go:514 msg=Listening address=:9093
level=info ts=2023-04-14T08:25:22.835Z caller=tls_config.go:227 msg="TLS is disabled." http2=false
level=info ts=2023-04-14T08:25:24.772Z caller=cluster.go:696 component=cluster msg="gossip not settled" polls=0 before=0 now=1 elapsed=2.00107379s
level=info ts=2023-04-14T08:25:27.835Z caller=coordinator.go:113 component=configuration msg="Loading configuration file" file=/etc/alertmanager/config_out/alertmanager.env.yaml
level=info ts=2023-04-14T08:25:27.836Z caller=coordinator.go:126 component=configuration msg="Completed loading of configuration file" file=/etc/alertmanager/config_out/alertmanager.env.yaml
level=info ts=2023-04-14T08:25:32.774Z caller=cluster.go:688 component=cluster msg="gossip settled; proceeding" elapsed=10.003671261s
level=warn ts=2023-04-14T08:25:37.778Z caller=cluster.go:461 component=cluster msg=refresh result=failure addr=alertmanager-main-0.alertmanager-operated:9094 err="1 error occurred:\n\t Failed to resolve alertmanager-main-0.alertmanager-operated:9094: lookup alertmanager-main-0.alertmanager-operated on 10.57.252.89:53: no such host\n\n"
level=warn ts=2023-04-14T08:25:37.780Z caller=cluster.go:461 component=cluster msg=refresh result=failure addr=alertmanager-main-1.alertmanager-operated:9094 err="1 error occurred:\n\t Failed to resolve alertmanager-main-1.alertmanager-operated:9094: lookup alertmanager-main-1.alertmanager-operated on 10.57.252.89:53: no such host\n\n"

simonpasquier commented 1 year ago

The issue is either in the way the operator configures the statefulset or with your kubernetes DNS setup.
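
One way to narrow down which of the two it is might be to test name resolution for each peer from inside a pod in the cluster. The following is only a minimal sketch, not part of Alertmanager or the operator: `peer_hostnames` and `check_resolution` are hypothetical helper names, the hostname pattern is copied from the log lines above, and the `.monitoring.svc` suffix assumes the default Kubernetes DNS layout and the namespace from the posted manifest.

```python
import socket


def peer_hostnames(name, replicas, service="alertmanager-operated", namespace=None):
    """Build the per-pod DNS names seen in the logs above.

    Assumption: pods are named alertmanager-<name>-<ordinal> and sit behind
    the headless service `alertmanager-operated`, as the log lines suggest.
    """
    suffix = f".{namespace}.svc" if namespace else ""
    return [f"alertmanager-{name}-{i}.{service}{suffix}" for i in range(replicas)]


def check_resolution(hosts, port=9094, resolver=socket.getaddrinfo):
    """Try to resolve each host; map it to its addresses or to an error string."""
    results = {}
    for host in hosts:
        try:
            infos = resolver(host, port)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the IP address.
            results[host] = sorted({info[4][0] for info in infos})
        except socket.gaierror as exc:
            results[host] = f"unresolved: {exc}"
    return results


if __name__ == "__main__":
    # Short names, as Alertmanager uses them, then namespace-qualified names.
    hosts = peer_hostnames("main", 3) + peer_hostnames("main", 3, namespace="monitoring")
    for host, outcome in check_resolution(hosts).items():
        print(host, "->", outcome)
```

If the namespace-qualified names resolve but the short ones do not, the pod's DNS search path is a likely suspect; if neither resolves, it may be worth checking that the `alertmanager-operated` service exists and that the pods are ready.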

MrYu1019 commented 6 months ago

Hello, I have the same problem. Did you ever solve it? I checked CoreDNS, but that did not help. There was an exception, but alertmanager-main kept reporting errors, and I really have no clue where to look.