prometheus / alertmanager

Prometheus Alertmanager
https://prometheus.io
Apache License 2.0

alertmanager reliably crashes on every boot #4130

Open calestyo opened 2 days ago

calestyo commented 2 days ago

What did you do?

On every boot, alertmanager exits with an error (but it works when started later).

Environment

Debian bookworm, Linux 6.1.0-27-amd64 x86_64

So it seems there are a number of errors involved here:

I'm not really sure what it means by "private IP" (or why it should need one). Any normal UNIX daemon typically binds to the wildcard address if no specific bind addresses are given.
Also, the service is pulled in by multi-user.target, and by that time all networking (including the statically configured global IPs) has long been up.

Anyway, if I start the daemon a bit later, it works just fine.

Cheers, Chris

grobinson-grafana commented 2 days ago

Hi!

Alertmanager is crashing because it cannot get the information it needs to initialize the cluster for high availability mode. The error means it cannot find a private IP address for the system, which it advertises to other Alertmanagers in the same cluster.
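To illustrate what "private IP address" means here, the following is a minimal sketch that checks a list of addresses against the RFC 1918 private IPv4 ranges. It is not Alertmanager's actual detection logic, which lives in its clustering code and may consider other ranges or interface properties. The helper names `is_rfc1918` and `first_private_address` are hypothetical, and the sample addresses are documentation examples.

```python
import ipaddress

# Assumption: "private" is taken to mean the three RFC 1918 IPv4 ranges only.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address: str) -> bool:
    """Return True if the address is an IPv4 address in an RFC 1918 range."""
    ip = ipaddress.ip_address(address)
    return ip.version == 4 and any(ip in net for net in RFC1918_NETWORKS)


def first_private_address(addresses):
    """Return the first RFC 1918 address in the list, or None if there is none."""
    for address in addresses:
        if is_rfc1918(address):
            return address
    return None


if __name__ == "__main__":
    # A host configured only with global addresses has nothing to advertise.
    print(first_private_address(["203.0.113.10", "2001:db8::1"]))  # None
    # A host with an additional LAN address does.
    print(first_private_address(["203.0.113.10", "192.168.1.5"]))  # 192.168.1.5
```

A host that only has statically configured global IPs, as described in the report, would fall into the first case under this definition.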

If your system does not have a private IP address, and/or you do not need high availability mode, you can disable it with the following argument:

--cluster.listen-address=""

Error sending alert" err="Post \"http://localhost:9093/api/v2/alerts\": dial tcp [::1]:9093: connect: connection refused

Your Prometheus can't send alerts to Alertmanager because Alertmanager is crash looping.
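A quick way to confirm whether anything is listening on the Alertmanager port is a plain TCP connection attempt. The sketch below assumes the `localhost:9093` address from the error message above. The helper name `port_is_open` is hypothetical, and a successful connection only shows that the port accepts connections, not that Alertmanager is healthy.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Assumption: Alertmanager is expected on localhost:9093, as in the log line.
    if port_is_open("localhost", 9093):
        print("something is listening on localhost:9093")
    else:
        print("nothing is accepting connections on localhost:9093")
```

While Alertmanager is crash looping, this check would be expected to fail most of the time, matching the "connection refused" error that Prometheus reports.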