joe-alford opened this issue 2 years ago
I just ran into this, and the related documentation is poor. The alertmanager_url field must still be a valid URI, so prepending any scheme to the hostname appears to allow at least this validation code and the subsequent DNS lookup to function as expected, e.g.:
alertmanager_url: dns://_http-web._tcp.kube-prometheues-stack-kub-alertmanager.kube-prometheus-stack.svc.cluster.local
enable_alertmanager_discovery: true
I note that the tests use the format http://_http._tcp.alertmanager.default.svc.cluster.local/alertmanager, which suggests that a URL with path/port components might work. However, if you include a port number in the URI, DNS resolution breaks, so I'm not sure whether this is a bug or whether only the host component is used from the input and the scheme/path/etc. are dropped.
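For illustration, here is a minimal Go sketch of how the standard net/url package treats these strings (this is an assumption about the cause, not the actual notifier.go code): without a scheme the whole name ends up in the path and the host is empty, a scheme makes the SRV name the host component, and an explicit port (9093 below is just an example value) becomes part of the host rather than something an SRV lookup can use on its own.

package main

import (
    "fmt"
    "net/url"
)

func main() {
    // Without a scheme, url.Parse puts the whole string into Path and Host stays "".
    u1, _ := url.Parse("_http._tcp.alertmanager.default.svc.cluster.local")
    fmt.Printf("no scheme:  Host=%q Path=%q\n", u1.Host, u1.Path)

    // With a scheme, the SRV name is parsed as the host component.
    u2, _ := url.Parse("http://_http._tcp.alertmanager.default.svc.cluster.local/alertmanager")
    fmt.Printf("scheme:     Host=%q Path=%q\n", u2.Host, u2.Path)

    // An explicit port makes the host "name:port", which is no longer a bare SRV name.
    u3, _ := url.Parse("http://_http._tcp.alertmanager.default.svc.cluster.local:9093/alertmanager")
    fmt.Printf("with port:  Host=%q Hostname=%q Port=%q\n", u3.Host, u3.Hostname(), u3.Port())
}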
Here is a working configuration for discovery:
rulerConfig:
  alertmanager_url: http://_http-web._tcp.alertmanager-operated.monitoring.svc.cluster.local
  enable_api: true
  enable_alertmanager_discovery: true
  enable_alertmanager_v2: true
The port will be taken from the SRV record. I get alerting working with this setup.
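As a quick illustration of where that port comes from, the sketch below does a plain SRV lookup in Go (assuming it runs inside a cluster where alertmanager-operated.monitoring.svc.cluster.local resolves); each SRV answer already carries the target host and the port, which is why the URL itself should not include one.

package main

import (
    "fmt"
    "log"
    "net"
)

func main() {
    // With empty service/proto arguments, LookupSRV queries the name verbatim,
    // equivalent to: dig SRV _http-web._tcp.alertmanager-operated.monitoring.svc.cluster.local
    _, addrs, err := net.LookupSRV("", "", "_http-web._tcp.alertmanager-operated.monitoring.svc.cluster.local")
    if err != nil {
        log.Fatal(err)
    }
    for _, a := range addrs {
        // Each answer carries both the target host and the port to dial.
        fmt.Printf("target=%s port=%d\n", a.Target, a.Port)
    }
}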
I initially posted this in the cortex repo, but they have declined ownership.

Describe the bug
Using Kubernetes: when making use of the enable_alertmanager_discovery (-ruler.alertmanager-discovery) flag, we are seeing the following line of code get hit erroneously: https://github.com/cortexproject/cortex/blob/cd786078a220ca0e6f9bcd510ed8170e457bc2f8/pkg/ruler/notifier.go#L110
This only happens if the URL is in the 'correct' format. If we pass in an 'invalid' URL, the code works as expected. With the URL in an SRV DNS format, the URL is treated as an empty string.
To Reproduce
Steps to reproduce the behavior: deploy the HelmRelease file below and you will see the errors below.
With the URL in the correct format, the following error is generated, but only on the read pod. This is https://github.com/cortexproject/cortex/blob/master/pkg/ruler/notifier.go#L110, which gives this error:
For additional context, with the URL in the 'wrong' format, we get the below error:
which gives the following error for the loki-read pod:
Workaround
Build out a list of alertmanager targets manually with the following:
Expected behavior
The URL is parsed as provided and is not treated as an empty string.
Environment:
Additional Context
Helm Release (full file included; the relevant settings are alertmanager_url and enable_alertmanager_discovery):