Open Aaron199 opened 2 years ago
Could you fill in the other topics?
What you expected to happen:
How to reproduce it (as minimally and precisely as possible):
Full logs to relevant components:
Anything else we need to know:
I'm not sure what you want to achieve at the moment. Thanks!
Sorry about that, I have already updated the comment.
ty! :)
I think I have the same issue:
containers:
- args:
  - rule
  - --data-dir=/thanos/data
  - --rule-file=/etc/thanos/rules/*/*.yaml
  - --query=dnssrv+_http._tcp.xks-query-frontend.monitor.svc.cluster.local
  - --alertmanagers.url=http://alertmanager.monitor.svc.cluster.local:9093
  - --remote-write.config-file=/tmp/config/rw-config.yaml
Using: quay.io/thanos/thanos:v0.25.2
For now I will lower the number of replicas to 1.
I'm investigating a similar situation. Looking at the ruler page, specifically the --alertmanagers.url section, the behaviour described in this issue may be expected:
"Alertmanager replica URLs to push firing alerts. Ruler claims success if push to at least one alertmanager from discovered succeeds"
I would have thought the alertmanager cluster would reconcile the alerts amongst the replicas, but I'm guessing not.
@Aaron199 Did you make the Alertmanager service headless, i.e. clusterIP: None?
I suspect that this is due to how routing works for ClusterIP services: the service always selects the same pod (as long as it is available), and a DNS lookup on the service name returns only one DNS entry. (You could change the service type to LoadBalancer, or work around it by other means, for example with Istio.)
For Thanos' DNS lookup to work, it needs the IPs of the pods directly. That way it reaches every pod, not just the single pod described above.
To make a service headless, simply set clusterIP: None and keep the type as ClusterIP; see the doc.
Explanation of headless services with DNS lookup comparison here.
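As a rough illustration of the suggestion above, a headless Service for Alertmanager might look like the sketch below. The Service name, the namespace, the selector label and the port name are assumptions made for the example, not values taken from this thread; the selector in particular has to match whatever labels your Alertmanager pods actually carry.

```yaml
# Sketch only: name, namespace, selector and port name are hypothetical.
apiVersion: v1
kind: Service
metadata:
  name: alertmanager-headless        # hypothetical name
  namespace: monitoring              # assumed namespace
spec:
  type: ClusterIP
  clusterIP: None                    # this is what makes the Service headless
  selector:
    app.kubernetes.io/name: alertmanager   # assumption: must match your pod labels
  ports:
    - name: web                      # hypothetical port name
      port: 9093
      targetPort: 9093
```

With clusterIP: None, a DNS lookup on the Service name returns one record per ready pod instead of a single virtual IP, which is what DNS-based discovery needs to see every replica.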
The trick is to configure DNS service discovery in --alertmanagers.url, like "dnssrv+_http-web._tcp.alertmanager-operated:9093".
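A minimal sketch of what the ruler args could look like when pointed at a headless service follows. It reuses the dns+http:// form from the original report rather than dnssrv+, and assumes that alertmanager-operated is the headless service created by the Prometheus Operator and that it lives in a namespace called monitoring; adjust both to your setup.

```yaml
# Sketch only: the namespace and the choice of dns+ over dnssrv+ are assumptions.
containers:
  - args:
      - rule
      # dns+ resolves the name to A records; for a headless service that
      # means one IP per Alertmanager pod, so each replica can be targeted.
      - --alertmanagers.url=dns+http://alertmanager-operated.monitoring.svc:9093
```

Whether every replica then receives the alerts is worth confirming in each Alertmanager's UI, since this thread did not settle the point with logs.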
Thanos, Prometheus and Golang version used: thanos:v0.24.0 prometheus:v2.26.0
Object Storage Provider: Huawei OBS
What happened: I use the config "--alertmanagers.url=dns+http://alertmanager-main.monitoring:9093". "alertmanager-main" has 3 pods, but only one Alertmanager receives the alerts. With Prometheus, the alerts show up in every Alertmanager; with Thanos Ruler they do not.
What you expected to happen: Thanos Ruler should behave like Prometheus and send alerts to every Alertmanager replica.
How to reproduce it (as minimally and precisely as possible): Using that config is enough to reproduce it.
Full logs to relevant components: I can't find any logs about this.
Anything else we need to know: