Closed: Leif160519 closed this issue 1 year ago
Sometimes an alert is sent once, but most of the time it is sent twice.
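There are usually a few possible causes:
1. The Alertmanager routing sends the same alert more than once.
2. The aggregated message that Alertmanager sends already contains several duplicate alerts.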
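One way to tell these two cases apart is to capture the JSON bodies that Alertmanager posts to the webhook and inspect them. The sketch below assumes the standard Alertmanager webhook payload (a top-level `groupKey` and `status`, plus an `alerts` list whose entries carry a `fingerprint`). The helper names and the sample payloads are invented for illustration and are not part of PrometheusAlert or Alertmanager.

```python
from collections import Counter


def duplicates_in_payload(payload):
    """Return fingerprints that occur more than once inside one webhook payload."""
    counts = Counter(alert.get("fingerprint") for alert in payload.get("alerts", []))
    return sorted(fp for fp, n in counts.items() if fp is not None and n > 1)


def notifications_per_group(payloads):
    """Count how many webhook calls were received per (groupKey, status)."""
    return Counter((p.get("groupKey"), p.get("status")) for p in payloads)


# Hypothetical captured payloads, trimmed to the fields used here.
captured = [
    {"groupKey": "g1", "status": "firing", "alerts": [{"fingerprint": "aa"}]},
    {"groupKey": "g1", "status": "firing", "alerts": [{"fingerprint": "aa"}]},
    {"groupKey": "g2", "status": "firing",
     "alerts": [{"fingerprint": "bb"}, {"fingerprint": "bb"}]},
]

for payload in captured:
    print(payload["groupKey"], duplicates_in_payload(payload))
print(notifications_per_group(captured))
```

If the same `groupKey` shows up in several calls a few seconds apart, the repetition most likely comes from how notifications are being sent (case 1). Repeated fingerprints inside a single payload would point to case 2.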
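The current situation is:
1. Alertmanager shows only one alert.
2. The duplicate alerts delivered to Feishu arrive 15 seconds apart.
3. The monitoring metrics are persisted with VictoriaMetrics.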
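I have a few questions and guesses:
1. If a default Feishu bot address (fsurl=xxx) is set in the PrometheusAlert configuration, will every alert be sent to xxx?
2. Could Prometheus and VictoriaMetrics both be sending alerts to Alertmanager at the same time?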
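For the second guess, it may help to look at what Alertmanager actually holds. `remote_write` on its own only forwards samples, so a second copy of an alert would need some other component that evaluates rules and posts to the same Alertmanager. The following is a minimal sketch, assuming the alerts have already been fetched as JSON from Alertmanager's `/api/v2/alerts` endpoint (each entry has `labels` and `generatorURL`). The function name, the host names in the sample, and the choice of `region` as the label to ignore are assumptions made for illustration.

```python
import json
from collections import defaultdict
from urllib.parse import urlparse


def near_duplicates(alerts, ignore=("region",)):
    """Group alerts whose label sets are equal once the `ignore` labels are dropped."""
    groups = defaultdict(list)
    for alert in alerts:
        key = tuple(sorted(
            (k, v) for k, v in alert.get("labels", {}).items() if k not in ignore
        ))
        # The host part of generatorURL hints at which component produced the alert.
        groups[key].append(urlparse(alert.get("generatorURL", "")).netloc)
    return {key: hosts for key, hosts in groups.items() if len(hosts) > 1}


# Hypothetical sample standing in for the API response.
sample = json.loads("""
[
  {"labels": {"alertname": "HighLoad", "job": "node", "region": "Tencent"},
   "generatorURL": "http://prometheus.example:9090/graph"},
  {"labels": {"alertname": "HighLoad", "job": "node"},
   "generatorURL": "http://second-sender.example:8880/alert"},
  {"labels": {"alertname": "DiskFull", "job": "node", "region": "Tencent"},
   "generatorURL": "http://prometheus.example:9090/graph"}
]
""")

for labels, hosts in near_duplicates(sample).items():
    print(dict(labels), hosts)
```

Alerts with fully identical labels are merged by Alertmanager into one entry, so this check is only useful for spotting copies that differ in a label such as an external label added by one sender.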
Here is my configuration:
/etc/prometheus/prometheus.yml
global:
  scrape_interval: 15s
  scrape_timeout: 10s
  evaluation_interval: 1m
  external_labels:
    region: Tencent
remote_write:
  - url: http://10.200.0.188:8428/api/v1/write
rule_files:
  - /etc/prometheus/rules/*.rules
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - 127.0.0.1:9093
......
/etc/prometheus/alertmanager.yml
global:
  resolve_timeout: 5m
templates:
  - /etc/prometheus/conf.d/email.tmpl
inhibit_rules:
  - source_match_re: # critical inhibits warning
      severity: critical
    target_match_re:
      severity: warning
    equal: [ all, alertname ]
route:
  group_by: ['alertname','job']
  group_wait: 3m
  group_interval: 5m
  repeat_interval: 24h
  receiver: 'xxx'
  routes:
    - receiver: "@linux/feishu"
      match_re: { channels: "(.*)?@linux/feishu([:/;].*)?" }
      continue: true
receivers:
  - name: 'xxx'
  - name: '@linux/feishu' # Feishu, non-P0 alerts
    webhook_configs:
      - url: 'http://127.0.0.1:8080/prometheusalert?type=fs&tpl=prometheus-fs&fsurl=https://open.feishu.cn/open-apis/bot/v2/hook/201a64e9-770d-428c-xxx-xxxxxxx'
        send_resolved: true
...
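To check which alerts the `@linux/feishu` child route would pick up, the `match_re` pattern can be tried against sample `channels` label values. Alertmanager anchors regex matchers so that they must match the whole label value, which `re.fullmatch` approximates in this sketch. The function name and the sample label values are invented for illustration.

```python
import re

# Pattern copied from the match_re entry in alertmanager.yml.
PATTERN = re.compile(r"(.*)?@linux/feishu([:/;].*)?")


def route_matches(channels):
    """Return True if the `channels` label value fully matches the route pattern."""
    return PATTERN.fullmatch(channels) is not None


# Hypothetical label values to try against the route.
samples = [
    "@linux/feishu",
    "@ops/mail;@linux/feishu",
    "@linux/feishu:p1",
    "@linux/feishu2",
    "",
]

for value in samples:
    print(repr(value), route_matches(value))
```

An alert whose `channels` label does not match stays on the top-level receiver `xxx`, which has no notification config in the file shown.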
It seems to have been working normally for the past few days, so I am closing this for now.
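The same alert arrives in Feishu several times, and the PrometheusAlert log also shows it printed many times. I run into this often and have never managed to solve it. I would appreciate some pointers.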