apache / hertzbeat

Apache HertzBeat (incubating) is a real-time monitoring system with agentless, performance cluster, Prometheus-compatible, custom monitoring, and status page building capabilities.
https://hertzbeat.apache.org/
Apache License 2.0
5.76k stars, 1k forks

[Question] Do you feel that the warning is not correct? #1072

Open wtjperi2003 opened 1 year ago

wtjperi2003 commented 1 year ago

Question

Take the Redis availability alarm as an example. My Redis instance is actually online the whole time, yet HertzBeat keeps alternating between an "unavailable" alarm and a "recovered" alarm, roughly tens of seconds apart. As a result, the alarm center fills up with unprocessed alarms. This happens not only for Redis but also for MySQL, ClickHouse, Spring Boot, and others.

tomsun28 commented 1 year ago

Hi, this is most likely caused by network jitter or by occasional failures of the monitored service on the other end. In that case, raise the trigger count of the availability threshold in the threshold configuration to 2 or 3, so that a single failed check no longer raises an alarm.
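
To illustrate why a higher trigger count suppresses this kind of flapping, here is a minimal sketch of consecutive-failure counting. It is not HertzBeat's actual implementation: the function `alarm_events`, the parameter name `trigger_times`, and the event strings are hypothetical, and the sketch assumes an alarm fires only after that many failed checks in a row and recovers on the first successful check.

```python
from typing import Iterable, List


def alarm_events(checks: Iterable[bool], trigger_times: int) -> List[str]:
    """Return the alarm events produced by a sequence of availability checks.

    checks: True means the probe succeeded, False means it failed.
    trigger_times: consecutive failures required before an alarm fires
    (hypothetical name, modeled on the "trigger count" setting above).
    """
    if trigger_times < 1:
        raise ValueError("trigger_times must be at least 1")
    events: List[str] = []
    consecutive_failures = 0
    alarming = False
    for ok in checks:
        if ok:
            # Any success resets the failure streak and clears an active alarm.
            consecutive_failures = 0
            if alarming:
                alarming = False
                events.append("recovered")
        else:
            consecutive_failures += 1
            # Fire only once per outage, when the streak reaches the threshold.
            if not alarming and consecutive_failures >= trigger_times:
                alarming = True
                events.append("unavailable")
    return events


# Isolated single failures, as network jitter might produce.
flapping = [True, False, True, True, False, True, True, True]
# A real outage: four failed probes in a row, then recovery.
outage = [True, False, False, False, False, True]

print(alarm_events(flapping, trigger_times=1))
print(alarm_events(flapping, trigger_times=3))
print(alarm_events(outage, trigger_times=3))
```

With a trigger count of 1, every isolated failure in `flapping` produces an "unavailable" and "recovered" pair, which matches the noisy alarm center described in the question. With a trigger count of 3, the isolated failures produce no events, while the sustained outage still raises one alarm. The trade-off is detection delay: a real outage is reported only after the third failed check.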