Open sbengo opened 6 years ago
To add more info: in our case, we want to send alerts in our production environment and, as I explained in the first comment, an alert is generated for each event.
If the TICKscript is changed, those alerts persist in our monitoring system with the PreviousLevel, because they never receive their OK event.
@nathanielc, @desa, can you review it, please?
Thanks, Greetings!
Hi,
We have been working with Kapacitor to generate alerts based on simple metric thresholds, on:
OS: RHEL 7.4
Kapacitor: Kapacitor OSS 1.5.0 (git: HEAD 4f10efc41b4dcac070495cf95ba2c41cfcc2aa3a)
Overview
We have some TICKscripts that fire N events, based on the working cardinality of the alert node, so each of the N events can change its own state based on the threshold.
The problem seems to appear when we change the TICKscript and reload the task, forcing the OK of the N events.
Actual behaviour
After reloading the task with new thresholds to force the OK on the N events, only 1 event is fired to OK. The other N-1 events seem to be 'lost': they are considered OK, but no OK event is fired for them.
Expected behaviour
After reloading the task with new thresholds to force the OK on the N events, all N events are fired to OK.
Detailed case
To allow you to repro the case, I have written a TICKScript and a brief table with actions and events fired:
TICKSCRIPT
Actions and results
The following table shows the actions and the resulting events.
As shown, after forcing an OK on N events that are already CRIT, only a single OK event is fired.

| Result | Events fired |
| --- | --- |
| OK | CRIT: Series – cpu0/myhost, CRIT: Series – cpu1/myhost, CRIT: Series – cpu-total/myhost |
| OK | OK: Series – cpu0/myhost, OK: Series – cpu1/myhost, OK: Series – cpu-total/myhost |
| OK | CRIT: Series – cpu0/myhost, CRIT: Series – cpu1/myhost, CRIT: Series – cpu-total/myhost |
| NOOK | OK: Series – (RANDOM?)/myhost |
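
To make the expected behaviour concrete, here is a minimal Python sketch that models per-series alert levels. It is not Kapacitor code: the `AlertStates` class, its methods, and the threshold and CPU values are all hypothetical, and it assumes that an event is emitted only when a series changes level and that the per-series levels survive a task reload.

```python
from typing import Dict, List, Tuple

OK, CRIT = "OK", "CRIT"


class AlertStates:
    """Toy per-series level tracker (hypothetical, not Kapacitor internals)."""

    def __init__(self, crit_threshold: float) -> None:
        self.crit_threshold = crit_threshold
        self.levels: Dict[str, str] = {}  # series -> last emitted level

    def evaluate(self, points: Dict[str, float]) -> List[Tuple[str, str]]:
        """Return one (series, level) event per series whose level changed."""
        events = []
        for series, value in points.items():
            level = CRIT if value > self.crit_threshold else OK
            previous = self.levels.get(series, OK)  # unseen series start as OK
            if level != previous:
                events.append((series, level))
            self.levels[series] = level
        return events

    def reload(self, crit_threshold: float) -> None:
        """Swap the threshold while keeping every series' previous level."""
        self.crit_threshold = crit_threshold


states = AlertStates(crit_threshold=80.0)
points = {"cpu0/myhost": 95.0, "cpu1/myhost": 95.0, "cpu-total/myhost": 95.0}

crit_events = states.evaluate(points)  # all three series go to CRIT
states.reload(crit_threshold=99.0)     # new threshold forces OK
ok_events = states.evaluate(points)    # expected: one OK event per series
```

In this model `ok_events` holds an OK event for each of the three series, which is the behaviour we expect after the reload. What we observe instead is a single OK event for one (apparently random) series.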