influxdata / kapacitor

Open source framework for processing, monitoring, and alerting on time series data
MIT License

Alerts are not triggered for each group in a batch query #1875

Open kyoussef opened 6 years ago

kyoussef commented 6 years ago

I'm trying to monitor multiple freezers by reading their temperature. I'm grouping by multiple tags that together uniquely identify a freezer. The following is a sample of the script I'm trying:

var period = 15m

batch
    |query('''
        select "temperature" as value from device where "machineType" = 'freezertype'
    ''')
        .period(period)
        .every(1m)
        .groupBy('mac', 'machineId', 'locationId', 'machineType')
    |alert()
        .all()
        .id('Freezer Temperature Alert')
        .message('Freezer Temperature is {{ .Level }} for the past ' + string(period))
        .crit(lambda: "value" > -14.5)
        .stateChangesOnly()
        .log('/tmp/alerts.log')

So if I have two groups (freezers), group A and group B: when group A goes out of range, I get a critical alert as expected, but if group B then goes out of range, I don't get an alert.

Is this normal? Am I doing anything wrong?

bnjroos commented 6 years ago

You should create a unique id for each group. For example, if you have the tag "group" with values A or B, you could proceed as follows:

.id('Group {{ index .Tags "group"}}')

which would generate two different alert ids.
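
As a rough sketch of how that suggestion might be applied to the script from the question, the id template below combines several of the tags passed to `.groupBy()`, so that each freezer gets its own alert id. The choice of tags and the `/` separator are assumptions made for illustration; the query is copied from the question as-is. This is not confirmed to resolve the issue.

```
var period = 15m

batch
    |query('''
        select "temperature" as value from device where "machineType" = 'freezertype'
    ''')
        .period(period)
        .every(1m)
        .groupBy('mac', 'machineId', 'locationId', 'machineType')
    |alert()
        .all()
        // Assumption: these three tags together are enough to tell freezers apart.
        .id('Freezer Temperature Alert {{ index .Tags "mac" }}/{{ index .Tags "machineId" }}/{{ index .Tags "locationId" }}')
        // .ID expands to the rendered id above, so the message names the freezer.
        .message('{{ .ID }} is {{ .Level }} for the past ' + string(period))
        .crit(lambda: "value" > -14.5)
        .stateChangesOnly()
        .log('/tmp/alerts.log')
```

With two freezers, the log should then contain entries whose "id" fields differ by MAC address, machine id and location id, which makes it easier to see which group each alert belongs to.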

kyoussef commented 6 years ago

Thanks @bnjroos for your feedback. However, I tried it and it didn't work. I changed the id to include the MAC address of the device:

.id('Freezer Temperature Alert {{ index .Tags "mac"}}')

A critical alert was still generated for only one MAC address:

{"id":"Freezer Temperature Alert FF-FF-FF-18-6E-VV","message":"Freezer Temperature is CRITICAL for the past 15","time":"2018-09-10T07:39:00Z","duration":0,"level":"CRITICAL","data":{"series":[{"name":"gotempdevice","tags":{"locationId":"90004","mac":"FF-FF-FF-18-6E-VV","machineType":"ztItem_0005"},"columns":["time","max_boundary","min_boundary","outliers","value"],"values":[["2018-09-10T07:39:00Z",7,-3,1,12]]}]},"previousLevel":"OK","recoverable":true}

The critical alert for the second MAC address was not generated.

aliakseiz commented 4 years ago

Not sure why it didn't work for you, @kyoussef, but it solved the problem in my case:

var uids string         //var uids = '"device:uid"=\'01\' OR "device:uid"=\'02\''
var field string        //var field = '"a"'
var criteria lambda     //var criteria = lambda: "a" > 0
var occurrenceTime duration
var startFrom string        //should be specified in nanoseconds, e.g. 1598944177000000000

var message = '{}'

var data = batch
    |query('SELECT ' + field + ' FROM "db"."ret"."mes" WHERE ' + uids)
        .period(occurrenceTime)
        .every(5s)
        .groupBy('device:uid')

var trigger = data
    |alert()
        .all()
        .crit(criteria)
        .stateChangesOnly()
        .id('{{.TaskName}} {{ index .Tags "device:uid" }}')
        .idTag('alertID')
        .levelTag('level')
        .messageField('message')
        .durationField('duration')
        .post()
            .endpoint('backend')
            .header('Authorization', 'kapacitor')
            .header('Content-Type', 'application/json')
            .captureResponse()
            .timeout(10s)

Thanks @bnjroos !