influxdata / kapacitor

Open source framework for processing, monitoring, and alerting on time series data
MIT License

Alert not triggered multiple times #2357

Open SanderWegter opened 4 years ago

SanderWegter commented 4 years ago

Hi,

Kapacitor OSS 1.5.3 Influxdb-Version: 1.8.0

I'm collecting SNMP traps via Telegraf and want to send out an alert whenever a trap arrives with certain OIDs/values, but for now I'm just trying to get it to send an alert on any trap received.

These traps arrive from different devices/manufacturers.

However, I'm only getting an alert sent on the first trap and none on any subsequent traps. I've tried both batch and stream with the same results. Data does show up in the log.

stream:

var data = stream
    |from()
        .measurement('snmp_trap')

data
    |log()
        .level('DEBUG')

data
    |alert()
        .id('{{ .TaskName }}')
        .message('TRAP {{ .ID }}')
        .warn(lambda: 1 == 1)
        .details('''
            {{ .ID }}<br>
            {{ .Time }}<br>
            {{ .Fields }}<br>
            {{ .Tags }}
        ''')
        .email()
        .slack()

batch:

var traps = batch
    |query('SELECT * FROM "telegraf"."autogen"."snmp_trap"')
        .period(1m)
        .every(1m)
        .groupBy(*)
    |log()
    |alert()
        .id('{{ .TaskName }}')
        .warn(lambda: 1 == 1)
        .message('TRAP {{ .ID }}')
        .details('''
            {{ .ID }}<br>
            {{ .Time }}<br>
            {{ .Fields }}<br>
            {{ .Tags }}
        ''')
        .email()
    |log()

I have tried adding .Time to the id to sort of randomize the id, but that didn't work. Is there any way to randomize the ID, or another way to always send out alerts regardless of the previous state?

jaw0608 commented 4 years ago

This is what the stateChangesOnly property is used for: (https://docs.influxdata.com/kapacitor/v1.5/nodes/alert_node/#statechangesonly)

From my experience (haven't had it confirmed though), this only applies to info, warn, and crit states. If an alert is consistently in the OK level, it does not send another alert.

SanderWegter commented 4 years ago

@jaw0608 Thanks for your reply!

I've tried different settings of .stateChangesOnly (with, without, global, etc.) but none seem to make a difference.

In the meantime, I've made a small Python script which listens for forwarded traps by using

data
    |httpPost('http://monitoring:5000/trap')

This then creates a (readable) table and sends an email. It's not pretty but it does what I need for now.
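For readers who want to try the same workaround, a receiver along those lines might look roughly like the sketch below. It is not the script from this comment. It assumes that the httpPost node sends a JSON body shaped like {"series": [{"name", "tags", "columns", "values"}]}, so check that against what your Kapacitor version actually posts. The names rows_from_payload, format_table, TrapHandler and serve are made up for illustration, and the email step is replaced by a print.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def rows_from_payload(payload):
    """Flatten the ASSUMED {"series": [...]} payload into one dict per point."""
    rows = []
    for series in payload.get("series") or []:
        columns = series.get("columns") or []
        tags = series.get("tags") or {}
        for values in series.get("values") or []:
            row = dict(tags)                   # tags first, then fields
            row.update(zip(columns, values))
            rows.append(row)
    return rows


def format_table(rows):
    """Render rows as a plain-text, column-aligned table."""
    if not rows:
        return ""
    headers = []
    for row in rows:                           # keep first-seen column order
        for key in row:
            if key not in headers:
                headers.append(key)
    cells = [[str(row.get(h, "")) for h in headers] for row in rows]
    widths = [max(len(h), *(len(c[i]) for c in cells))
              for i, h in enumerate(headers)]
    lines = [" | ".join(h.ljust(w) for h, w in zip(headers, widths)).rstrip()]
    lines.append("-+-".join("-" * w for w in widths))
    for c in cells:
        lines.append(" | ".join(v.ljust(w) for v, w in zip(c, widths)).rstrip())
    return "\n".join(lines)


class TrapHandler(BaseHTTPRequestHandler):
    """Hypothetical handler for the /trap path used in the httpPost call."""

    def do_POST(self):
        if self.path != "/trap":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except ValueError:
            self.send_error(400)
            return
        if not isinstance(payload, dict):
            self.send_error(400)
            return
        # A real script would email this table (e.g. via smtplib);
        # printing keeps the sketch self-contained.
        print(format_table(rows_from_payload(payload)))
        self.send_response(204)
        self.end_headers()


def serve(host="0.0.0.0", port=5000):
    """Listen until interrupted; port 5000 matches the URL in the post."""
    HTTPServer((host, port), TrapHandler).serve_forever()
```

Calling serve() starts the listener. Every POST to /trap is then printed as one table row per point, with tags and fields side by side.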

bogski87 commented 4 years ago

Could you paste your entire TICKscript please, and a sample of the data in your snmp_trap measurement?

Edit: specifically, the stream version. Personally, I don't use batch for alerting purposes. Batch is better suited to downsampling your data.