Yelp / elastalert

Easy & Flexible Alerting With ElasticSearch
https://elastalert.readthedocs.org
Apache License 2.0

Realert is not working in frequency rule type #2036

Open skatakar opened 5 years ago

skatakar commented 5 years ago

I am using the parameters below in a frequency rule, but realert is not working for the specified time duration. I have set realert to hours: 1 and query_key: "Hostname". Since realert is 1 hour, I assume it should aggregate records by the Hostname field for that hour (say, from 9:00 to 10:00), and since num_events is 1, it should raise only one alarm per hostname even though a hostname has more events in that hour. That part is working as expected. But if an event comes after one hour, e.g. at 10:25, the alarm is not generated. Currently I haven't configured any alert notification. I can see in the console that the alerts are suppressed.

Qmando commented 5 years ago

It doesn't make sense to set aggregation and realert to the same duration while query_key and aggregation_key are also set to the same field.

An aggregation is used to send multiple alerts together as one. However, if you also have realert set, there will be a maximum of one alert per aggregation. What you're doing is effectively delaying the alerts by an hour. You should either remove realert if you want multiple alerts for one host sent together every hour, or remove aggregation if you want 1 alert per hour per host.

When you say "But if the event comes after one hr ie at 10:25, the alarm is not getting generated.", are you talking about a different elastalert-test-rule run? Or are you actually running elastalert? That's not in your logs at all. If you get another hit within an hour of the New aggregation, it will be added to that aggregation. If it's after, the old aggregation will be sent and a new one created.

Qmando commented 5 years ago

1 hour aggregation means "Collect all alerts over the next hour, then send them all together as one"

1 hour realert means "Ignore all alerts for 1 hour after the first"

If you combine them, you can only get 1 alert in the aggregation, the rest are ignored. So it's not really aggregating anything, it's just delaying the ONE alert for an hour.

Qmando commented 5 years ago
aggregation:
  hours: 1
realert:
  hours: 1

Event at 12:00 -> Create a new aggregation for 1pm
Event at 12:15 -> Ignored due to realert
Event at 12:45 -> Ignored due to realert
At 1:00 -> Alert sent with 1 match

aggregation:
  hours: 1
realert:
  hours: 0

Event at 12:00 -> Create a new aggregation for 1pm
Event at 12:15 -> Added to aggregation
Event at 12:45 -> Added to aggregation
At 1:00 -> Alert sent with 3 matches

no aggregation
realert:
  hours: 1

Event at 12:00 -> Send alert immediately
Event at 12:15 -> Ignored due to realert
Event at 12:45 -> Ignored due to realert
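
To make the three timelines above concrete, here is a toy Python model of how aggregation and realert interact for a single query_key value. It is only a sketch of the behaviour described in this thread, not ElastAlert's actual implementation: the `simulate` function and its parameters are hypothetical, and it treats each event's timestamp as the moment the match is processed.

```python
from datetime import datetime, timedelta


def simulate(event_times, aggregation=None, realert=timedelta(0)):
    """Toy model (hypothetical helper): return a list of (send_time, match_count)."""
    sent = []              # alerts that went out
    silenced_until = None  # realert: matches before this time are ignored
    pending = None         # open aggregation as (send_time, [match times])

    for t in sorted(event_times):
        # An aggregation whose window has ended is sent before handling this event.
        if pending and t >= pending[0]:
            sent.append((pending[0], len(pending[1])))
            pending = None
        # realert: drop matches that arrive while the rule is silenced.
        if silenced_until and t < silenced_until:
            continue
        if realert:
            silenced_until = t + realert
        if aggregation is None:
            sent.append((t, 1))               # no aggregation: alert immediately
        elif pending is None:
            pending = (t + aggregation, [t])  # open a new aggregation window
        else:
            pending[1].append(t)              # add to the open aggregation
    if pending:
        sent.append((pending[0], len(pending[1])))
    return sent


noon = datetime(2024, 1, 1, 12, 0)
events = [noon, noon + timedelta(minutes=15), noon + timedelta(minutes=45)]
hour = timedelta(hours=1)

print(simulate(events, aggregation=hour, realert=hour))  # one alert at 13:00, 1 match
print(simulate(events, aggregation=hour))                # one alert at 13:00, 3 matches
print(simulate(events, realert=hour))                    # one alert at 12:00, 1 match
```

In this model, a later event that arrives after both windows have expired (for example 85 minutes after the first one) opens a fresh aggregation, so its alert is only delivered an hour after that event.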

Qmando commented 5 years ago

You don't need aggregation_key, but you should use query_key. With query_key: hostname, the realert only applies to the same hostname. If you didn't have query_key, it would stop alerts for ALL hostnames for 1 hour.
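
As a rough illustration of why query_key matters here, the sketch below keeps one "silenced until" time per key value. It is a simplified model, not ElastAlert code: `filter_by_realert`, the `"_all"` placeholder key, and the sample host names are all made up for the example.

```python
from datetime import datetime, timedelta


def filter_by_realert(matches, realert, query_key=None):
    """Return the (timestamp, doc) matches that would alert; the rest are silenced."""
    silenced_until = {}  # key value -> time until which alerts are ignored
    alerted = []
    for ts, doc in sorted(matches, key=lambda m: m[0]):
        # Without a query_key every match shares one silence entry.
        key = doc.get(query_key) if query_key else "_all"
        if ts < silenced_until.get(key, datetime.min):
            continue  # still inside the realert window for this key
        silenced_until[key] = ts + realert
        alerted.append((ts, doc))
    return alerted


noon = datetime(2024, 1, 1, 12, 0)
matches = [
    (noon, {"Hostname": "host-a"}),
    (noon + timedelta(minutes=10), {"Hostname": "host-b"}),
    (noon + timedelta(minutes=30), {"Hostname": "host-a"}),
]
hour = timedelta(hours=1)

per_host = filter_by_realert(matches, hour, query_key="Hostname")
global_silence = filter_by_realert(matches, hour)
print(len(per_host), len(global_silence))  # 2 alerts with query_key, 1 without
```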

Qmando commented 5 years ago

If you use Python to generate docs, don't use "time": datetime.datetime.now(), as it will generate a timestamp using the local time but with no timezone, which Elasticsearch will assume is UTC, causing it to be "in the past". Use datetime.datetime.utcnow().isoformat() + 'Z'.
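
A minimal sketch of generating such a timestamp follows. It uses the timezone-aware `datetime.now(timezone.utc)` rather than `utcnow()`, which is deprecated as of Python 3.12; the resulting string has the same shape. The `utc_timestamp` helper and the sample host name are hypothetical, and the document fields simply mirror the ones mentioned in this thread.

```python
from datetime import datetime, timezone


def utc_timestamp():
    """Return the current time as an ISO 8601 string in UTC with a 'Z' suffix."""
    return datetime.now(timezone.utc).isoformat().replace("+00:00", "Z")


# Naive local time: the string carries no offset, so a reader has to guess the zone.
naive_local = datetime.now().isoformat()

# Explicit UTC: unambiguous when indexed.
doc = {"time": utc_timestamp(), "Hostname": "web-01"}  # sample values, not real data
print(naive_local)
print(doc)
```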

Qmando commented 5 years ago

Yeah, that's the easiest way to get the local time. There's an option, use_local_time, but it only changes the timestamps in the default text strings, i.e. "There were more than 5 events between ..." or "Between ... and ... there were less than 1 event".

If you're trying to put a local timestamp in via alert_text, you can do what you suggest, or you'd need an enhancement to accomplish the same thing if you don't control the data. This is a feature lots of people request so maybe this will change at some point.
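
For illustration, the conversion such an enhancement would perform could look like the sketch below. It is standalone Python, not a working ElastAlert enhancement: the `add_local_time` function, the `local_time` field name, and the default `@timestamp` field are assumptions, and the input is assumed to be an ISO 8601 string ending in `Z` or an explicit offset. In ElastAlert itself, this logic would sit inside the match-processing method of an enhancement class.

```python
from datetime import datetime, timedelta, timezone


def add_local_time(match, field="@timestamp", tz=None):
    """Add a 'local_time' entry to a match dict, derived from its UTC timestamp.

    tz=None converts to the system's local time zone.
    """
    raw = match[field]
    # Older fromisoformat() versions reject a trailing 'Z', so spell out the offset.
    utc = datetime.fromisoformat(raw.replace("Z", "+00:00"))
    match["local_time"] = utc.astimezone(tz).isoformat()
    return match


# Example with a fixed UTC-5 offset so the result does not depend on the machine.
match = {"@timestamp": "2024-01-01T12:00:00Z", "Hostname": "web-01"}
print(add_local_time(match, tz=timezone(timedelta(hours=-5)))["local_time"])
# prints 2024-01-01T07:00:00-05:00
```

The new field could then be referenced from alert_text like any other field in the match.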

Qmando commented 5 years ago

You'll want something like https://github.com/Yelp/elastalert/issues/1385#issuecomment-423281701