Open sergii-rolskii opened 7 years ago
I ran a test with these parameters:
query_key:
  - carrier
filter:
- query:
    query_string:
      query: "action: * AND carrier: * AND result: *"
match_bucket_filter:
- query_string:
    query: "action:AUTO AND result:ERROR AND carrier:operator_1"
max_percentage: 10
I want the percentage for the chain operator_1:AUTO:ERROR. The alert email reports 57.7%. I then built a visualization in Kibana and calculated the same figure for the same period:
Over 30 min I have ~23,000 events, and in the chain operator_1:AUTO:ERROR I see 1673 events (7.24%).
Where am I wrong?
Your code looks ok. I'm not quite sure what the reason is. There are two things you can do to help debug.
One is to add --es_debug_trace file
to record the full query sent to Elasticsearch, which can help with manual testing or verification.
The second is to add some debug lines to get the event counts:
--- a/elastalert/ruletypes.py
+++ b/elastalert/ruletypes.py
@@ -1030,6 +1030,8 @@ class PercentageMatchRule(BaseAggregationRule):
     def check_matches(self, timestamp, query_key, aggregation_data):
         match_bucket_count = aggregation_data['percentage_match_aggs']['buckets']['match_bucket']['doc_count']
         other_bucket_count = aggregation_data['percentage_match_aggs']['buckets']['_other_']['doc_count']
+        print "Match bucket:", match_bucket_count
+        print "Other bucket:", other_bucket_count
There very well could be a bug in this code. I'm not regularly using it myself so I haven't done extensive testing.
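Once those two counts are printed, the percentage can be recomputed by hand and compared with the Kibana figure. Here is a minimal sketch, assuming the rule divides the match bucket by the sum of both buckets. The function name `match_percentage` is made up for illustration and is not part of ElastAlert:

```python
def match_percentage(match_bucket_count, other_bucket_count):
    """Return the share of matching documents as a percentage.

    Returns None when both buckets are empty, since no ratio exists.
    """
    total = match_bucket_count + other_bucket_count
    if total == 0:
        return None
    return 100.0 * match_bucket_count / total


# Counts taken from the Kibana comparison earlier in the thread:
# 1673 matching events out of roughly 23000 in the window.
pct = match_percentage(1673, 23000 - 1673)
```

If the printed counts add up to a much smaller total than Kibana shows for the same window, the denominator rather than the match count may be where the two numbers diverge.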
So, my problem is partially solved. "percentage_match" works perfectly with a single value in each field. Example:
match_bucket_filter:
- query_string:
    query: "result:ERROR AND action:AUTO AND carrier:operator_1"
but if I set more than one value per field. Example:
match_bucket_filter:
- query_string:
    query: "result:(ERROR SYSTEM_ERROR) AND action:AUTO AND carrier:(operator_1 operator_2)"
I get a single email with one combined percentage across all values. Example:
(carrier:operator_1 + result:ERROR + result:SYSTEM_ERROR + action:AUTO) + (carrier:operator_2 + result:ERROR + result:SYSTEM_ERROR + action:AUTO) = 75%
(the email shows only the combined percentage)
Everything is summed together, but I need 4 separate chains,
because I have 2 different "carrier" values and 2 different "result" values.
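Spelled out, the four chains are every combination of the two carriers and the two results. Here is a small sketch that enumerates them as query strings in the same form as the single-value filter that already works. The helper `chain_queries` is hypothetical and only illustrates the combinations:

```python
from itertools import product

carriers = ["operator_1", "operator_2"]
results = ["ERROR", "SYSTEM_ERROR"]
action = "AUTO"


def chain_queries(carriers, results, action):
    """Build one query_string per (carrier, result) pair."""
    return [
        "result:{} AND action:{} AND carrier:{}".format(result, action, carrier)
        for carrier, result in product(carriers, results)
    ]


# Four strings, one per chain.
queries = chain_queries(carriers, results, action)
```

One possible workaround, untested here, would be to put each string into the match_bucket_filter of its own rule, at the cost of maintaining four rules.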
If I set:
query_key:
  - carrier
I get 2 separate emails, one per "carrier". Example:
(carrier:operator_1 + result:ERROR + result:SYSTEM_ERROR + action:AUTO) = 28% (in the first email) and (carrier:operator_2 + result:ERROR + result:SYSTEM_ERROR + action:AUTO) = 47% (in the second email)
That is still not what I need. So I added "result" to query_key to get 4 chains:
query_key:
  - carrier
  - result
but this does not work. With a second key in query_key, the logs show the query count:
from 2017-08-07 18:54 EEST to 2017-08-07 19:04 EEST: 295 query hits (0 already seen), 0 matches, 0 alerts sent
but I do not get any email.
Why can't I use 2 or more keys in query_key with "percentage_match"?
@Qmando any ideas? Can I use "query_key" with "percentage_match" at all? When I do, I get wrong data.
I have the same issue: if I use 2 values for query_key, the rule doesn't match. :/
Same type of issue... If I use multiple query_key values I get
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/elastalert/elastalert.py", line 1153, in run_all_rules
    num_matches = self.run_rule(rule, endtime, self.starttime)
  File "/usr/lib/python2.7/site-packages/elastalert/elastalert.py", line 854, in run_rule
    self.run_query(rule, tmp_endtime, endtime)
  File "/usr/lib/python2.7/site-packages/elastalert/elastalert.py", line 618, in run_query
    rule_inst.add_aggregation_data(data)
  File "/usr/lib/python2.7/site-packages/elastalert/ruletypes.py", line 1002, in add_aggregation_data
    self.unwrap_term_buckets(timestamp, payload_data['bucket_aggs']['buckets'])
  File "/usr/lib/python2.7/site-packages/elastalert/ruletypes.py", line 1016, in unwrap_term_buckets
    self.check_matches(timestamp, term_data['key'], term_data)
  File "/usr/lib/python2.7/site-packages/elastalert/ruletypes.py", line 1141, in check_matches
    match_bucket_count = aggregation_data['percentage_match_aggs']['buckets']['match_bucket']['doc_count']
KeyError: 'percentage_match_aggs'
I'll investigate this.
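The KeyError says that the bucket passed to check_matches has no 'percentage_match_aggs' entry. One possible explanation, which is an assumption and not confirmed anywhere in this thread, is that with several query_key fields the term buckets are nested, so the metric sits one level deeper than the code expects. Here is a sketch of unwrapping such a nested response. The function name `iter_leaf_buckets`, the nesting layout, and the sample data are all hypothetical:

```python
def iter_leaf_buckets(buckets, key_path=()):
    """Yield (compound_key, bucket) for the innermost term buckets."""
    for bucket in buckets:
        path = key_path + (str(bucket["key"]),)
        nested = bucket.get("bucket_aggs")
        if nested is not None:
            # Another query_key level: descend instead of reading the metric here.
            for item in iter_leaf_buckets(nested["buckets"], path):
                yield item
        else:
            yield ",".join(path), bucket


# Hypothetical response shape for query_key: [carrier, result].
sample = [
    {
        "key": "operator_1",
        "bucket_aggs": {
            "buckets": [
                {
                    "key": "ERROR",
                    "percentage_match_aggs": {
                        "buckets": {
                            "match_bucket": {"doc_count": 3},
                            "_other_": {"doc_count": 7},
                        }
                    },
                }
            ]
        },
    }
]

leaves = list(iter_leaf_buckets(sample))
```

In this sample the outer "operator_1" bucket has no 'percentage_match_aggs' key of its own, so reading it directly would raise the same KeyError as in the traceback.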
I have the same issue
This still seems to be happening. @Qmando, did you have a chance to look into this?
Fixed with pull request #2133!
I need to set up percentage_match for several fields. The fields are:
"carrier" may have 2 different values: operator_1 or operator_2 (both values occur in the same index, but in different documents)
"result" may have 4 or more values: GOOD, STATUS_OK, REJECT, GREEN, ERROR, SYSTEM_ERROR (each document contains one of these values)
"action" may have 2 values: AUTO or MANUAL (each document contains one of these values)
I need an alert when carrier:operator_1 AND action:(AUTO OR MANUAL) AND result:(ERROR OR SYSTEM_ERROR) accounts for more than 30% of all events over 30 min, and I need the same alert for carrier:operator_2.
My config:
In the logs I see:
No email is sent, but in Kibana I see an event at the same time.
If I set query_key: carrier (only), in the logs I see:
and I get an email, but I don't know what the percentage is.
For some reason, 3 parameters do not work for query_key.
Please correct me if I am wrong.
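For reference, the requirement stated above (more than 30% error results per carrier in a 30-minute window) can be written out on plain data. This is a sketch on made-up in-memory events, not ElastAlert code. It assumes the denominator is that carrier's own events, and it ignores "action" because AUTO OR MANUAL covers every document. The names `carriers_over_threshold`, `ERROR_RESULTS`, and `THRESHOLD` are hypothetical:

```python
from collections import defaultdict

ERROR_RESULTS = {"ERROR", "SYSTEM_ERROR"}
THRESHOLD = 30.0  # percent


def carriers_over_threshold(events, threshold=THRESHOLD):
    """Return {carrier: error_percentage} for carriers above the threshold."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for event in events:
        carrier = event["carrier"]
        totals[carrier] += 1
        if event["result"] in ERROR_RESULTS:
            errors[carrier] += 1
    report = {}
    for carrier, total in totals.items():
        pct = 100.0 * errors[carrier] / total
        if pct > threshold:
            report[carrier] = pct
    return report


# Made-up events standing in for one 30-minute window.
events = [
    {"carrier": "operator_1", "action": "AUTO", "result": "ERROR"},
    {"carrier": "operator_1", "action": "MANUAL", "result": "GOOD"},
    {"carrier": "operator_2", "action": "AUTO", "result": "GOOD"},
    {"carrier": "operator_2", "action": "AUTO", "result": "STATUS_OK"},
    {"carrier": "operator_2", "action": "MANUAL", "result": "REJECT"},
    {"carrier": "operator_2", "action": "AUTO", "result": "SYSTEM_ERROR"},
]
alerts = carriers_over_threshold(events)
```

Running the same calculation on an export of the real events for one window gives a per-carrier figure to compare against whatever the alert email reports.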