Yelp / elastalert

Easy & Flexible Alerting With ElasticSearch
https://elastalert.readthedocs.org
Apache License 2.0

use_terms_query does not support multiple query_key #3026

Open rudolphkurt opened 3 years ago

rudolphkurt commented 3 years ago

The following rule works and sends me an alert if any combination of host_name and log.file.path has more than 500 matches within 15 minutes:

name: Logstash failure to send to Kafka
type: frequency
num_events: 500
index: logstash-logs*
realert:
    hours: 1
timeframe:
    minutes: 15
doc_type: logstash-log
query_key:
- host_name
- log.file.path
filter:
- term:
    logEvent.message: Sending batch to Kafka failed. Will retry after a delay.
# only need the hostname and log.file.path fields for my alert
alert:
- slack

The problem is that this query may match a very large number of documents, and I've seen this put significant memory pressure on the container I'm running ElastAlert in. I tried changing the rule to the following use_terms_query rule:

name: Logstash Send to Kafka Failed
type: frequency
num_events: 500
index: logstash-logs*
realert:
    hours: 1
timeframe:
    minutes: 15
use_terms_query: true
doc_type: logstash-log
terms_size: 400
query_key:
- host_name
- log.file.path
filter:
- term:
    logEvent.message: Sending batch to Kafka failed. Will retry after a delay.
# only need the hostname and log.file.path fields for my alert
alert:
- slack

However, this rule does not work. I grabbed the es_query from the trace file by running elastalert with the --es_debug_trace option, and I can see that the field for the terms aggregation is just a comma-delimited list built from my query_key:

curl -H 'Content-Type: application/json' -XGET 'http://localhost:9200/logstash-logs*/logstash-log/_search?pretty&ignore_unavailable=true&size=0' -d '{
  "aggs": {
    "counts": {
      "terms": {
        "field": "host_name,log.file.path",
        "min_doc_count": 1,
        "size": 400
      }
    }
  },
  "query": {
    "bool": {
      "filter": {
        "bool": {
          "must": [
            {
              "range": {
                "@timestamp": {
                  "gt": "2020-11-04T20:06:25.191085Z",
                  "lte": "2020-11-04T20:06:32.735320Z"
                }
              }
            },
            {
              "term": {
                "logEvent.message": "Sending batch to Kafka failed. Will retry after a delay."
              }
            }
          ]
        }
      }
    }
  }
}'

The field "host_name,log.file.path" does not exist. I would have expected (or at least hoped) that the following nested aggregation would be created:

"aggs": {
    "counts": {
      "terms": {
        "field": "host_name",
        "size": 400
      },
      "aggs": {
        "counts2": {
          "terms": {
            "field": "log.file.path",
            "size": 400
          }
        }
      }
    }
  }
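
To make the expected shape concrete, here is a minimal Python sketch that builds such a nested terms aggregation from a list of fields. It is only an illustration of the structure shown above, not ElastAlert's actual query-building code; the helper name build_nested_terms_aggs is hypothetical, and the counts, counts2, ... naming simply follows the example above.

```python
import json


def build_nested_terms_aggs(fields, size=400):
    """Build one terms aggregation per field, each nested inside the previous one.

    Hypothetical helper for illustration; not part of ElastAlert.
    """
    if not fields:
        raise ValueError("at least one field is required")

    aggs = None
    # Work from the innermost field outward so each level can wrap the next.
    for depth in range(len(fields), 0, -1):
        # Naming follows the example above: counts, counts2, counts3, ...
        name = "counts" if depth == 1 else "counts%d" % depth
        level = {"terms": {"field": fields[depth - 1], "size": size}}
        if aggs is not None:
            level["aggs"] = aggs
        aggs = {name: level}
    return aggs


nested = build_nested_terms_aggs(["host_name", "log.file.path"], size=400)
print(json.dumps({"aggs": nested}, indent=2))
```

With a single field this reduces to a single terms aggregation of the same shape as the one that already works (apart from min_doc_count, which is omitted here just as in my expected query above).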

use_terms_query works for me if I specify only one field in query_key (such as host_name). However, I'd like to alert on the combination of fields without having to concatenate them into a single composite field for aggregation purposes. Is the lack of support for multiple query_key fields in use_terms_query a bug, or is it intended? Is there a workaround for my issue that does not require downloading all the matching documents, as the first rule does?
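
For what it's worth, if the nested aggregation I expected above were run directly against Elasticsearch, the per-combination counts could be recovered from the response without downloading any documents. The sketch below walks the nested buckets and keeps the combinations that reach the threshold. The helper name flatten_nested_buckets and the sample response (host names, paths, counts) are made up for illustration, and treating the threshold as "at least num_events" is an assumption; only the buckets / key / doc_count layout is the standard terms aggregation response format.

```python
def flatten_nested_buckets(aggregations, agg_names):
    """Yield (key_tuple, doc_count) for each innermost bucket.

    aggregations: the "aggregations" object of a search response.
    agg_names: aggregation names from outermost to innermost.
    Hypothetical helper for illustration; not part of ElastAlert.
    """
    def walk(node, names, prefix):
        for bucket in node[names[0]]["buckets"]:
            keys = prefix + (bucket["key"],)
            if len(names) == 1:
                yield keys, bucket["doc_count"]
            else:
                # Sub-aggregation results live inside each parent bucket.
                yield from walk(bucket, names[1:], keys)

    return walk(aggregations, list(agg_names), ())


# Made-up response fragment, only to show the shape of the data.
sample = {
    "counts": {"buckets": [
        {"key": "host-a", "doc_count": 900, "counts2": {"buckets": [
            {"key": "/var/log/logstash/a.log", "doc_count": 620},
            {"key": "/var/log/logstash/b.log", "doc_count": 280},
        ]}},
        {"key": "host-b", "doc_count": 510, "counts2": {"buckets": [
            {"key": "/var/log/logstash/a.log", "doc_count": 510},
        ]}},
    ]}
}

NUM_EVENTS = 500  # same value as num_events in the rule
over = [
    (keys, count)
    for keys, count in flatten_nested_buckets(sample, ["counts", "counts2"])
    if count >= NUM_EVENTS  # assumption: threshold is "at least num_events"
]
for keys, count in over:
    print(", ".join(keys), count)
```

Note that the outer host_name bucket count (900 for host-a here) is not what should be compared against num_events; only the innermost buckets represent a host_name and log.file.path combination.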

JeffAshton commented 2 years ago

Sparking discussion in ElastAlert2 thread

https://github.com/jertel/elastalert2/discussions/699