Yelp / elastalert

Easy & Flexible Alerting With ElasticSearch
https://elastalert.readthedocs.org
Apache License 2.0
7.98k stars 1.73k forks source link

Elasticsearch 7.0.1 - Trying to create too many scroll contexts. Must be less than or equal to: [500] #2249

Open Cyb3rWard0g opened 5 years ago

Cyb3rWard0g commented 5 years ago

Good afternoon,

I upgraded my stack to 7.0.1 and I am using elastalert>=0.2.0b. When I run Elasticsearch and elastalert, it alls seems to be working fine. However, the moment I ingest data, and the rules start hitting the indices, I get the following messages:

ERROR:root:Error running query: TransportError(500, u'search_phase_execution_exception', u'Trying to create too many scroll contexts. Must be less than or equal to: [500]. This limit can be set by changing the [search.max_open_scroll_context] setting.')
WARNING:elasticsearch:GET http://helk-elasticsearch:9200/logs-*/_search?ignore_unavailable=true&_source_includes=%40timestamp%2C%2A&scroll=30s&size=10000 [status:500 request:0.016s]
ERROR:root:Error running query: TransportError(500, u'search_phase_execution_exception', u'Trying to create too many scroll contexts. Must be less than or equal to: [500]. This limit can be set by changing the [search.max_open_scroll_context] setting.')
WARNING:elasticsearch:GET http://helk-elasticsearch:9200/logs-endpoint-winevent-security-*/_search?ignore_unavailable=true&_source_includes=%40timestamp%2C%2A&scroll=30s&size=10000 [status:500 request:0.007s]
ERROR:root:Error running query: TransportError(500, u'search_phase_execution_exception', u'Trying to create too many scroll contexts. Must be less than or equal to: [500]. This limit can be set by changing the [search.max_open_scroll_context] setting.')
WARNING:elasticsearch:GET http://helk-elasticsearch:9200/logs-endpoint-winevent-sysmon-*/_search?ignore_unavailable=true&_source_includes=%40timestamp%2C%2A&scroll=30s&size=10000 [status:500 request:0.007s]
ERROR:root:Error running query: TransportError(500, u'search_phase_execution_exception', u'Trying to create too many scroll contexts. Must be less than or equal to: [500]. This limit can be set by changing the [search.max_open_scroll_context] setting.')
WARNING:elasticsearch:GET http://helk-elasticsearch:9200/logs-endpoint-winevent-security-*/_search?ignore_unavailable=true&_source_includes=%40timestamp%2C%2A&scroll=30s&size=10000 [status:500 request:0.005s]
ERROR:root:Error running query: TransportError(500, u'search_phase_execution_exception', u'Trying to create too many scroll contexts. Must be less than or equal to: [500]. This limit can be set by changing the [search.max_open_scroll_context] setting.')
WARNING:elasticsearch:GET http://helk-elasticsearch:9200/logs-*/_search?ignore_unavailable=true&_source_includes=%40timestamp%2C%2A&scroll=30s&size=10000 [status:500 request:0.010s]
ERROR:root:Error running query: TransportError(500, u'search_phase_execution_exception', u'Trying to create too many scroll contexts. Must be less than or equal to: [500]. This limit can be set by changing the [search.max_open_scroll_context] setting.')

I updated the max_open_scroll_context to 5000 and I still get the same errors.

curl -X PUT localhost:9200/_cluster/settings -H 'Content-Type: application/json' -d'{
    "persistent" : {
        "search.max_open_scroll_context": 5000
    },
    "transient": {
        "search.max_open_scroll_context": 5000
    }
}
'
Qmando commented 5 years ago

I haven't seen this error before, though I haven't used 7 yet. I just merged in https://github.com/Yelp/elastalert/pull/2230 which may solve this problem, though, I haven't released a new package version yet.

There may be some workarounds, like switching to use_count_query or use_terms_query if applicable, or by reducing buffer_time and increasing max_query_size so that it doesn't need to scroll. This completely depends on the amount of data and what your rule looks like though.

You said this happens immediately after you start getting data? Could there be something else with open scrolls other than elastalert? I wouldn't think you'd hit 500 open ones quickly, since elastalert should only be opening one per rule per query cycle.

Cyb3rWard0g commented 5 years ago

Hey @Qmando , thank you for the quick response. testing that now 👍

Cyb3rWard0g commented 5 years ago

Same error messages:

ERROR:root:Error running query: TransportError(500, u'search_phase_execution_exception', u'Trying to create too many scroll contexts. Must be less than or equal to: [5000]. This limit can be set by changing the [search.max_open_scroll_context] setting.')
WARNING:elasticsearch:GET http://helk-elasticsearch:9200/logs-*/_search?ignore_unavailable=true&_source_includes=%40timestamp%2C%2A&scroll=30s&size=10000 [status:500 request:0.012s]
ERROR:root:Error running query: TransportError(500, u'search_phase_execution_exception', u'Trying to create too many scroll contexts. Must be less than or equal to: [5000]. This limit can be set by changing the [search.max_open_scroll_context] setting.')
WARNING:elasticsearch:GET http://helk-elasticsearch:9200/logs-endpoint-winevent-security-*/_search?ignore_unavailable=true&_source_includes=%40timestamp%2C%2A&scroll=30s&size=10000 [status:500 request:0.004s]
ERROR:root:Error running query: TransportError(500, u'search_phase_execution_exception', u'Trying to create too many scroll contexts. Must be less than or equal to: [5000]. This limit can be set by changing the [search.max_open_scroll_context] setting.')
WARNING:elasticsearch:GET http://helk-elasticsearch:9200/logs-*/_search?ignore_unavailable=true&_source_includes=%40timestamp%2C%2A&scroll=30s&size=10000 [status:500 request:0.008s]
ERROR:root:Error running query: TransportError(500, u'search_phase_execution_exception', u'Trying to create too many scroll contexts. Must be less than or equal to: [5000]. This limit can be set by changing the [search.max_open_scroll_context] setting.')
WARNING:elasticsearch:GET http://helk-elasticsearch:9200/logs-endpoint-winevent-*/_search?ignore_unavailable=true&_source_includes=%40timestamp%2C%2A&scroll=30s&size=10000 [status:500 request:0.015s]
ERROR:root:Error running query: TransportError(500, u'search_phase_execution_exception', u'Trying to create too many scroll contexts. Must be less than or equal to: [5000]. This limit can be set by changing the [search.max_open_scroll_context] setting.')
WARNING:elasticsearch:GET http://helk-elasticsearch:9200/logs-endpoint-winevent-sysmon-*/_search?ignore_unavailable=true&_source_includes=%40timestamp%2C%2A&scroll=30s&size=10000 [status:500 request:0.006s]
ERROR:root:Error running query: TransportError(500, u'search_phase_execution_exception', u'Trying to create too many scroll contexts. Must be less than or equal to: [5000]. This limit can be set by changing the [search.max_open_scroll_context] setting.')
WARNING:elasticsearch:GET http://helk-elasticsearch:9200/logs-endpoint-winevent-*/_search?ignore_unavailable=true&_source_includes=%40timestamp%2C%2A&scroll=30s&size=10000 [status:500 request:0.012s]
ERROR:root:Error running query: TransportError(500, u'search_phase_execution_exception', u'Trying to create too many scroll contexts. Must be less than or equal to: [5000]. This limit can be set by changing the [search.max_open_scroll_context] setting.')

I installed latest like this (Part of a Dockerfile):

 # ********* Install Elastalert **************
  && git clone https://github.com/Yelp/elastalert.git ${ESALERT_HOME} \
  && bash -c 'mkdir -pv /etc/elastalert/rules' \
  && cd ${ESALERT_HOME} \
  && python -m pip install --upgrade pip \
  && pip install urllib3==1.24.3 \
  && pip install -r requirements.txt \
  && python setup.py install \

I am using official Docker image docker.elastic.co/elasticsearch/elasticsearch:7.0.1 for this.

Cyb3rWard0g commented 5 years ago

One of the queries failing is very basic:

alert:
- debug
description: Detects potential psexec via powershell module
filter:
- query:
    query_string:
      query: registry_key_value:(COMSPEC OR powershell.exe) OR service_image_path:(COMSPEC OR powershell.exe)
index: logs-endpoint-winevent-*
name: Windows-Psexec-Powershell-Module_0
priority: 2
realert:
  minutes: 0
type: any

I get the following in the status_error index: image

Cyb3rWard0g commented 5 years ago

This is my main config.yaml:

rules_folder: rules
run_every:
    seconds: 30
buffer_time:
    seconds: 45
es_host: helk-elasticsearch
es_port: 9200
alert_time_limit:
    days: 1
writeback_index: elastalert_status
alert_text: "Index: {0} \nEvent_Timestamp: {1} \nBeat_Name: {2} \nUser_Name: {3} \nHost_Name: {4} \nLog_Name: {5} \nOriginal_Message: \n\n{6}"
alert_text_type: alert_text_only
alert_text_args: ["_index","@timestamp","beat.name","user_name","host_name","log_name","z_original_message"]

There are a total of 20k records in my database but the data has timestamps from March/April. Therefore, my query is not even supposed to find anything in the past 1 day.

image

Qmando commented 5 years ago

Hmm...

One thing you should check is that your version of the elasticsearch Python library is > 7. Maybe this is caused by some API change.

You can see from GET http://helk-elasticsearch:9200/logs-*/_search?ignore_unavailable=true&_source_includes=%40timestamp%2C%2A&scroll=30s&size=10000 that it's setting each scroll to a 30 second timeout. 7.0 changed a bunch of APIs so maybe that assumption is no longer right.

I really don't have much expertise here but I'd probably check to see whether there are other APIs you can use to see what scrolls are active, then you could manually clear them or verify that they are coming from elastalert and not something else.

From https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-scroll.html

You can check how many search contexts are open with the nodes stats API:
GET /_nodes/stats/indices/search
Cyb3rWard0g commented 5 years ago

Regarding library being newer than 7, you are right, 7.0.1 was released a week ago: https://github.com/elastic/elasticsearch-py/releases/tag/7.0.1 👍 . I will test that first.

Cyb3rWard0g commented 5 years ago

Oh wait I am actually running the Elasticsearch python library 7.0.1

Successfully installed PyJWT-1.7.1 PyStaticConfiguration-0.10.4 PyYAML-5.1 asn1crypto-0.24.0 attrs-19.1.0 aws-requests-auth-0.4.2 blist-1.3.6 boto3-1.9.151 botocore-1.12.151 certifi-2019.3.9 cffi-1.12.3 chardet-3.0.4 configparser-3.7.4 croniter-0.3.30 cryptography-2.6.1 defusedxml-0.6.0 docopt-0.6.2 docutils-0.14 elasticsearch-7.0.1 enum34-1.1.6 envparse-0.2.0 exotel-0.1.5 funcsigs-1.0.2 functools32-3.2.3.post2 future-0.17.1 futures-3.2.0 idna-2.8 ipaddress-1.0.22 jira-1.0.14 jmespath-0.9.4 jsonschema-3.0.1 mock-3.0.5 oauthlib-3.0.1 ordereddict-1.1 pbr-5.2.0 pyOpenSSL-19.0.0 pycparser-2.19 pyrsistent-0.15.2 python-dateutil-2.6.1 python-magic-0.4.15 pytz-2019.1 requests-2.22.0 requests-oauthlib-1.2.0 requests-toolbelt-0.9.1 s3transfer-0.2.0 six-1.12.0 stomp.py-4.1.22 texttable-1.6.1 thehive4py-1.6.0 twilio-6.0.0 urllib3-1.25.2

I will check the search contexts open

Cyb3rWard0g commented 5 years ago

I get the following:

helk@ubuntu:~/HELK/docker$ sudo docker exec -ti helk-elastalert curl -X GET helk-elasticsearch:9200/_nodes/stats/indices/search?pretty
{
  "_nodes" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "cluster_name" : "helk-cluster",
  "nodes" : {
    "TioiZU7KQ6-7ibvmdohLEA" : {
      "timestamp" : 1558131129964,
      "name" : "helk-1",
      "transport_address" : "172.18.0.7:9300",
      "host" : "172.18.0.7",
      "ip" : "172.18.0.7:9300",
      "roles" : [
        "master",
        "data",
        "ingest"
      ],
      "attributes" : {
        "ml.machine_memory" : "13549408256",
        "xpack.installed" : "true",
        "ml.max_open_jobs" : "20"
      },
      "indices" : {
        "search" : {
          "open_contexts" : 3705,
          "query_total" : 261321,
          "query_time_in_millis" : 11645,
          "query_current" : 0,
          "fetch_total" : 5726,
          "fetch_time_in_millis" : 19415,
          "fetch_current" : 0,
          "scroll_total" : 246750,
          "scroll_time_in_millis" : 13122111458,
          "scroll_current" : 3705,
          "suggest_total" : 0,
          "suggest_time_in_millis" : 0,
          "suggest_current" : 0
        }
      }
    }
  }
}
Qmando commented 5 years ago

To be honest I'm not entirely sure how to interpret

          "scroll_total" : 246750,
          "scroll_time_in_millis" : 13122111458,
          "scroll_current" : 3705,

that scroll time seems ridiculously high. 13122111 seconds is 5 months!!

If you want to try and debug further, you can use --es_debug_trace es.log parameter to dump the full requests being made to Elasticsearch. If it's an ElastAlert issue, I'll try to fix asap, but I'm out of my expertise here.

Cyb3rWard0g commented 5 years ago

Thank you for the details and extra steps to try. yeah 5 months is crazy. Let me create a fresh ES server and fresh elastalert without any data in it. Does the --es_debug_trace es.log go next to the --verbose command when starting Elastalert?

Qmando commented 5 years ago

Next to --verbose is fine (order doesn't matter). You should see a bunch of stuff in there after running elastalert for a short time. Just note it prints out things with curl localhost:9200 despite whatever settings you have, a "feature" of the elasticsearch library. ;)

Cyb3rWard0g commented 5 years ago

Ok I have a fresh ES server:

helk@ubuntu:~/HELK/docker$ sudo docker exec -ti helk-elasticsearch curl -X GET "localhost:9200/_nodes/stats/indices/search?pretty"
{
  "_nodes" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "cluster_name" : "helk-cluster",
  "nodes" : {
    "Y9alb0wgTwO1Dj_8yEJwYw" : {
      "timestamp" : 1558132609174,
      "name" : "helk-1",
      "transport_address" : "172.18.0.2:9300",
      "host" : "172.18.0.2",
      "ip" : "172.18.0.2:9300",
      "roles" : [
        "master",
        "data",
        "ingest"
      ],
      "attributes" : {
        "ml.machine_memory" : "13549408256",
        "xpack.installed" : "true",
        "ml.max_open_jobs" : "20"
      },
      "indices" : {
        "search" : {
          "open_contexts" : 0,
          "query_total" : 0,
          "query_time_in_millis" : 0,
          "query_current" : 0,
          "fetch_total" : 0,
          "fetch_time_in_millis" : 0,
          "fetch_current" : 0,
          "scroll_total" : 0,
          "scroll_time_in_millis" : 0,
          "scroll_current" : 0,
          "suggest_total" : 0,
          "suggest_time_in_millis" : 0,
          "suggest_current" : 0
        }
      }
    }
  }
}

Then I have only a few rules running without any data in the database:

INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-17 22:50 UTC to 2019-05-17 22:50 UTC: 0 / 0 hits
INFO:elastalert:Ran Windows-Suspicious-Powershell-commands_0 from 2019-05-17 22:50 UTC to 2019-05-17 22:50 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Windows-Cobalt-Strike-Named-Pipes_0 from 2019-05-17 22:50 UTC to 2019-05-17 22:50 UTC: 0 / 0 hits
INFO:elastalert:Ran Windows-Cobalt-Strike-Named-Pipes_0 from 2019-05-17 22:50 UTC to 2019-05-17 22:50 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Windows-Psexec-Powershell-Module_0 from 2019-05-17 22:50 UTC to 2019-05-17 22:50 UTC: 0 / 0 hits
INFO:elastalert:Ran Windows-Psexec-Powershell-Module_0 from 2019-05-17 22:50 UTC to 2019-05-17 22:50 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Sleeping for 29.955974 seconds
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-17 22:50 UTC to 2019-05-17 22:51 UTC: 0 / 0 hits
INFO:elastalert:Ran Windows-Suspicious-Powershell-commands_0 from 2019-05-17 22:50 UTC to 2019-05-17 22:51 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Windows-Cobalt-Strike-Named-Pipes_0 from 2019-05-17 22:50 UTC to 2019-05-17 22:51 UTC: 0 / 0 hits
INFO:elastalert:Ran Windows-Cobalt-Strike-Named-Pipes_0 from 2019-05-17 22:50 UTC to 2019-05-17 22:51 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Windows-Psexec-Powershell-Module_0 from 2019-05-17 22:50 UTC to 2019-05-17 22:51 UTC: 0 / 0 hits
INFO:elastalert:Ran Windows-Psexec-Powershell-Module_0 from 2019-05-17 22:50 UTC to 2019-05-17 22:51 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Sleeping for 29.913904 seconds
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-17 22:51 UTC to 2019-05-17 22:51 UTC: 0 / 0 hits
INFO:elastalert:Ran Windows-Suspicious-Powershell-commands_0 from 2019-05-17 22:51 UTC to 2019-05-17 22:51 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Windows-Cobalt-Strike-Named-Pipes_0 from 2019-05-17 22:51 UTC to 2019-05-17 22:51 UTC: 0 / 0 hits
INFO:elastalert:Ran Windows-Cobalt-Strike-Named-Pipes_0 from 2019-05-17 22:51 UTC to 2019-05-17 22:51 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Windows-Psexec-Powershell-Module_0 from 2019-05-17 22:51 UTC to 2019-05-17 22:51 UTC: 0 / 0 hits
INFO:elastalert:Ran Windows-Psexec-Powershell-Module_0 from 2019-05-17 22:51 UTC to 2019-05-17 22:51 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Sleeping for 29.960197 seconds

Im also tailing the es.log file:

curl -XGET 'http://localhost:9200/?pretty' -d ''
curl -H 'Content-Type: application/json' -XGET 'http://localhost:9200/logs-endpoint-winevent-*/_search?pretty&ignore_unavailable=true&_source_includes=%40timestamp%2C%2A&scroll=30s&size=10000' -d '{
  "query": {
    "bool": {
      "filter": {
        "bool": {
          "must": [
            {
              "range": {
                "@timestamp": {
                  "gt": "2019-05-17T22:51:36.602383Z",
                  "lte": "2019-05-17T22:52:21.602383Z"
                }
              }
            },
            {
              "query_string": {
                "query": "registry_key_value:(COMSPEC OR powershell.exe) OR service_image_path:(COMSPEC OR powershell.exe)"
              }
            }
          ]
        }
      }
    }
  },
  "sort": [
    {
      "@timestamp": {
        "order": "asc"
      }
    }
  ]
}'
curl -H 'Content-Type: application/json' -XPOST 'http://localhost:9200/elastalert_status_status/_doc?pretty' -d '{
  "@timestamp": "2019-05-17T22:52:21.614659Z",
  "endtime": "2019-05-17T22:52:21.602383Z",
  "hits": 0,
  "matches": 0,
  "rule_name": "Windows-Psexec-Powershell-Module_0",
  "starttime": "2019-05-17T22:51:36.602383Z",
  "time_taken": 0.012163877487182617
}'

After a few minutes I checked the search context open and these are the results. Without data in the database:

{
  "_nodes" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "cluster_name" : "helk-cluster",
  "nodes" : {
    "Y9alb0wgTwO1Dj_8yEJwYw" : {
      "timestamp" : 1558133480144,
      "name" : "helk-1",
      "transport_address" : "172.18.0.2:9300",
      "host" : "172.18.0.2",
      "ip" : "172.18.0.2:9300",
      "roles" : [
        "master",
        "data",
        "ingest"
      ],
      "attributes" : {
        "ml.machine_memory" : "13549408256",
        "xpack.installed" : "true",
        "ml.max_open_jobs" : "20"
      },
      "indices" : {
        "search" : {
          "open_contexts" : 0,
          "query_total" : 1022,
          "query_time_in_millis" : 653,
          "query_current" : 0,
          "fetch_total" : 1022,
          "fetch_time_in_millis" : 401,
          "fetch_current" : 0,
          "scroll_total" : 0,
          "scroll_time_in_millis" : 0,
          "scroll_current" : 0,
          "suggest_total" : 0,
          "suggest_time_in_millis" : 0,
          "suggest_current" : 0
        }
      }
    }
  }
}
Cyb3rWard0g commented 5 years ago

I now sent some data to the database:

{
  "_nodes" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "cluster_name" : "helk-cluster",
  "nodes" : {
    "Y9alb0wgTwO1Dj_8yEJwYw" : {
      "timestamp" : 1558133755977,
      "name" : "helk-1",
      "transport_address" : "172.18.0.2:9300",
      "host" : "172.18.0.2",
      "ip" : "172.18.0.2:9300",
      "roles" : [
        "master",
        "data",
        "ingest"
      ],
      "attributes" : {
        "ml.machine_memory" : "13549408256",
        "xpack.installed" : "true",
        "ml.max_open_jobs" : "20"
      },
      "indices" : {
        "search" : {
          "open_contexts" : 51,
          "query_total" : 1266,
          "query_time_in_millis" : 1291,
          "query_current" : 0,
          "fetch_total" : 1196,
          "fetch_time_in_millis" : 1738,
          "fetch_current" : 0,
          "scroll_total" : 17,
          "scroll_time_in_millis" : 716175,
          "scroll_current" : 51,
          "suggest_total" : 0,
          "suggest_time_in_millis" : 0,
          "suggest_current" : 0
        }
      }
    }
  }
}

I dont see any of the error messages from before:

INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-17 22:54 UTC to 2019-05-17 22:54 UTC: 0 / 0 hits
INFO:elastalert:Ran Windows-Suspicious-Powershell-commands_0 from 2019-05-17 22:54 UTC to 2019-05-17 22:54 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Windows-Cobalt-Strike-Named-Pipes_0 from 2019-05-17 22:54 UTC to 2019-05-17 22:54 UTC: 0 / 0 hits
INFO:elastalert:Ran Windows-Cobalt-Strike-Named-Pipes_0 from 2019-05-17 22:54 UTC to 2019-05-17 22:54 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Windows-Psexec-Powershell-Module_0 from 2019-05-17 22:54 UTC to 2019-05-17 22:54 UTC: 0 / 0 hits
INFO:elastalert:Ran Windows-Psexec-Powershell-Module_0 from 2019-05-17 22:54 UTC to 2019-05-17 22:54 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Sleeping for 29.922236 seconds
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-17 22:54 UTC to 2019-05-17 22:55 UTC: 0 / 0 hits
INFO:elastalert:Ran Windows-Suspicious-Powershell-commands_0 from 2019-05-17 22:54 UTC to 2019-05-17 22:55 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Windows-Cobalt-Strike-Named-Pipes_0 from 2019-05-17 22:54 UTC to 2019-05-17 22:55 UTC: 0 / 0 hits
INFO:elastalert:Ran Windows-Cobalt-Strike-Named-Pipes_0 from 2019-05-17 22:54 UTC to 2019-05-17 22:55 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Windows-Psexec-Powershell-Module_0 from 2019-05-17 22:54 UTC to 2019-05-17 22:55 UTC: 0 / 0 hits
INFO:elastalert:Ran Windows-Psexec-Powershell-Module_0 from 2019-05-17 22:54 UTC to 2019-05-17 22:55 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Sleeping for 29.953053 seconds
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-17 22:55 UTC to 2019-05-17 22:55 UTC: 0 / 0 hits
INFO:elastalert:Ran Windows-Suspicious-Powershell-commands_0 from 2019-05-17 22:55 UTC to 2019-05-17 22:55 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Windows-Cobalt-Strike-Named-Pipes_0 from 2019-05-17 22:55 UTC to 2019-05-17 22:55 UTC: 0 / 0 hits
INFO:elastalert:Ran Windows-Cobalt-Strike-Named-Pipes_0 from 2019-05-17 22:55 UTC to 2019-05-17 22:55 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Windows-Psexec-Powershell-Module_0 from 2019-05-17 22:55 UTC to 2019-05-17 22:55 UTC: 0 / 0 hits
INFO:elastalert:Ran Windows-Psexec-Powershell-Module_0 from 2019-05-17 22:55 UTC to 2019-05-17 22:55 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Sleeping for 29.95311 seconds
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-17 22:55 UTC to 2019-05-17 22:56 UTC: 0 / 0 hits
INFO:elastalert:Ran Windows-Suspicious-Powershell-commands_0 from 2019-05-17 22:55 UTC to 2019-05-17 22:56 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Windows-Cobalt-Strike-Named-Pipes_0 from 2019-05-17 22:55 UTC to 2019-05-17 22:56 UTC: 0 / 0 hits
INFO:elastalert:Ran Windows-Cobalt-Strike-Named-Pipes_0 from 2019-05-17 22:55 UTC to 2019-05-17 22:56 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Windows-Psexec-Powershell-Module_0 from 2019-05-17 22:55 UTC to 2019-05-17 22:56 UTC: 0 / 0 hits
INFO:elastalert:Ran Windows-Psexec-Powershell-Module_0 from 2019-05-17 22:55 UTC to 2019-05-17 22:56 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Sleeping for 29.941673 seconds

I checked the es.log:

curl -H 'Content-Type: application/json' -XGET 'http://localhost:9200/logs-endpoint-winevent-*/_search?pretty&ignore_unavailable=true&_source_includes=%40timestamp%2C%2A&scroll=30s&size=10000' -d '{
  "query": {
    "bool": {
      "filter": {
        "bool": {
          "must": [
            {
              "range": {
                "@timestamp": {
                  "gt": "2019-05-17T22:56:06.759775Z",
                  "lte": "2019-05-17T22:56:51.759775Z"
                }
              }
            },
            {
              "query_string": {
                "query": "registry_key_value:(COMSPEC OR powershell.exe) OR service_image_path:(COMSPEC OR powershell.exe)"
              }
            }
          ]
        }
      }
    }
  },
  "sort": [
    {
      "@timestamp": {
        "order": "asc"
      }
    }
  ]
}'
curl -H 'Content-Type: application/json' -XPOST 'http://localhost:9200/elastalert_status_status/_doc?pretty' -d '{
  "@timestamp": "2019-05-17T22:56:51.770938Z",
  "endtime": "2019-05-17T22:56:51.759775Z",
  "hits": 0,
  "matches": 0,
  "rule_name": "Windows-Psexec-Powershell-Module_0",
  "starttime": "2019-05-17T22:56:06.759775Z",
  "time_taken": 0.011156082153320312
}'

Everything looks normal at the moment.

Cyb3rWard0g commented 5 years ago

The only difference is the number of rules I am using. I am only running 3 rules at the moment. I believe some rules might be triggering that behavior. I have around 210 rules besides the 3 I used for this test.

Cyb3rWard0g commented 5 years ago

I let my ELK stack and Elastalert running the whole night and I got the same scroll errors with only 3 rules:

INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 01:48 UTC to 2019-05-18 01:48 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 01:48 UTC to 2019-05-18 01:49 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 01:49 UTC to 2019-05-18 01:50 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 01:50 UTC to 2019-05-18 01:51 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 01:51 UTC to 2019-05-18 01:51 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 01:51 UTC to 2019-05-18 01:52 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 01:52 UTC to 2019-05-18 01:53 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 01:53 UTC to 2019-05-18 01:54 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 01:54 UTC to 2019-05-18 01:54 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 01:54 UTC to 2019-05-18 01:55 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 01:55 UTC to 2019-05-18 01:56 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 01:56 UTC to 2019-05-18 01:57 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 01:57 UTC to 2019-05-18 01:57 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 01:57 UTC to 2019-05-18 01:58 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 01:58 UTC to 2019-05-18 01:59 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 01:59 UTC to 2019-05-18 02:00 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:00 UTC to 2019-05-18 02:00 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:00 UTC to 2019-05-18 02:01 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:01 UTC to 2019-05-18 02:02 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:02 UTC to 2019-05-18 02:03 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:03 UTC to 2019-05-18 02:03 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:03 UTC to 2019-05-18 02:04 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:04 UTC to 2019-05-18 02:05 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:05 UTC to 2019-05-18 02:06 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:06 UTC to 2019-05-18 02:06 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:06 UTC to 2019-05-18 02:07 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:07 UTC to 2019-05-18 02:08 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:08 UTC to 2019-05-18 02:09 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:09 UTC to 2019-05-18 02:09 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:09 UTC to 2019-05-18 02:10 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:10 UTC to 2019-05-18 02:11 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:11 UTC to 2019-05-18 02:12 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:12 UTC to 2019-05-18 02:12 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:12 UTC to 2019-05-18 02:13 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:13 UTC to 2019-05-18 02:14 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:14 UTC to 2019-05-18 02:15 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:15 UTC to 2019-05-18 02:15 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:15 UTC to 2019-05-18 02:16 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:16 UTC to 2019-05-18 02:17 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:17 UTC to 2019-05-18 02:18 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:18 UTC to 2019-05-18 02:18 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:18 UTC to 2019-05-18 02:19 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:19 UTC to 2019-05-18 02:20 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:20 UTC to 2019-05-18 02:21 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:21 UTC to 2019-05-18 02:21 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:21 UTC to 2019-05-18 02:22 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:22 UTC to 2019-05-18 02:23 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:23 UTC to 2019-05-18 02:24 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:24 UTC to 2019-05-18 02:24 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:24 UTC to 2019-05-18 02:25 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:25 UTC to 2019-05-18 02:26 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:26 UTC to 2019-05-18 02:27 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:27 UTC to 2019-05-18 02:27 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:27 UTC to 2019-05-18 02:28 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:28 UTC to 2019-05-18 02:29 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:29 UTC to 2019-05-18 02:30 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:30 UTC to 2019-05-18 02:30 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:30 UTC to 2019-05-18 02:31 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:31 UTC to 2019-05-18 02:32 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:32 UTC to 2019-05-18 02:33 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:33 UTC to 2019-05-18 02:33 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:33 UTC to 2019-05-18 02:34 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:34 UTC to 2019-05-18 02:35 UTC: 0 / 0 hits
WARNING:elasticsearch:GET http://helk-elasticsearch:9200/logs-endpoint-winevent-*/_search?ignore_unavailable=true&_source_includes=%40timestamp%2C%2A&scroll=30s&size=10000 [status:500 request:0.012s]
ERROR:root:Error running query: TransportError(500, u'search_phase_execution_exception', u'Trying to create too many scroll contexts. Must be less than or equal to: [500]. This limit can be set by changing the [search.max_open_scroll_context] setting.')
INFO:elastalert:Ran Windows-Suspicious-Powershell-commands_0 from 2019-05-18 01:48 UTC to 2019-05-18 13:53 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
WARNING:elasticsearch:GET http://helk-elasticsearch:9200/logs-endpoint-winevent-sysmon-*/_search?ignore_unavailable=true&_source_includes=%40timestamp%2C%2A&scroll=30s&size=10000 [status:500 request:0.005s]
ERROR:root:Error running query: TransportError(500, u'search_phase_execution_exception', u'Trying to create too many scroll contexts. Must be less than or equal to: [500]. This limit can be set by changing the [search.max_open_scroll_context] setting.')
INFO:elastalert:Ran Windows-Cobalt-Strike-Named-Pipes_0 from 2019-05-18 01:48 UTC to 2019-05-18 13:53 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
WARNING:elasticsearch:GET http://helk-elasticsearch:9200/logs-endpoint-winevent-*/_search?ignore_unavailable=true&_source_includes=%40timestamp%2C%2A&scroll=30s&size=10000 [status:500 request:0.015s]
ERROR:root:Error running query: TransportError(500, u'search_phase_execution_exception', u'Trying to create too many scroll contexts. Must be less than or equal to: [500]. This limit can be set by changing the [search.max_open_scroll_context] setting.')
INFO:elastalert:Ran Windows-Psexec-Powershell-Module_0 from 2019-05-18 01:48 UTC to 2019-05-18 13:53 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Sleeping for 29.569125 seconds
Cyb3rWard0g commented 5 years ago

This is what I got from the es.log

curl -H 'Content-Type: application/json' -XPOST 'http://localhost:9200/elastalert_status_error/_doc?pretty' -d '{
  "@timestamp": "2019-05-18T13:54:53.171471Z",
  "data": {
    "query": {
      "query": {
        "bool": {
          "filter": {
            "bool": {
              "must": [
                {
                  "range": {
                    "@timestamp": {
                      "gt": "2019-05-18T01:48:01.123530Z",
                      "lte": "2019-05-18T01:48:46.123530Z"
                    }
                  }
                },
                {
                  "query_string": {
                    "query": "registry_key_value:(COMSPEC OR powershell.exe) OR service_image_path:(COMSPEC OR powershell.exe)"
                  }
                }
              ]
            }
          }
        }
      },
      "sort": [
        {
          "@timestamp": {
            "order": "asc"
          }
        }
      ]
    },
    "rule": "Windows-Psexec-Powershell-Module_0"
  },
  "message": "Error running query: TransportError(500, u\u0027search_phase_execution_exception\u0027, u\u0027Trying to create too many scroll contexts. Must be less than or equal to: [500]. This limit can be set by changing the [search.max_open_scroll_context] setting.\u0027)",
  "traceback": [
    "Traceback (most recent call last):",
    "  File \"/usr/local/lib/python2.7/dist-packages/elastalert-0.1.39-py2.7.egg/elastalert/elastalert.py\", line 352, in get_hits",
    "    **extra_args",
    "  File \"/usr/local/lib/python2.7/dist-packages/elasticsearch/client/utils.py\", line 84, in _wrapped",
    "    return func(*args, params=params, **kwargs)",
    "  File \"/usr/local/lib/python2.7/dist-packages/elasticsearch/client/__init__.py\", line 818, in search",
    "    \"GET\", _make_path(index, \"_search\"), params=params, body=body",
    "  File \"/usr/local/lib/python2.7/dist-packages/elasticsearch/transport.py\", line 353, in perform_request",
    "    timeout=timeout,",
    "  File \"/usr/local/lib/python2.7/dist-packages/elasticsearch/connection/http_requests.py\", line 143, in perform_request",
    "    self._raise_error(response.status_code, raw_data)",
    "  File \"/usr/local/lib/python2.7/dist-packages/elasticsearch/connection/base.py\", line 168, in _raise_error",
    "    status_code, error_message, additional_info",
    "TransportError: TransportError(500, u\u0027search_phase_execution_exception\u0027, u\u0027Trying to create too many scroll contexts. Must be less than or equal to: [500]. This limit can be set by changing the [search.max_open_scroll_context] setting.\u0027)"
  ]
}'
Cyb3rWard0g commented 5 years ago

This is what I got also from the scroll context:

helk@ubuntu:~$ sudo docker exec -ti helk-elasticsearch curl -X GET "localhost:9200/_nodes/stats/indices/search?pretty"
{
  "_nodes" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "cluster_name" : "helk-cluster",
  "nodes" : {
    "Y9alb0wgTwO1Dj_8yEJwYw" : {
      "timestamp" : 1558187755265,
      "name" : "helk-1",
      "transport_address" : "172.18.0.2:9300",
      "host" : "172.18.0.2",
      "ip" : "172.18.0.2:9300",
      "roles" : [
        "master",
        "data",
        "ingest"
      ],
      "attributes" : {
        "ml.machine_memory" : "13549408256",
        "xpack.installed" : "true",
        "ml.max_open_jobs" : "20"
      },
      "indices" : {
        "search" : {
          "open_contexts" : 500,
          "query_total" : 19616,
          "query_time_in_millis" : 2208,
          "query_current" : 0,
          "fetch_total" : 6052,
          "fetch_time_in_millis" : 2648,
          "fetch_current" : 0,
          "scroll_total" : 13402,
          "scroll_time_in_millis" : 729151894,
          "scroll_current" : 500,
          "suggest_total" : 0,
          "suggest_time_in_millis" : 0,
          "suggest_current" : 0
        }
      }
    }
  }
}
Cyb3rWard0g commented 5 years ago

I only have around 1k records but from March image

Is there anything else that you would recommend to test to provide information? @Qmando

Cyb3rWard0g commented 5 years ago

Something that I noticed was the time widows used for the queries:

INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:32 UTC to 2019-05-18 02:33 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:33 UTC to 2019-05-18 02:33 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:33 UTC to 2019-05-18 02:34 UTC: 0 / 0 hits
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 02:34 UTC to 2019-05-18 02:35 UTC: 0 / 0 hits

I see from 14H 34 mins to 14H 35 mins. I am still learning Elastalert so I wasnt sure if that is the most ideal. This is part of my Elastalert config:

rules_folder: rules
run_every:
    seconds: 30
buffer_time:
    seconds: 45
es_host: helk-elasticsearch
es_port: 9200
alert_time_limit:
    days: 1

Is the 1 mins time window because of the buffer_time? -UPDATE: It was buffer time. Now I have a 15mins window per query.

Cyb3rWard0g commented 5 years ago

This is what the config.yaml.example look like:

run_every:
  minutes: 1

# ElastAlert will buffer results from the most recent
# period of time, in case some log sources are not in real time
buffer_time:
  minutes: 15

# If an alert fails for some reason, ElastAlert will retry
# sending the alert until this time period has elapsed
alert_time_limit:
  days: 2

I will change it to the defaults for now and let it run

Cyb3rWard0g commented 5 years ago

Ok I updated the following to 1m:

scroll_keepalive: The maximum time (formatted in Time Units) the scrolling context should be kept alive. Avoid using high values as it abuses resources in Elasticsearch, but be mindful to allow sufficient time to finish processing all the results.

the examples provided by Elastic docs: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-scroll.html

That changed has stopped so far the error messages:

INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 14:33 UTC to 2019-05-18 14:48 UTC: 0 / 0 hits
INFO:elastalert:Ran Windows-Suspicious-Powershell-commands_0 from 2019-05-18 14:33 UTC to 2019-05-18 14:48 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Windows-Cobalt-Strike-Named-Pipes_0 from 2019-05-18 14:33 UTC to 2019-05-18 14:48 UTC: 0 / 0 hits
INFO:elastalert:Ran Windows-Cobalt-Strike-Named-Pipes_0 from 2019-05-18 14:33 UTC to 2019-05-18 14:48 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Windows-Psexec-Powershell-Module_0 from 2019-05-18 14:33 UTC to 2019-05-18 14:48 UTC: 0 / 0 hits
INFO:elastalert:Ran Windows-Psexec-Powershell-Module_0 from 2019-05-18 14:33 UTC to 2019-05-18 14:48 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Sleeping for 59.953273 seconds
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 14:34 UTC to 2019-05-18 14:49 UTC: 0 / 0 hits
INFO:elastalert:Ran Windows-Suspicious-Powershell-commands_0 from 2019-05-18 14:34 UTC to 2019-05-18 14:49 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Windows-Cobalt-Strike-Named-Pipes_0 from 2019-05-18 14:34 UTC to 2019-05-18 14:49 UTC: 0 / 0 hits
INFO:elastalert:Ran Windows-Cobalt-Strike-Named-Pipes_0 from 2019-05-18 14:34 UTC to 2019-05-18 14:49 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Windows-Psexec-Powershell-Module_0 from 2019-05-18 14:34 UTC to 2019-05-18 14:49 UTC: 0 / 0 hits
INFO:elastalert:Ran Windows-Psexec-Powershell-Module_0 from 2019-05-18 14:34 UTC to 2019-05-18 14:49 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Sleeping for 59.933993 seconds
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 14:35 UTC to 2019-05-18 14:50 UTC: 0 / 0 hits
INFO:elastalert:Ran Windows-Suspicious-Powershell-commands_0 from 2019-05-18 14:35 UTC to 2019-05-18 14:50 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Windows-Cobalt-Strike-Named-Pipes_0 from 2019-05-18 14:35 UTC to 2019-05-18 14:50 UTC: 0 / 0 hits
INFO:elastalert:Ran Windows-Cobalt-Strike-Named-Pipes_0 from 2019-05-18 14:35 UTC to 2019-05-18 14:50 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Windows-Psexec-Powershell-Module_0 from 2019-05-18 14:35 UTC to 2019-05-18 14:50 UTC: 0 / 0 hits
INFO:elastalert:Ran Windows-Psexec-Powershell-Module_0 from 2019-05-18 14:35 UTC to 2019-05-18 14:50 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Sleeping for 59.929616 seconds
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 14:36 UTC to 2019-05-18 14:51 UTC: 0 / 0 hits
INFO:elastalert:Ran Windows-Suspicious-Powershell-commands_0 from 2019-05-18 14:36 UTC to 2019-05-18 14:51 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Windows-Cobalt-Strike-Named-Pipes_0 from 2019-05-18 14:36 UTC to 2019-05-18 14:51 UTC: 0 / 0 hits
INFO:elastalert:Ran Windows-Cobalt-Strike-Named-Pipes_0 from 2019-05-18 14:36 UTC to 2019-05-18 14:51 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Windows-Psexec-Powershell-Module_0 from 2019-05-18 14:36 UTC to 2019-05-18 14:51 UTC: 0 / 0 hits
INFO:elastalert:Ran Windows-Psexec-Powershell-Module_0 from 2019-05-18 14:36 UTC to 2019-05-18 14:51 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Sleeping for 59.952816 seconds
Cyb3rWard0g commented 5 years ago

I added all the rules I use (around 210 rules) to it and I started getting all the errors about scrolling again. Something that I noticed on several rules that were potentially causing issues is that the rules were being applied to multiple indices at once. This is what I mean. i have the following indices: image As you can see, there is one logs-* that points to all of them. 119 rules have specific indices such as logs-endpoint-winevent-security-* or logs-endpoint-winevent-sysmon-* . However, the rest point to logs-*. I took all the elastalert rules with index logs-* and now it seems to be stable:

INFO:elastalert:Queried rule PowerShell-Downgrade-Attack_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 / 0 hits
INFO:elastalert:Ran PowerShell-Downgrade-Attack_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule PowerShell-Network-Connections_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 / 0 hits
INFO:elastalert:Ran PowerShell-Network-Connections_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Registry-Persistence-Mechanisms_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 / 0 hits
INFO:elastalert:Ran Registry-Persistence-Mechanisms_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule RDP-over-Reverse-SSH-Tunnel-WFP_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 / 0 hits
INFO:elastalert:Ran RDP-over-Reverse-SSH-Tunnel-WFP_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Addition-of-SID-History-to-Active-Directory-Object_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 / 0 hits
INFO:elastalert:Ran Addition-of-SID-History-to-Active-Directory-Object_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule UAC-Bypass-via-sdclt_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 / 0 hits
INFO:elastalert:Ran UAC-Bypass-via-sdclt_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule WMI-Event-Subscription_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 / 0 hits
INFO:elastalert:Ran WMI-Event-Subscription_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Suspicious-File-Characteristics-due-to-Missing-Fields_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 / 0 hits
INFO:elastalert:Ran Suspicious-File-Characteristics-due-to-Missing-Fields_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Weak-Encryption-Enabled-and-Kerberoast_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 / 0 hits
INFO:elastalert:Ran Weak-Encryption-Enabled-and-Kerberoast_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule UAC-Bypass-via-Event-Viewer_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 / 0 hits
INFO:elastalert:Ran UAC-Bypass-via-Event-Viewer_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Ursnif_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 / 0 hits
INFO:elastalert:Ran Ursnif_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule NTLM-Logon_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 / 0 hits
INFO:elastalert:Ran NTLM-Logon_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Ran Multiple-Failed-Logins-with-Different-Accounts-from-Single-Source-System_1 from 2019-05-17 22:38 UTC to 2019-05-18 15:34 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Suspicious-PowerShell-Keywords_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 / 0 hits
INFO:elastalert:Ran Suspicious-PowerShell-Keywords_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule DHCP-Callout-DLL-installation_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 / 0 hits
INFO:elastalert:Ran DHCP-Callout-DLL-installation_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Microsoft-Malware-Protection-Engine-Crash_0 from 2019-05-18 15:23 UTC to 2019-05-18 15:34 UTC: 0 / 0 hits
INFO:elastalert:Ran Microsoft-Malware-Protection-Engine-Crash_0 from 2019-05-18 15:23 UTC to 2019-05-18 15:34 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule DHCP-Server-Loaded-the-CallOut-DLL_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 / 0 hits
INFO:elastalert:Ran DHCP-Server-Loaded-the-CallOut-DLL_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Hacktool-Use_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 / 0 hits
INFO:elastalert:Ran Hacktool-Use_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Sticky-Key-Like-Backdoor-Usage-2_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 / 0 hits
INFO:elastalert:Ran Sticky-Key-Like-Backdoor-Usage-2_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule PowerShell-Rundll32-Remote-Thread-Creation_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 / 0 hits
INFO:elastalert:Ran PowerShell-Rundll32-Remote-Thread-Creation_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Suspicious-Program-Location-with-Network-Connections_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 / 0 hits
INFO:elastalert:Ran Suspicious-Program-Location-with-Network-Connections_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Kerberos-Manipulation_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 / 0 hits
INFO:elastalert:Ran Kerberos-Manipulation_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Windows-Suspicious-Powershell-commands_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 / 0 hits
INFO:elastalert:Ran Windows-Suspicious-Powershell-commands_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Windows-Cobalt-Strike-Named-Pipes_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 / 0 hits
INFO:elastalert:Ran Windows-Cobalt-Strike-Named-Pipes_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Windows-Psexec-Powershell-Module_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 / 0 hits
INFO:elastalert:Ran Windows-Psexec-Powershell-Module_0 from 2019-05-18 15:19 UTC to 2019-05-18 15:34 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Sleeping for 58.543264 seconds

Do you think this is an Elasticsearch issue at this point?. This is my updated Elastalert config:

rules_folder: rules
run_every:
    minutes: 1
buffer_time:
    minutes: 15
es_host: helk-elasticsearch
es_port: 9200
alert_time_limit:
    days: 1
scroll_keepalive: 1m
writeback_index: elastalert_status
alert_text: "Index: {0} \nEvent_Timestamp: {1} \nBeat_Name: {2} \nUser_Name: {3} \nHost_Name: {4} \nLog_Name: {5} \nOriginal_Message: \n\n{6}"
alert_text_type: alert_text_only
alert_text_args: ["_index","@timestamp","beat.name","user_name","host_name","log_name","z_original_message"]
Cyb3rWard0g commented 5 years ago

This is my latest scroll context info from Elasticsearch:

{
  "_nodes" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "cluster_name" : "helk-cluster",
  "nodes" : {
    "Y9alb0wgTwO1Dj_8yEJwYw" : {
      "timestamp" : 1558193785858,
      "name" : "helk-1",
      "transport_address" : "172.18.0.2:9300",
      "host" : "172.18.0.2",
      "ip" : "172.18.0.2:9300",
      "roles" : [
        "master",
        "data",
        "ingest"
      ],
      "attributes" : {
        "ml.machine_memory" : "13549408256",
        "xpack.installed" : "true",
        "ml.max_open_jobs" : "20"
      },
      "indices" : {
        "search" : {
          "open_contexts" : 490,
          "query_total" : 74786,
          "query_time_in_millis" : 3959,
          "query_current" : 0,
          "fetch_total" : 11022,
          "fetch_time_in_millis" : 3548,
          "fetch_current" : 0,
          "scroll_total" : 32160,
          "scroll_time_in_millis" : 2014238116,
          "scroll_current" : 490,
          "suggest_total" : 0,
          "suggest_time_in_millis" : 0,
          "suggest_current" : 0
        }
      }
    }
  }
}

scroll_time: 23 days. What does that mean?

Cyb3rWard0g commented 5 years ago

I started ingesting logs now to my ELK stack in real time and I started getting the error messages regarding scrolling:

INFO:elastalert:Ran Backup-Catalog-Deleted_0 from 2019-05-18 15:27 UTC to 2019-05-18 15:42 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule Sticky-Key-Like-Backdoor-Usage_0 from 2019-05-18 15:27 UTC to 2019-05-18 15:42 UTC: 0 / 0 hits
INFO:elastalert:Ran Sticky-Key-Like-Backdoor-Usage_0 from 2019-05-18 15:27 UTC to 2019-05-18 15:42 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule User-Added-to-Local-Administrators_0 from 2019-05-18 15:27 UTC to 2019-05-18 15:42 UTC: 0 / 0 hits
INFO:elastalert:Ran User-Added-to-Local-Administrators_0 from 2019-05-18 15:27 UTC to 2019-05-18 15:42 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Ran Rare-Schtasks-Creations_0 from 2019-05-17 22:38 UTC to 2019-05-18 15:42 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule WMI-Persistence_0 from 2019-05-18 15:27 UTC to 2019-05-18 15:42 UTC: 0 / 0 hits
INFO:elastalert:Ran WMI-Persistence_0 from 2019-05-18 15:27 UTC to 2019-05-18 15:42 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule PsExec-Tool-Execution_0 from 2019-05-18 15:27 UTC to 2019-05-18 15:42 UTC: 0 / 0 hits
INFO:elastalert:Ran PsExec-Tool-Execution_0 from 2019-05-18 15:27 UTC to 2019-05-18 15:42 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
INFO:elastalert:Queried rule DNS-Server-Error-Failed-Loading-the-ServerLevelPluginDLL_0 from 2019-05-18 15:27 UTC to 2019-05-18 15:42 UTC: 0 / 0 hits
INFO:elastalert:Ran DNS-Server-Error-Failed-Loading-the-ServerLevelPluginDLL_0 from 2019-05-18 15:27 UTC to 2019-05-18 15:42 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
WARNING:elasticsearch:GET http://helk-elasticsearch:9200/logs-*/_search?ignore_unavailable=true&_source_includes=%40timestamp%2C%2A&scroll=1m&size=10000 [status:500 request:0.008s]
ERROR:root:Error running query: TransportError(500, u'search_phase_execution_exception', u'Trying to create too many scroll contexts. Must be less than or equal to: [500]. This limit can be set by changing the [search.max_open_scroll_context] setting.')
INFO:elastalert:Ran CMSTP-Execution-2_0 from 2019-05-18 15:27 UTC to 2019-05-18 15:42 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
WARNING:elasticsearch:GET http://helk-elasticsearch:9200/logs-endpoint-winevent-system-*/_search?ignore_unavailable=true&_source_includes=%40timestamp%2C%2A&scroll=1m&size=10000 [status:500 request:0.005s]
ERROR:root:Error running query: TransportError(500, u'search_phase_execution_exception', u'Trying to create too many scroll contexts. Must be less than or equal to: [500]. This limit can be set by changing the [search.max_open_scroll_context] setting.')
INFO:elastalert:Ran SAM-Dump-to-AppData_0 from 2019-05-18 15:27 UTC to 2019-05-18 15:42 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
WARNING:elasticsearch:GET http://helk-elasticsearch:9200/logs-endpoint-winevent-security-*/_search?ignore_unavailable=true&_source_includes=%40timestamp%2C%2A&scroll=1m&size=10000 [status:500 request:0.003s]
ERROR:root:Error running query: TransportError(500, u'search_phase_execution_exception', u'Trying to create too many scroll contexts. Must be less than or equal to: [500]. This limit can be set by changing the [search.max_open_scroll_context] setting.')
INFO:elastalert:Ran Active-Directory-User-Backdoors_0 from 2019-05-18 15:27 UTC to 2019-05-18 15:42 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
WARNING:elasticsearch:GET http://helk-elasticsearch:9200/logs-endpoint-winevent-security-*/_search?ignore_unavailable=true&_source_includes=%40timestamp%2C%2A&scroll=1m&size=10000 [status:500 request:0.005s]
ERROR:root:Error running query: TransportError(500, u'search_phase_execution_exception', u'Trying to create too many scroll contexts. Must be less than or equal to: [500]. This limit can be set by changing the [search.max_open_scroll_context] setting.')
INFO:elastalert:Ran NetNTLM-Downgrade-Attack-2_0 from 2019-05-18 15:27 UTC to 2019-05-18 15:42 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
WARNING:elasticsearch:GET http://helk-elasticsearch:9200/logs-endpoint-winevent-sysmon-*/_search?ignore_unavailable=true&_source_includes=%40timestamp%2C%2A&scroll=1m&size=10000 [status:500 request:0.004s]
ERROR:root:Error running query: TransportError(500, u'search_phase_execution_exception', u'Trying to create too many scroll contexts. Must be less than or equal to: [500]. This limit can be set by changing the [search.max_open_scroll_context] setting.')
INFO:elastalert:Ran Detection-of-SafetyKatz_0 from 2019-05-18 15:27 UTC to 2019-05-18 15:42 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
WARNING:elasticsearch:GET http://helk-elasticsearch:9200/logs-endpoint-winevent-system-*/_search?ignore_unavailable=true&_source_includes=%40timestamp%2C%2A&scroll=1m&size=10000 [status:500 request:0.008s]
ERROR:root:Error running query: TransportError(500, u'search_phase_execution_exception', u'Trying to create too many scroll contexts. Must be less than or equal to: [500]. This limit can be set by changing the [search.max_open_scroll_context] setting.')
INFO:elastalert:Ran Eventlog-Cleared_0 from 2019-05-18 15:27 UTC to 2019-05-18 15:42 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
WARNING:elasticsearch:GET http://helk-elasticsearch:9200/logs-endpoint-winevent-security-*/_search?ignore_unavailable=true&_source_includes=%40timestamp%2C%2A&scroll=1m&size=10000 [status:500 request:0.004s]
ERROR:root:Error running query: TransportError(500, u'search_phase_execution_exception', u'Trying to create too many scroll contexts. Must be less than or equal to: [500]. This limit can be set by changing the [search.max_open_scroll_context] setting.')
INFO:elastalert:Ran Remote-Task-Creation-via-ATSVC-named-pipe_0 from 2019-05-18 15:27 UTC to 2019-05-18 15:42 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
WARNING:elasticsearch:GET http://helk-elasticsearch:9200/logs-*/_search?ignore_unavailable=true&_source_includes=%40timestamp%2C%2A&scroll=1m&size=10000 [status:500 request:0.008s]
ERROR:root:Error running query: TransportError(500, u'search_phase_execution_exception', u'Trying to create too many scroll contexts. Must be less than or equal to: [500]. This limit can be set by changing the [search.max_open_scroll_context] setting.')
INFO:elastalert:Ran Usage-of-Sysinternals-Tools-2_0 from 2019-05-18 15:27 UTC to 2019-05-18 15:42 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
WARNING:elasticsearch:GET http://helk-elasticsearch:9200/logs-endpoint-winevent-security-*/_search?ignore_unavailable=true&_source_includes=%40timestamp%2C%2A&scroll=1m&size=10000 [status:500 request:0.004s]
ERROR:root:Error running query: TransportError(500, u'search_phase_execution_exception', u'Trying to create too many scroll contexts. Must be less than or equal to: [500]. This limit can be set by changing the [search.max_open_scroll_context] setting.')
INFO:elastalert:Ran Unauthorized-System-Time-Modification_0 from 2019-05-18 15:27 UTC to 2019-05-18 15:42 UTC: 0 query hits (0 already seen), 0 matches, 0 alerts sent
WARNING:elasticsearch:GET http://helk-elasticsearch:9200/logs-endpoint-winevent-security-*/_search?ignore_unavailable=true&_source_includes=%40timestamp%2C%2A&scroll=1m&size=10000 [status:500 request:0.004s]
Cyb3rWard0g commented 5 years ago

I dont know whats going on from an Elastalert perspective. I will open an issue in elasticsearch too and linked this

Cyb3rWard0g commented 5 years ago

This is one of the rules triggering the scroll errors:

alert:
- debug
description: Detects programs with network connections running in suspicious files
  system locations
filter:
- query:
    query_string:
      query: (event_id:"3" AND process_path.keyword:(*\\$Recycle.bin *\\Users\\All\
        Users\\* *\\Users\\Default\\* *\\Users\\Public\\* *\\Users\\Contacts\\* *\\Users\\Searches\\*
        C\:\\Perflogs\\* *\\config\\systemprofile\\* *\\Windows\\Fonts\\* *\\Windows\\IME\\*
        *\\Windows\\addins\\*))
index: logs-endpoint-winevent-sysmon-*
name: Suspicious-Program-Location-with-Network-Connections_0
priority: 2
realert:
  minutes: 0
type: any
Cyb3rWard0g commented 5 years ago

I also opened an issue/topic in Elastic Discuss : https://discuss.elastic.co/t/elasticsearch-7-0-1-trying-to-create-too-many-scroll-contexts-must-be-less-than-or-equal-to-500/181687

Qmando commented 5 years ago

Thanks for posting all this info. I would definitely recommend against wildcard indexes if you're using time based index names.

The default scroll_keepalive should be 30 seconds: https://github.com/Yelp/elastalert/blob/master/elastalert/config.py#L477 so it's odd that changing it to 1m would fix things.

arber-od commented 5 years ago

same error with ELK 7.1.0 and elastalert>=0.2.0b

May 22 12:52:14 elastalert node[7645]: {"name":"elastalert-server","hostname":"elastalert","pid":7645,"level":50,"msg":"ProcessController: WARNING:elasticsearch:GET http://pbxkibana:9200/pbxes-*/_search?ignore_unavailable=true&_source_includes=%40timestamp%2C%2A&scroll=60s&size=10000 [status:500 request:0.032s]\n","time":"2019-05-22T12:52:14.992Z","v":0}

May 22 12:52:14 elastalert node[7645]: {"name":"elastalert-server","hostname":"elastalert","pid":7645,"level":50,"msg":"ProcessController: ERROR:root:Error running query: TransportError(500, u'search_phase_execution_exception', u'Trying to create too many scroll contexts. Must be less than or equal to: [1000]. This limit can be set by changing the [search.max_open_scroll_context] setting.')\n","time":"2019-05-22T12:52:14.993Z","v":0}

May 22 12:53:10 elastalert node[7645]: {"name":"elastalert-server","hostname":"elastalert","pid":7645,"level":50,"msg":"ProcessController: WARNING:elasticsearch:GET http://pbxkibana:9200/pbxes-*/_search?ignore_unavailable=true&_source_includes=%40timestamp%2C%2A&scroll=60s&size=10000 [status:500 request:0.033s]\n","time":"2019-05-22T12:53:10.304Z","v":0}

May 22 12:53:10 elastalert node[7645]: {"name":"elastalert-server","hostname":"elastalert","pid":7645,"level":50,"msg":"ProcessController: ERROR:root:Error running query: TransportError(500, u'search_phase_execution_exception', u'Trying to create too many scroll contexts. Must be less than or equal to: [1000]. This limit can be set by changing the [search.max_open_scroll_context] setting.')\n","time":"2019-05-22T12:53:10.305Z","v":0}

tryed to change my config.py but nothing conf.setdefault('writeback_alias', 'elastalert_alerts') conf.setdefault('max_query_size', 1000) conf.setdefault('scroll_keepalive', '60s') conf.setdefault('disable_rules_on_error', True) conf.setdefault('scan_subdirectories', True) conf.setdefault('rules_loader', 'file')