Yelp / elastalert

Easy & Flexible Alerting With ElasticSearch
https://elastalert.readthedocs.org
Apache License 2.0

[Jhipster-Alerter] Huge memory consumption and process killed #2481

Open trixprod opened 5 years ago

trixprod commented 5 years ago

Hello, I originally posted this issue on the jhipster-alerter GitHub, but I think it is more ElastAlert related.

ElastAlert version: 0.1.36 (I can't use a newer version, as I use the latest jhipster-alerter image).

I recently observed some strange behaviour from jhipster-alerter: it takes all my memory!

MEM %               CPU %               MEM USAGE      NAME
22.34%              99.80%              14.05GiB   jhipster-alerter

If I watch the log, the process is killed once memory is full and is then relaunched. It can run for several hours without any issue and then suddenly consume all my memory.

INFO:elastalert:Queried rule log_error_email from 2019-09-18 14:08 UTC to 2019-09-18 14:13 UTC: 21 / 21 hits
INFO:elastalert:Queried rule log_error_email from 2019-09-18 14:13 UTC to 2019-09-18 14:18 UTC: 1 / 1 hits
INFO:elastalert:Queried rule log_error_email from 2019-09-18 14:18 UTC to 2019-09-18 14:23 UTC: 106 / 106 hits
INFO:elastalert:Queried rule log_error_email from 2019-09-18 14:23 UTC to 2019-09-18 14:28 UTC: 1 / 1 hits
INFO:elastalert:Queried rule log_error_email from 2019-09-18 14:28 UTC to 2019-09-18 14:33 UTC: 0 / 0 hits
INFO:elastalert:Queried rule log_error_email from 2019-09-18 14:33 UTC to 2019-09-18 14:38 UTC: 0 / 0 hits
INFO:elastalert:Queried rule log_error_email from 2019-09-18 14:38 UTC to 2019-09-18 14:43 UTC: 0 / 0 hits
INFO:elastalert:Queried rule log_error_email from 2019-09-18 14:43 UTC to 2019-09-18 14:48 UTC: 0 / 0 hits
INFO:elastalert:Queried rule log_error_email from 2019-09-18 14:48 UTC to 2019-09-18 14:53 UTC: 0 / 0 hits
INFO:elastalert:Queried rule log_error_email from 2019-09-18 14:53 UTC to 2019-09-18 14:58 UTC: 0 / 0 hits
INFO:elastalert:Queried rule log_error_email from 2019-09-18 14:58 UTC to 2019-09-18 15:03 UTC: 2 / 2 hits
INFO:elastalert:Queried rule log_error_email from 2019-09-18 15:03 UTC to 2019-09-18 15:08 UTC: 34 / 34 hits
INFO:elastalert:Queried rule log_error_email from 2019-09-18 15:08 UTC to 2019-09-18 15:13 UTC: 0 / 0 hits
INFO:elastalert:Queried rule log_error_email from 2019-09-18 15:13 UTC to 2019-09-18 15:18 UTC: 0 / 0 hits
INFO:elastalert:Queried rule log_error_email from 2019-09-18 15:18 UTC to 2019-09-18 15:23 UTC: 2 / 2 hits
INFO:elastalert:Queried rule log_error_email from 2019-09-18 15:23 UTC to 2019-09-18 15:28 UTC: 0 / 0 hits
INFO:elastalert:Queried rule log_error_email from 2019-09-18 15:28 UTC to 2019-09-18 15:33 UTC: 0 / 0 hits
INFO:elastalert:Queried rule log_error_email from 2019-09-18 15:33 UTC to 2019-09-18 15:38 UTC: 0 / 0 hits
INFO:elastalert:Queried rule log_error_email from 2019-09-18 15:38 UTC to 2019-09-18 15:43 UTC: 0 / 0 hits
INFO:elastalert:Queried rule log_error_email from 2019-09-18 15:43 UTC to 2019-09-18 15:48 UTC: 128 / 128 hits
INFO:elastalert:Queried rule log_error_email from 2019-09-18 15:48 UTC to 2019-09-18 15:49 UTC: 0 / 0 hits
**/opt/start-elastalert.sh: line 36:    13 Killed python -m elastalert.elastalert --verbose**
Waiting for Elasticsearch to startup (max 5min)
{"cluster_name":"docker-cluster","status":"yellow","timed_out":false,"number_of_nodes":1,"number_of_data_nodes":1,"active_primary_shards":86,"active_shards":86,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":85,"delayed_unassigned_shards":0,"number_of_pending_tasks":0,"number_of_in_flight_fetch":0,"task_max_waiting_in_queue_millis":0,"active_shards_percent_as_number":50.29239766081871}
Starting Alerting
Container timezone not modified
Elastalert index already exists in ES.
INFO:elastalert:Starting up
INFO:elastalert:Queried rule log_tgtg_email from 2019-09-18 15:49 UTC to 2019-09-18 15:54 UTC: 0 / 0 hits
INFO:elastalert:Queried rule log_tgtg_email from 2019-09-18 15:54 UTC to 2019-09-18 15:59 UTC: 0 / 0 hits
INFO:elastalert:Queried rule log_tgtg_email from 2019-09-18 15:59 UTC to 2019-09-18 16:04 UTC: 0 / 0 hits
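
The "Killed" line above is most likely the kernel OOM killer terminating the process, after which the start script relaunches it. As a stopgap, the container's memory could be capped so it is killed at a predictable ceiling instead of draining the host. Below is a minimal docker-compose sketch; the service name, image tag and the 2g limit are all assumptions to adapt to the real jhipster-alerter setup.

# Minimal docker-compose (v2.x) sketch -- service name, image and limit are assumptions.
version: "2.4"
services:
  jhipster-alerter:
    image: jhipster/jhipster-alerter:latest
    mem_limit: 2g              # container is OOM-killed at 2 GiB instead of exhausting the host
    restart: unless-stopped    # relaunch after the kill, as the start script already does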

Here is my log_error_email rule

# Simple alert that is triggered by ERROR logs
es_host: jhipster-elasticsearch
es_port: 9200
name: log_error_email
type: frequency
index: logs-*
# link to a kibana dashboard with correct time settings
use_kibana4_dashboard: "http://localhost:5601/app/kibana#/dashboard/d712f650-e0eb-11e7-9c68-0b9a0f0c183c"
num_events: 1
timeframe:
    minutes: 1
filter:
- query:
    query_string:
      query: "level:ERROR"
alert:
- email
alert_subject: "{0} : {1} {2}"
alert_subject_args:
- app_name
- message
- "@timestamp"
email: "xxxx@xxx@gmail.com"
smtp_host: "xxxx.amazonaws.com"
smtp_port: 587
from_addr: "xxxx@xxx.com"
smtp_auth_file: "/opt/elastalert/smtp/smtp_auth_file.yaml"
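
As a sanity check on how many documents this rule pulls per run, the elastalert-test-rule script that ships with ElastAlert can dry-run it against recent data without sending alerts; the paths below are assumptions based on the container layout shown in this post.

# Count matching documents over the last day without triggering any alert.
# Paths are assumptions; adjust to where the rule and config live in the container.
elastalert-test-rule --config /opt/elastalert/config.yaml \
    --days 1 --count-only \
    /opt/elastalert/rules/log_error_email.yaml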

And my config file

# This is the folder that contains the rule yaml files
# Any .yaml file will be loaded as a rule
rules_folder: /opt/elastalert/rules

# How often ElastAlert will query Elasticsearch
# The unit can be anything from weeks to seconds
run_every:
  minutes: 1

# ElastAlert will buffer results from the most recent
# period of time, in case some log sources are not in real time
buffer_time:
  minutes: 5

# The index on es_host which is used for metadata storage
# This can be an unmapped index, but it is recommended that you run
# elastalert-create-index to set a mapping
writeback_index: alerts

# If an alert fails for some reason, ElastAlert will retry
# sending the alert until this time period has elapsed
alert_time_limit:
  days: 1
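
For completeness: one global option that bounds how much a single query can buffer is max_query_size, the maximum number of documents ElastAlert downloads per query (the documented default is 10000). Whether lowering it makes any difference to this leak is an assumption, not a confirmed fix.

# Cap the number of documents fetched per query (default is 10000).
# Lowering it here is only a guess at limiting per-run memory.
max_query_size: 5000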

I have no idea where the issue is :/

caleb15 commented 5 years ago

We are also having an issue with ElastAlert taking up a lot of memory. Environment: elastalert==0.1.39, Ubuntu 18, Elasticsearch 6.8.
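
For anyone trying to narrow down when the growth starts, here is a minimal shell sketch that logs the resident memory of the ElastAlert process once a minute; the pgrep pattern and the log path are assumptions.

#!/bin/sh
# Append the ElastAlert process RSS (in kB) to a log file every 60 seconds.
# The pgrep pattern and the output path are assumptions; adjust to the actual setup.
while true; do
    pid=$(pgrep -f 'elastalert.elastalert' | head -n 1)
    if [ -n "$pid" ]; then
        echo "$(date -u '+%Y-%m-%dT%H:%M:%SZ') pid=$pid rss_kb=$(ps -o rss= -p "$pid")" >> /var/log/elastalert-rss.log
    fi
    sleep 60
done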