Yelp / elastalert

Easy & Flexible Alerting With ElasticSearch
https://elastalert.readthedocs.org
Apache License 2.0

[Question] How much memory to run ElastAlert #3176

Closed. Guerout-Arnaud closed this issue 3 years ago.

Guerout-Arnaud commented 3 years ago

Hi, I'm currently working on a server monitoring system and I'm planning to use an alerting system. Unfortunately, I'm limited in terms of memory and disk. Could anyone tell me how much RAM and storage I need for a properly working ElastAlert, without having to worry about it? Thanks in advance.

heiderich commented 3 years ago

I experienced ElastAlert dying several times due to OOM in the past, mostly in situations where it had to deal with a large amount of logs. So depending on the number of rules and the volume of log messages you expect, I would recommend reserving quite a bit of memory for it. I went as far as monitoring ElastAlert with another monitoring system to make sure it was still running.
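If you run ElastAlert in a container, one common way to keep an OOM from taking anything else down is to cap its memory and let the runtime restart it. A minimal Docker Compose sketch; the image name, mount paths, and the 1g cap are illustrative assumptions, not values from this thread:

```yaml
# Hypothetical docker-compose.yml: cap ElastAlert's memory and restart it on OOM.
# Image name, paths, and the 1g limit are assumptions, not values from this thread.
version: "2.4"
services:
  elastalert:
    image: your-registry/elastalert:latest   # assumed image name
    restart: always    # bring it back automatically after an OOM kill
    mem_limit: 1g      # hard cap; the kernel OOM-kills the container beyond this
    volumes:
      - ./config.yaml:/opt/elastalert/config.yaml:ro
      - ./rules:/opt/elastalert/rules:ro
```

This confines a memory blow-up to the ElastAlert container instead of letting it destabilize the host.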

As far as disk space is concerned, are you referring to ElastAlert here or rather to Elasticsearch? In the latter case it mostly depends on the amount of logs you want to keep and the number of replicas in your ES cluster.

Guerout-Arnaud commented 3 years ago

Alright, thanks :D Do you have any idea how much memory you had allocated when you experienced the OOMs? That would really help me get a first idea of how much I need. Thanks in advance.

heiderich commented 3 years ago

I used a few GB (maybe 8?). In my experience the amount of memory you need depends a lot on the volume of log messages over time and on your rules, so it is difficult to give a standard answer without knowing more about your rules and log volume.
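For reference, ElastAlert's memory use is driven largely by how many documents each query pulls back and how large a time window each rule buffers, so the global options in config.yaml are the first place to look. A hedged sketch using real ElastAlert options; the values are illustrative, not recommendations from this thread:

```yaml
# config.yaml (excerpt): global options that bound how much data
# ElastAlert holds in memory per run. Values below are illustrative only.
es_host: elasticsearch   # assumed hostname
es_port: 9200
rules_folder: rules
run_every:
  minutes: 5             # how often rules are run against Elasticsearch
buffer_time:
  minutes: 15            # query window each rule keeps in memory
max_query_size: 10000    # max documents fetched per query (the default)
writeback_index: elastalert_status
```

Shrinking buffer_time and max_query_size trades completeness on bursty indices for a smaller, more predictable footprint.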

Guerout-Arnaud commented 3 years ago

Thanks. Unfortunately that could be an issue then. I'm planning to have at least 17 servers under Metricbeat control with CPU, memory, disk, and network monitoring, plus Filebeat on half of them (I don't know yet what kind of monitoring my team needs for that part). And this is just the minimum for the most important servers.

Guerout-Arnaud commented 3 years ago

@heiderich Do you remember how many logs you were processing and what type of rules you had implemented? It would give me (and other users) a reference point for how much memory is needed to process X logs.

nsano-rururu commented 3 years ago

A past issue that may help: Reduce memory usage without restart elastalert? #2931

nsano-rururu commented 3 years ago

Some Japanese articles that may help: "Introduction to a Kubernetes monitoring system using the Elastic Stack"; "freee, in the middle of migrating to Kubernetes, talks about security and monitoring"; "The Kubernetes monitoring platform at freee".

Guerout-Arnaud commented 3 years ago

Thanks for trying to help, but unfortunately I do not speak Japanese. Could you please summarize these documents?

nsano-rururu commented 3 years ago

So what? All you have to do is paste the text into a translation site.

nsano-rururu commented 3 years ago

Disadvantages of ElastAlert

- Currently not maintained at all
- Many bugs
- Alerts to LineNotify, Zabbix, Stomp, and Pagertree do not work
- In many cases you will not get an answer even if you open an issue
- No response even to pull requests
- Disabling a rule does not actually deactivate it until ElastAlert is restarted
- Deleting a rule may leave it running until ElastAlert is restarted
- The documentation has not been updated
- It does not work on Python 3.9
- There is no guarantee of operation after Python 3.6; it should work with 3.7 and 3.8

nsano-rururu commented 3 years ago

One master node for k8s. I experimented with six worker nodes, collecting Metricbeat data, shipping application logs and the like with Filebeat and Fluentd, and doing uptime monitoring with Heartbeat, all into Elasticsearch.

At first a t3.medium (2 CPUs, 4 GB memory) on AWS EC2 was enough, but two months later I needed a t3.xlarge (4 CPUs, 16 GB memory). The disk was about 20 GB at first and about 60 GB at the end. Metricbeat was collecting every 30 minutes, and its data is what filled up the disk. ElastAlert stopped working properly when the amount of data in Elasticsearch increased.
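For context, the Metricbeat collection interval is set per module. A minimal sketch of the system module covering the metrics discussed above; the 30m period mirrors the setup described here, while the metricsets and output host are illustrative assumptions:

```yaml
# metricbeat.yml (excerpt): system metrics matching the setup described above.
# The 30m period follows this comment; metricsets and output are illustrative.
metricbeat.modules:
  - module: system
    metricsets: ["cpu", "memory", "filesystem", "network"]
    period: 30m    # longer periods mean fewer documents on disk

output.elasticsearch:
  hosts: ["http://elasticsearch:9200"]   # assumed host
```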

In Kibana I set up Index Lifecycle Management (ILM) so that unnecessary data was deleted instead of accumulating, and after that it ran comfortably. It's just an experimental result, so it may not be very helpful.
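For anyone wanting to reproduce that, an ILM policy can also be created directly through the Elasticsearch API (e.g. from Kibana Dev Tools) instead of the Kibana UI. A minimal sketch; the policy name, rollover thresholds, and the 7-day retention are illustrative assumptions, not the values used above:

```
PUT _ilm/policy/metricbeat-cleanup
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_age": "1d", "max_size": "5gb" }
        }
      },
      "delete": {
        "min_age": "7d",
        "actions": { "delete": {} }
      }
    }
  }
}
```

Attached to the Metricbeat index template, this rolls indices over daily (or at 5 GB) and deletes them a week after rollover, which keeps both disk usage and the data volume ElastAlert has to query in check.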

Guerout-Arnaud commented 3 years ago

The example configuration feels like enough for me to form an idea. Thanks everyone.