wazuh / wazuh

Wazuh - The Open Source Security Platform. Unified XDR and SIEM protection for endpoints and cloud workloads.
https://wazuh.com/

Wazuh API non-responsive #5872

Closed akshatgit closed 3 years ago

akshatgit commented 3 years ago
| Wazuh version | Component | Install type | Install method | Platform |
|---|---|---|---|---|
| v3.13.1 | Wazuh API | Manager | Puppet | Debian 10 |

The API becomes unresponsive roughly every 6 hours. Log from the API:

WazuhAPI 2020-09-01 09:37:16 xxx: [::ffff:172.20.10.92] GET /manager/info? - 200 - error: '1017'.

Error from Kibana App:

Error

3099 - ERROR3099 - Some Wazuh daemons are not ready in node 'node01' (wazuh-modulesd->stopped)

The only way to fix this is by restarting the wazuh-manager.

JcabreraC commented 3 years ago

Hello @akshatgit

It seems that the problem is that wazuh-modulesd is stopping. Could you find out which error is showing up in the ossec.log file?

To do this, run the following on the manager:

grep ERROR /var/ossec/logs/ossec.log
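
When the grep output is long, grouping the errors by daemon can make it easier to see whether any of them come from wazuh-modulesd. The following is a minimal sketch, not part of Wazuh: the function name `errors_by_daemon` is made up for illustration, and the regular expression assumes the plain log format seen in the maild errors quoted later in this thread.

```python
import re
from collections import Counter

# Assumed line format (taken from the maild errors quoted in this thread):
# 2020/09/01 12:28:16 ossec-maild: ERROR: (1762): Banner not received from server
ERROR_LINE = re.compile(
    r"^\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} "
    r"(?P<daemon>[^:\[\s]+)\S*: ERROR: (?P<message>.*)$"
)


def errors_by_daemon(lines):
    """Count ERROR lines per daemon name (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = ERROR_LINE.match(line.strip())
        if match:
            counts[match.group("daemon")] += 1
    return counts


# Two lines copied from this thread, used as sample input.
SAMPLE_LOG = """\
2020/09/01 12:28:16 ossec-maild: ERROR: (1762): Banner not received from server
2020/09/01 12:28:16 ossec-maild: ERROR: (1223): Error Sending email to 127.0.0.1 (smtp server)
"""

print(errors_by_daemon(SAMPLE_LOG.splitlines()))
```

To run it against the real file, pass `open("/var/ossec/logs/ossec.log")` instead of the sample lines.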
akshatgit commented 3 years ago

There are three types of errors:

I'll look into the maild-related errors. Are any of these critical?

JcabreraC commented 3 years ago

Hello,

The errors shown are not related to modulesd. Could you paste the manager's ossec.conf file to see the enabled modules?

akshatgit commented 3 years ago
<ossec_config>
  <global>
    <jsonout_output>yes</jsonout_output>
    <alerts_log>yes</alerts_log>
    <logall>no</logall>
    <logall_json>no</logall_json>
    <email_notification>yes</email_notification>
    <smtp_server>localhost</smtp_server>
    <email_from>xxx-manager-xxx@xxx.net</email_from>
    <email_to>xxx-sec@xxx.net</email_to>
    <email_maxperhour>20</email_maxperhour>
    <email_log_source>alerts.log</email_log_source>
  </global>
  <alerts>
    <log_alert_level>1</log_alert_level>
    <email_alert_level>1</email_alert_level>
  </alerts>
  <!-- Choose between "plain", "json", or "plain,json" for the format of internal logs -->
  <logging>
    <log_format>plain</log_format>
  </logging>
  <remote>
    <connection>secure</connection>
    <port>1514</port>
    <protocol>udp</protocol>
    <queue_size>131072</queue_size>
  </remote>
  <!-- Policy monitoring -->
  <rootcheck>
    <disabled>no</disabled>
    <check_files>yes</check_files>
    <check_trojans>yes</check_trojans>
    <check_dev>yes</check_dev>
    <check_sys>yes</check_sys>
    <check_pids>yes</check_pids>
    <check_ports>yes</check_ports>
    <check_if>yes</check_if>
    <!-- Frequency that rootcheck is executed - every 12 hours -->
    <frequency>43200</frequency>
    <rootkit_files>/var/ossec/etc/rootcheck/rootkit_files.txt</rootkit_files>
    <rootkit_trojans>/var/ossec/etc/rootcheck/rootkit_trojans.txt</rootkit_trojans>
    <skip_nfs>yes</skip_nfs>
  </rootcheck>
  <wodle name="open-scap">
    <disabled>yes</disabled>
    <timeout>1800</timeout>
    <interval>1d</interval>
    <scan-on-start>yes</scan-on-start>
  </wodle>
  <wodle name="cis-cat">
    <disabled>yes</disabled>
    <timeout>1800</timeout>
    <interval>1d</interval>
    <scan-on-start>yes</scan-on-start>
    <java_path>wodles/java</java_path>
    <ciscat_path>wodles/ciscat</ciscat_path>
  </wodle>
  <!-- Osquery integration -->
  <wodle name="osquery">
    <disabled>yes</disabled>
    <run_daemon>yes</run_daemon>
    <log_path>/var/log/osquery/osqueryd.results.log</log_path>
    <config_path>/etc/osquery/osquery.conf</config_path>
    <add_labels>yes</add_labels>
  </wodle>
  <!-- System inventory -->
  <wodle name="syscollector">
    <disabled>no</disabled>
    <interval>1h</interval>
    <scan_on_start>yes</scan_on_start>
    <hardware>yes</hardware>
    <os>yes</os>
    <network>yes</network>
    <packages>yes</packages>
    <ports all="no">yes</ports>
    <processes>yes</processes>
  </wodle>
  <sca>
    <enabled>yes</enabled>
    <scan_on_start>yes</scan_on_start>
    <interval>12h</interval>
    <skip_nfs>yes</skip_nfs>
  </sca>
  <vulnerability-detector>
    <enabled>yes</enabled>
    <interval>5m</interval>
    <run_on_start>yes</run_on_start>
    <provider name="canonical">
      <enabled>no</enabled>
      <os>precise</os>
      <os>trusty</os>
      <os>xenial</os>
      <os>bionic</os>
      <update_interval>1h</update_interval>
    </provider>
    <provider name="debian">
      <enabled>yes</enabled>
      <os>wheezy</os>
      <os>stretch</os>
      <os>jessie</os>
      <os>buster</os>
      <update_interval>1h</update_interval>
    </provider>
    <provider name="redhat">
      <enabled>no</enabled>
      <update_from_year>2010</update_from_year>
      <update_interval>1h</update_interval>
    </provider>
    <provider name="nvd">
      <enabled>yes</enabled>
      <update_from_year>2010</update_from_year>
      <update_interval>1h</update_interval>
    </provider>
  </vulnerability-detector>
  <!-- File integrity monitoring -->
  <syscheck>
    <disabled>no</disabled>
    <!-- Frequency that syscheck is executed default every 12 hours -->
    <frequency>43200</frequency>
    <scan_on_start>yes</scan_on_start>
    <!-- Generate alert when new file detected -->
    <alert_new_files>yes</alert_new_files>
    <!-- Don't ignore files that change more than 'frequency' times -->
    <auto_ignore frequency="10" timeframe="3600">no</auto_ignore>
    <!-- Directories to check  (perform all possible verifications) -->
    <directories check_all="yes">/etc,/usr/bin,/usr/sbin</directories>
    <directories check_all="yes">/bin,/sbin,/boot</directories>
    <!-- Files/directories to ignore -->
    <ignore>/etc/mtab</ignore>
    <ignore>/etc/hosts.deny</ignore>
    <ignore>/etc/mail/statistics</ignore>
    <ignore>/etc/random-seed</ignore>
    <ignore>/etc/random.seed</ignore>
    <ignore>/etc/adjtime</ignore>
    <ignore>/etc/httpd/logs</ignore>
    <ignore>/etc/utmpx</ignore>
    <ignore>/etc/wtmpx</ignore>
    <ignore>/etc/cups/certs</ignore>
    <ignore>/etc/dumpdates</ignore>
    <ignore>/etc/svc/volatile</ignore>
    <ignore>/sys/kernel/security</ignore>
    <ignore>/sys/kernel/debug</ignore>
    <ignore>/dev/core</ignore>
    <!-- File types to ignore -->
    <ignore type="sregex">^/proc</ignore>
    <ignore type="sregex">.log$|.swp$</ignore>
    <!-- Check the file, but never compute the diff -->
    <nodiff>/etc/ssl/private.key</nodiff>
    <skip_nfs>yes</skip_nfs>
  </syscheck>
  <!-- Active response -->
  <global>
    <white_list>127.0.0.1</white_list>
    <white_list>^localhost.localdomain$</white_list>
    <white_list>x.y.z.16</white_list>
    <white_list>x.y.z.30</white_list>
    <white_list>nil</white_list>
  </global>
  <command>
    <name>disable-account</name>
    <executable>disable-account.sh</executable>
    <expect>user</expect>
    <timeout_allowed>yes</timeout_allowed>
  </command>
  <command>
    <name>restart-ossec</name>
    <executable>restart-ossec.sh</executable>
    <expect/>
  </command>
  <command>
    <name>firewall-drop</name>
    <executable>firewall-drop.sh</executable>
    <expect>srcip</expect>
    <timeout_allowed>yes</timeout_allowed>
  </command>
  <command>
    <name>host-deny</name>
    <executable>host-deny.sh</executable>
    <expect>srcip</expect>
    <timeout_allowed>yes</timeout_allowed>
  </command>
  <command>
    <name>route-null</name>
    <executable>route-null.sh</executable>
    <expect>srcip</expect>
    <timeout_allowed>yes</timeout_allowed>
  </command>
  <command>
    <name>win_route-null</name>
    <executable>route-null.cmd</executable>
    <expect>srcip</expect>
    <timeout_allowed>yes</timeout_allowed>
  </command>
  <command>
    <name>win_route-null-2012</name>
    <executable>route-null-2012.cmd</executable>
    <expect>srcip</expect>
    <timeout_allowed>yes</timeout_allowed>
  </command>
  <command>
    <name>netsh</name>
    <executable>netsh.cmd</executable>
    <expect>srcip</expect>
    <timeout_allowed>yes</timeout_allowed>
  </command>
  <command>
    <name>netsh-win-2016</name>
    <executable>netsh-win-2016.cmd</executable>
    <expect>srcip</expect>
    <timeout_allowed>yes</timeout_allowed>
  </command>
  <!--
      <active-response>
            active-response options here
      </active-response>
      -->
  <!-- Log analysis -->
  <localfile>
    <log_format>command</log_format>
    <command>df -P</command>
    <frequency>360</frequency>
  </localfile>
  <localfile>
    <log_format>full_command</log_format>
    <command>netstat -tulpn | sed 's/\([[:alnum:]]\+\)\ \+[[:digit:]]\+\ \+[[:digit:]]\+\ \+\(.*\):\([[:digit:]]*\)\ \+\([0-9\.\:\*]\+\).\+\ \([[:digit:]]*\/[[:alnum:]\-]*\).*/\1 \2 == \3 == \4 \5/' | sort -k 4 -g | sed 's/ == \(.*\) ==/:\1/' | sed 1,2d</command>
    <alias>netstat listening ports</alias>
    <frequency>360</frequency>
  </localfile>
  <localfile>
    <log_format>full_command</log_format>
    <command>last -n 20</command>
    <frequency>360</frequency>
  </localfile>
  <ruleset>
    <!-- Default ruleset -->
    <decoder_dir>ruleset/decoders</decoder_dir>
    <rule_dir>ruleset/rules</rule_dir>
    <rule_exclude>0215-policy_rules.xml</rule_exclude>
    <list>etc/lists/audit-keys</list>
    <list>etc/lists/amazon/aws-eventnames</list>
    <list>etc/lists/security-eventchannel</list>
    <!-- User-defined ruleset -->
    <decoder_dir>etc/decoders</decoder_dir>
    <rule_dir>etc/rules</rule_dir>
  </ruleset>
  <!-- Configuration for ossec-authd -->
  <auth>
    <disabled>no</disabled>
    <port>1515</port>
    <use_source_ip>yes</use_source_ip>
    <force_insert>yes</force_insert>
    <force_time>0</force_time>
    <purge>yes</purge>
    <use_password>no</use_password>
    <limit_maxagents>yes</limit_maxagents>
    <ciphers>XXXXX</ciphers>
    <!-- <ssl_agent_ca></ssl_agent_ca>
         -->
    <ssl_verify_host>no</ssl_verify_host>
    <ssl_manager_cert>/var/ossec/etc/sslmanager.cert</ssl_manager_cert>
    <ssl_manager_key>/var/ossec/etc/sslmanager.key</ssl_manager_key>
    <ssl_auto_negotiate>no</ssl_auto_negotiate>
  </auth>
  <cluster>
    <name>wazuh</name>
    <node_name>node01</node_name>
    <node_type>master</node_type>
    <key/>
    <port>1516</port>
    <bind_addr>0.0.0.0</bind_addr>
    <nodes>
      <node>NODE_IP</node>
    </nodes>
    <hidden>no</hidden>
    <disabled>yes</disabled>
  </cluster>
</ossec_config>
<ossec_config>
  <localfile>
    <log_format>syslog</log_format>
    <location>/var/ossec/logs/active-responses.log</location>
  </localfile>
  <localfile>
    <log_format>syslog</log_format>
    <location>/var/log/messages</location>
  </localfile>
  <localfile>
    <log_format>syslog</log_format>
    <location>/var/log/auth.log</location>
  </localfile>
  <localfile>
    <log_format>syslog</log_format>
    <location>/var/log/syslog</location>
  </localfile>
  <localfile>
    <log_format>syslog</log_format>
    <location>/var/log/dpkg.log</location>
  </localfile>
  <localfile>
    <log_format>syslog</log_format>
    <location>/var/log/kern.log</location>
  </localfile>
</ossec_config>
<!-- Configuration for Reports -->
<ossec_config>
  <reports>
    <category>syscheck</category>
    <title>Daily Report: File changes</title>
    <email_to>xxx-sec@xxx.net</email_to>
  </reports>
  <reports>
    <level>1</level>
    <title>Daily Report: Alerts with level higher than 1</title>
    <email_to>xxx-sec@xxx.net</email_to>
  </reports>
  <reports>
    <level>2</level>
    <title>Daily Report: Alerts with level higher than 2</title>
    <email_to>xxx-sec@xxx.net</email_to>
  </reports>
  <reports>
    <level>3</level>
    <title>Daily Report: Alerts with level higher than 3</title>
    <email_to>xxx-sec@xxx.net</email_to>
  </reports>
  <reports>
    <level>4</level>
    <title>Daily Report: Alerts with level higher than 4</title>
    <email_to>xxx-sec@xxx.net</email_to>
  </reports>
  <reports>
    <level>5</level>
    <title>Daily Report: Alerts with level higher than 5</title>
    <email_to>xxx-sec@xxx.net</email_to>
  </reports>
  <reports>
    <level>6</level>
    <title>Daily Report: Alerts with level higher than 6</title>
    <email_to>xxx-sec@xxx.net</email_to>
  </reports>
  <reports>
    <level>7</level>
    <title>Daily Report: Alerts with level higher than 7</title>
    <email_to>xxx-sec@xxx.net</email_to>
  </reports>
  <reports>
    <level>8</level>
    <title>Daily Report: Alerts with level higher than 8</title>
    <email_to>xxx-sec@xxx.net</email_to>
  </reports>
  <reports>
    <level>9</level>
    <title>Daily Report: Alerts with level higher than 9</title>
    <email_to>xxx-sec@xxx.net</email_to>
  </reports>
  <reports>
    <level>10</level>
    <title>Daily Report: Alerts with level higher than 10</title>
    <email_to>xxx-sec@xxx.net</email_to>
  </reports>
  <reports>
    <level>11</level>
    <title>Daily Report: Alerts with level higher than 11</title>
    <email_to>xxx-sec@xxx.net</email_to>
  </reports>
  <reports>
    <level>12</level>
    <title>Daily Report: Alerts with level higher than 12</title>
    <email_to>xxx-sec@xxx.net</email_to>
  </reports>
  <reports>
    <level>13</level>
    <title>Daily Report: Alerts with level higher than 13</title>
    <email_to>xxx-sec@xxx.net</email_to>
  </reports>
  <reports>
    <level>14</level>
    <title>Daily Report: Alerts with level higher than 14</title>
    <email_to>xxx-sec@xxx.net</email_to>
  </reports>
  <reports>
    <level>15</level>
    <title>Daily Report: Alerts with level higher than 15</title>
    <email_to>xxx-sec@xxx.net</email_to>
  </reports>
  <email_alerts>
    <email_to>xxx-sec@xxx.net</email_to>
    <rule_id>100001</rule_id>
    <do_not_delay/>
  </email_alerts>
  <email_alerts>
    <email_to>xxx-sec@xxx.net</email_to>
    <rule_id>100002</rule_id>
    <do_not_delay/>
  </email_alerts>
  <email_alerts>
    <email_to>xxx-sec@xxx.net</email_to>
    <rule_id>100003</rule_id>
    <do_not_delay/>
  </email_alerts>
  <email_alerts>
    <email_to>xxx-sec@xxx.net</email_to>
    <rule_id>100004</rule_id>
    <do_not_delay/>
  </email_alerts>
  <email_alerts>
    <email_to>xxx-sec@xxx.net</email_to>
    <rule_id>100005</rule_id>
    <do_not_delay/>
  </email_alerts>
  <email_alerts>
    <email_to>xxx-sec@xxx.net</email_to>
    <rule_id>100006</rule_id>
    <do_not_delay/>
  </email_alerts>
  <email_alerts>
    <email_to>xxx-sec@xxx.net</email_to>
    <rule_id>100007</rule_id>
    <do_not_delay/>
  </email_alerts>

  <command>
    <name>wazuh_alert_fim_php_js</name>
    <executable>wazuh_alert_fim_php_js_wrapper.sh</executable>
    <expect>filename</expect>
    <timeout_allowed>no</timeout_allowed>
  </command>
  <command>
    <name>wazuh_alert_hba</name>
    <executable>wazuh_hba_cm.py</executable>
    <timeout_allowed>no</timeout_allowed>
  </command>
  <command>
    <name>wazuh_alert_vuln</name>
    <executable>wazuh_vuln_cm.py</executable>
    <timeout_allowed>no</timeout_allowed>
  </command>
  <active-response>
    <command>wazuh_alert_fim_php_js</command>
    <location>local</location>
    <rules_id>100002</rules_id>
  </active-response>
  <active-response>
    <command>wazuh_alert_fim_php_js</command>
    <location>local</location>
    <rules_id>100003</rules_id>
  </active-response>
  <active-response>
    <command>wazuh_alert_fim_php_js</command>
    <location>local</location>
    <rules_id>100004</rules_id>
  </active-response>
  <active-response>
    <command>wazuh_alert_fim_php_js</command>
    <location>local</location>
    <rules_id>100005</rules_id>
  </active-response>
  <active-response>
    <command>wazuh_alert_fim_php_js</command>
    <location>local</location>
    <rules_id>100006</rules_id>
  </active-response>
  <active-response>
    <command>wazuh_alert_fim_php_js</command>
    <location>local</location>
    <rules_id>100007</rules_id>
  </active-response>
  <active-response>
    <command>wazuh_alert_hba</command>
    <location>local</location>
    <rules_id>510</rules_id>
  </active-response>
  <active-response>
    <command>wazuh_alert_vuln</command>
    <location>local</location>
    <rules_id>23504</rules_id>
  </active-response>

  <integration>
    <name>custom-integration</name>
    <hook_url>https://api.flock.com/hooks/sendMessage/xxx</hook_url>
    <level>1</level>
    <alert_format>json</alert_format>
    <api_key>xxx</api_key>
  </integration>
  <integration>
    <name>ci_test</name>
    <rule_id>40104</rule_id>
    <alert_format>json</alert_format>
  </integration>

</ossec_config>
akshatgit commented 3 years ago

One more issue:

I have set up SMTP as described in the docs, but I'm getting the following errors:

2020/09/01 12:28:16 ossec-maild: ERROR: (1762): Banner not received from server
2020/09/01 12:28:16 ossec-maild: ERROR: (1223): Error Sending email to 127.0.0.1 (smtp server)

Sending a test mail with this configuration works:

echo "Test mail from postfix" | mail -s "Test Postfix" -r "you@example.com" you@example.com
JcabreraC commented 3 years ago

Hello,

I can't reproduce your problem. It would be helpful if you could provide more information, like system logs.

To do this, copy the content of the /var/log/syslog file and run the dmesg command after the error occurs.

akshatgit commented 3 years ago

OOM killed. 😅 I should have checked dmesg at the beginning. :(

[Tue Sep  1 08:40:34 2020] Out of memory: Kill process 24688 (ossec-monitord) score 324 or sacrifice child
[Tue Sep  1 08:40:34 2020] Killed process 24688 (ossec-monitord) total-vm:10725072kB, anon-rss:10664448kB, file-rss:1564kB, shmem-rss:0kB
[Tue Sep  1 08:40:35 2020] oom_reaper: reaped process 24688 (ossec-monitord), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
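
The kernel lines above show that ossec-monitord held roughly 10 GiB of anonymous memory when it was killed. A minimal sketch for pulling those figures out of dmesg output follows. It assumes the "Killed process ..." wording shown above, which can differ between kernel versions, and the helper name `parse_oom_kills` is made up for illustration.

```python
import re

# Assumes the kernel message wording quoted in this thread.
KILLED = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\) "
    r"total-vm:(?P<total_vm>\d+)kB, anon-rss:(?P<anon_rss>\d+)kB"
)


def parse_oom_kills(lines):
    """Return one dict per 'Killed process' line (hypothetical helper)."""
    kills = []
    for line in lines:
        match = KILLED.search(line)
        if match:
            kills.append({
                "pid": int(match.group("pid")),
                "name": match.group("name"),
                "total_vm_kb": int(match.group("total_vm")),
                "anon_rss_kb": int(match.group("anon_rss")),
            })
    return kills


# The three dmesg lines quoted above, used as sample input.
SAMPLE_DMESG = """\
[Tue Sep  1 08:40:34 2020] Out of memory: Kill process 24688 (ossec-monitord) score 324 or sacrifice child
[Tue Sep  1 08:40:34 2020] Killed process 24688 (ossec-monitord) total-vm:10725072kB, anon-rss:10664448kB, file-rss:1564kB, shmem-rss:0kB
[Tue Sep  1 08:40:35 2020] oom_reaper: reaped process 24688 (ossec-monitord), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
"""

for kill in parse_oom_kills(SAMPLE_DMESG.splitlines()):
    gib = kill["anon_rss_kb"] / (1024 * 1024)  # kB -> GiB
    print(f'{kill["name"]} (pid {kill["pid"]}): {gib:.2f} GiB anonymous RSS')
```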
akshatgit commented 3 years ago

I would like to monitor Wazuh's application metrics. Is there a Prometheus exporter available?

akshatgit commented 3 years ago

The manager VM has 32 GB of RAM and 8 cores. IMO this is quite overprovisioned, given that we currently monitor only 25 agents and the ELK stack runs on separate boxes. Am I missing some memory configuration?

JcabreraC commented 3 years ago

Okay, let's analyze what's happening.

akshatgit commented 3 years ago
wazuh_modules.debug=2
monitord.debug=2

Will update once the log is generated.

akshatgit commented 3 years ago

Logs related to the OOM reaper:

2020/09/07 00:35:13 wazuh-modulesd:syscollector[31343] syscollector_linux.c:1513 at sys_proc_linux(): DEBUG: sys_proc_linux() sending '{"type":"process","ID":838054248,"timestamp":"2020/09/07 00:35:13","process":{"pid":53,"name":"oom_reaper","state":"S","ppid":2,"utime":0,"stime":1815,"euser":"root","ruser":"root","suser":"root","egroup":"root","rgroup":"root","sgroup":"root","fgroup":"root","priority":20,"nice":0,"size":0,"vm_size":0,"resident":0,"share":0,"start_time":116,"pgrp":0,"session":0,"nlwp":1,"tgid":53,"tty":0,"processor":4}}'
2020/09/07 01:35:13 wazuh-modulesd:syscollector[31343] syscollector_linux.c:1513 at sys_proc_linux(): DEBUG: sys_proc_linux() sending '{"type":"process","ID":1014821396,"timestamp":"2020/09/07 01:35:12","process":{"pid":53,"name":"oom_reaper","state":"S","ppid":2,"utime":0,"stime":1901,"euser":"root","ruser":"root","suser":"root","egroup":"root","rgroup":"root","sgroup":"root","fgroup":"root","priority":20,"nice":0,"size":0,"vm_size":0,"resident":0,"share":0,"start_time":116,"pgrp":0,"session":0,"nlwp":1,"tgid":53,"tty":0,"processor":7}}'
2020/09/07 02:35:14 wazuh-modulesd:syscollector[31343] syscollector_linux.c:1513 at sys_proc_linux(): DEBUG: sys_proc_linux() sending '{"type":"process","ID":452363453,"timestamp":"2020/09/07 02:35:13","process":{"pid":53,"name":"oom_reaper","state":"S","ppid":2,"utime":0,"stime":1901,"euser":"root","ruser":"root","suser":"root","egroup":"root","rgroup":"root","sgroup":"root","fgroup":"root","priority":20,"nice":0,"size":0,"vm_size":0,"resident":0,"share":0,"start_time":116,"pgrp":0,"session":0,"nlwp":1,"tgid":53,"tty":0,"processor":7}}'
2020/09/07 03:35:13 wazuh-modulesd:syscollector[31343] syscollector_linux.c:1513 at sys_proc_linux(): DEBUG: sys_proc_linux() sending '{"type":"process","ID":20684206,"timestamp":"2020/09/07 03:35:13","process":{"pid":53,"name":"oom_reaper","state":"S","ppid":2,"utime":0,"stime":1901,"euser":"root","ruser":"root","suser":"root","egroup":"root","rgroup":"root","sgroup":"root","fgroup":"root","priority":20,"nice":0,"size":0,"vm_size":0,"resident":0,"share":0,"start_time":116,"pgrp":0,"session":0,"nlwp":1,"tgid":53,"tty":0,"processor":7}}'
2020/09/07 04:35:13 wazuh-modulesd:syscollector[31343] syscollector_linux.c:1513 at sys_proc_linux(): DEBUG: sys_proc_linux() sending '{"type":"process","ID":1299436842,"timestamp":"2020/09/07 04:35:13","process":{"pid":53,"name":"oom_reaper","state":"S","ppid":2,"utime":0,"stime":1901,"euser":"root","ruser":"root","suser":"root","egroup":"root","rgroup":"root","sgroup":"root","fgroup":"root","priority":20,"nice":0,"size":0,"vm_size":0,"resident":0,"share":0,"start_time":116,"pgrp":0,"session":0,"nlwp":1,"tgid":53,"tty":0,"processor":7}}'
2020/09/07 05:35:14 wazuh-modulesd:syscollector[31343] syscollector_linux.c:1513 at sys_proc_linux(): DEBUG: sys_proc_linux() sending '{"type":"process","ID":89883927,"timestamp":"2020/09/07 05:35:13","process":{"pid":53,"name":"oom_reaper","state":"S","ppid":2,"utime":0,"stime":1901,"euser":"root","ruser":"root","suser":"root","egroup":"root","rgroup":"root","sgroup":"root","fgroup":"root","priority":20,"nice":0,"size":0,"vm_size":0,"resident":0,"share":0,"start_time":116,"pgrp":0,"session":0,"nlwp":1,"tgid":53,"tty":0,"processor":7}}'
2020/09/07 06:35:13 wazuh-modulesd:syscollector[31343] syscollector_linux.c:1513 at sys_proc_linux(): DEBUG: sys_proc_linux() sending '{"type":"process","ID":2077498022,"timestamp":"2020/09/07 06:35:13","process":{"pid":53,"name":"oom_reaper","state":"S","ppid":2,"utime":0,"stime":1901,"euser":"root","ruser":"root","suser":"root","egroup":"root","rgroup":"root","sgroup":"root","fgroup":"root","priority":20,"nice":0,"size":0,"vm_size":0,"resident":0,"share":0,"start_time":116,"pgrp":0,"session":0,"nlwp":1,"tgid":53,"tty":0,"processor":7}}'
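
These debug lines appear to be syscollector's hourly process inventory rather than OOM kill events: each one reports the oom_reaper kernel thread (ppid 2) with zero resident memory. A minimal sketch that decodes the JSON payload from such a line makes this easier to check. The helper name `process_entry` is made up for illustration, and the regular expression assumes the `sending '...'` suffix shown above.

```python
import json
import re

# Assumes the debug line ends with: sending '<json payload>'
PAYLOAD = re.compile(r"sending '(?P<json>\{.*\})'\s*$")


def process_entry(line):
    """Return the 'process' object from a syscollector debug line, or None."""
    match = PAYLOAD.search(line)
    if not match:
        return None
    event = json.loads(match.group("json"))
    if event.get("type") != "process":
        return None
    return event["process"]


# First debug line from the log excerpt above, copied verbatim.
SAMPLE = (
    '2020/09/07 00:35:13 wazuh-modulesd:syscollector[31343] '
    'syscollector_linux.c:1513 at sys_proc_linux(): DEBUG: '
    'sys_proc_linux() sending \'{"type":"process","ID":838054248,'
    '"timestamp":"2020/09/07 00:35:13","process":{"pid":53,'
    '"name":"oom_reaper","state":"S","ppid":2,"utime":0,"stime":1815,'
    '"euser":"root","ruser":"root","suser":"root","egroup":"root",'
    '"rgroup":"root","sgroup":"root","fgroup":"root","priority":20,'
    '"nice":0,"size":0,"vm_size":0,"resident":0,"share":0,'
    '"start_time":116,"pgrp":0,"session":0,"nlwp":1,"tgid":53,'
    '"tty":0,"processor":4}}\''
)

entry = process_entry(SAMPLE)
print(entry["name"], entry["ppid"], entry["resident"])
```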
JcabreraC commented 3 years ago

Hello,

It looks like the error lies in maild. One problem is that it is not properly parsing the alert that is to be sent afterward.

Let's try to solve this error by changing the following setting:

<email_log_source>alerts.log</email_log_source>

to

<email_log_source>alerts.json</email_log_source>

In addition, there is a problem with the email_alert_level field that is currently being worked on. If the error persists after switching the source to alerts.json, try removing that line from the configuration. For more information, see this thread: https://github.com/wazuh/wazuh/issues/5758
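
To confirm which values the manager configuration currently holds for these two settings, a minimal sketch using Python's standard XML parser is shown below. The helper name `mail_settings` and the synthetic `<root>` wrapper are assumptions of this sketch: ossec.conf can contain several `<ossec_config>` blocks, so they are wrapped to form a single XML document. A real ossec.conf may also contain text that a strict XML parser rejects, in which case `ET.fromstring` raises `ParseError`.

```python
import xml.etree.ElementTree as ET


def mail_settings(conf_text):
    """Collect email_log_source and email_alert_level values (hypothetical helper)."""
    # Wrap the text so that multiple <ossec_config> blocks parse as one document.
    root = ET.fromstring("<root>" + conf_text + "</root>")
    return {
        "email_log_source": [e.text for e in root.iter("email_log_source")],
        "email_alert_level": [e.text for e in root.iter("email_alert_level")],
    }


# Trimmed fragment of the configuration posted earlier in this thread.
SAMPLE_CONF = """
<ossec_config>
  <global>
    <email_notification>yes</email_notification>
    <email_log_source>alerts.log</email_log_source>
  </global>
  <alerts>
    <log_alert_level>1</log_alert_level>
    <email_alert_level>1</email_alert_level>
  </alerts>
</ossec_config>
"""

print(mail_settings(SAMPLE_CONF))
```

After the suggested change, the first list should contain `alerts.json`; if `email_alert_level` has been removed, the second list will be empty.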

Regards, Juan

akshatgit commented 3 years ago

Thanks, @JcabreraC. I have updated the configuration and will monitor for errors. Before upgrading the masters to the latest version, the <email_log_source>alerts.log</email_log_source> setting was working fine. One small question: how did you correlate the OOM issue with maild?

JcabreraC commented 3 years ago

Hello @akshatgit,

It wasn't based on the OOM logs. I just kept trying until I found the cause.

Basically, I used all the information you provided plus the tests I had done until I got to the error.

I hope it worked.