StamusNetworks / SELKS

A Suricata based IDS/IPS/NSM distro
https://www.stamus-networks.com/open-source/#selks
GNU General Public License v3.0

Elastic search & Moloch crashes after 10 days of SELKS traffic #169

Open michal25 opened 5 years ago

michal25 commented 5 years ago

The SELKS device fails after 10 days of traffic.

(Screenshot: Screenshot_20190326_140653)

~# systemctl status suricata elasticsearch logstash kibana evebox molochviewer-selks molochpcapread-selks
● suricata.service - LSB: Next Generation IDS/IPS
   Loaded: loaded (/etc/init.d/suricata; generated; vendor preset: enabled)
   Active: active (running) since Tue 2019-03-26 13:50:16 CET; 17min ago
     Docs: man:systemd-sysv-generator(8)
  Process: 609 ExecStart=/etc/init.d/suricata start (code=exited, status=0/SUCCESS)
    Tasks: 10 (limit: 4915)
   CGroup: /system.slice/suricata.service
           └─740 /usr/bin/suricata -c /etc/suricata/suricata.yaml --pidfile /var/run/suricata.pid --af-packet -D -v --user=logstash

Mar 26 13:50:16 SELKS2 systemd[1]: Starting LSB: Next Generation IDS/IPS...
Mar 26 13:50:16 SELKS2 suricata[609]: Starting suricata in IDS (af-packet) mode... done.
Mar 26 13:50:16 SELKS2 systemd[1]: Started LSB: Next Generation IDS/IPS.

● elasticsearch.service - Elasticsearch
   Loaded: loaded (/usr/lib/systemd/system/elasticsearch.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Tue 2019-03-26 14:04:05 CET; 3min 46s ago
     Docs: http://www.elastic.co
  Process: 623 ExecStart=/usr/share/elasticsearch/bin/elasticsearch -p ${PID_DIR}/elasticsearch.pid --quiet (code=exited, status=127)
 Main PID: 623 (code=exited, status=127)
    Tasks: 0 (limit: 4915)
   CGroup: /system.slice/elasticsearch.service

Mar 26 13:50:16 SELKS2 systemd[1]: Started Elasticsearch.
Mar 26 14:04:05 SELKS2 systemd[1]: elasticsearch.service: Main process exited, code=exited, status=127/n/a
Mar 26 14:04:05 SELKS2 systemd[1]: elasticsearch.service: Unit entered failed state.
Mar 26 14:04:05 SELKS2 systemd[1]: elasticsearch.service: Failed with result 'exit-code'.

● logstash.service - logstash
   Loaded: loaded (/etc/systemd/system/logstash.service; enabled; vendor preset: enabled)
   Active: active (running) since Tue 2019-03-26 13:50:16 CET; 17min ago
 Main PID: 605 (java)
    Tasks: 36 (limit: 4915)
   CGroup: /system.slice/logstash.service
           └─605 /usr/bin/java -Xms1g -Xmx1g -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -Djava

Mar 26 14:07:28 SELKS2 logstash[605]: [2019-03-26T14:07:28,121][WARN ][logstash.outputs.elasticsearch] Attempted to resurrect connection to dead ES instance, but got
Mar 26 14:07:32 SELKS2 logstash[605]: [2019-03-26T14:07:32,294][WARN ][logstash.outputs.elasticsearch] Attempted to resurrect connection to dead ES instance, but got
Mar 26 14:07:33 SELKS2 logstash[605]: [2019-03-26T14:07:33,123][WARN ][logstash.outputs.elasticsearch] Attempted to resurrect connection to dead ES instance, but got
Mar 26 14:07:37 SELKS2 logstash[605]: [2019-03-26T14:07:37,296][WARN ][logstash.outputs.elasticsearch] Attempted to resurrect connection to dead ES instance, but got
Mar 26 14:07:38 SELKS2 logstash[605]: [2019-03-26T14:07:38,126][WARN ][logstash.outputs.elasticsearch] Attempted to resurrect connection to dead ES instance, but got
Mar 26 14:07:42 SELKS2 logstash[605]: [2019-03-26T14:07:42,299][WARN ][logstash.outputs.elasticsearch] Attempted to resurrect connection to dead ES instance, but got
Mar 26 14:07:43 SELKS2 logstash[605]: [2019-03-26T14:07:43,128][WARN ][logstash.outputs.elasticsearch] Attempted to resurrect connection to dead ES instance, but got
Mar 26 14:07:47 SELKS2 logstash[605]: [2019-03-26T14:07:47,302][WARN ][logstash.outputs.elasticsearch] Attempted to resurrect connection to dead ES instance, but got
Mar 26 14:07:48 SELKS2 logstash[605]: [2019-03-26T14:07:48,130][WARN ][logstash.outputs.elasticsearch] Attempted to resurrect connection to dead ES instance, but got
Mar 26 14:07:52 SELKS2 logstash[605]: [2019-03-26T14:07:52,304][WARN ][logstash.outputs.elasticsearch] Attempted to resurrect connection to dead ES instance, but got

● kibana.service - Kibana
   Loaded: loaded (/etc/systemd/system/kibana.service; enabled; vendor preset: enabled)
   Active: active (running) since Tue 2019-03-26 13:50:48 CET; 17min ago
 Main PID: 1335 (node)
    Tasks: 11 (limit: 4915)
   CGroup: /system.slice/kibana.service
           └─1335 /usr/share/kibana/bin/../node/bin/node --no-warnings --max-http-header-size=65536 /usr/share/kibana/bin/../src/cli -c /etc/kibana/kibana.yml

Mar 26 14:07:44 SELKS2 kibana[1335]: Unhandled rejection Error: No Living connections
Mar 26 14:07:44 SELKS2 kibana[1335]: at sendReqWithConnection (/usr/share/kibana/node_modules/elasticsearch/src/lib/transport.js:226:15)
Mar 26 14:07:44 SELKS2 kibana[1335]: at next (/usr/share/kibana/node_modules/elasticsearch/src/lib/connection_pool.js:214:7)
Mar 26 14:07:44 SELKS2 kibana[1335]: at process._tickCallback (internal/process/next_tick.js:61:11)
Mar 26 14:07:46 SELKS2 kibana[1335]: {"type":"log","@timestamp":"2019-03-26T13:07:46Z","tags":["warning","elasticsearch","admin"],"pid":1335,"message":"Unable to rev
Mar 26 14:07:46 SELKS2 kibana[1335]: {"type":"log","@timestamp":"2019-03-26T13:07:46Z","tags":["warning","elasticsearch","admin"],"pid":1335,"message":"No living con
Mar 26 14:07:48 SELKS2 kibana[1335]: {"type":"log","@timestamp":"2019-03-26T13:07:48Z","tags":["warning","elasticsearch","admin"],"pid":1335,"message":"Unable to rev
Mar 26 14:07:48 SELKS2 kibana[1335]: {"type":"log","@timestamp":"2019-03-26T13:07:48Z","tags":["warning","elasticsearch","admin"],"pid":1335,"message":"No living con
Mar 26 14:07:51 SELKS2 kibana[1335]: {"type":"log","@timestamp":"2019-03-26T13:07:51Z","tags":["warning","elasticsearch","admin"],"pid":1335,"message":"Unable to rev
Mar 26 14:07:51 SELKS2 kibana[1335]: {"type":"log","@timestamp":"2019-03-26T13:07:51Z","tags":["warning","elasticsearch","admin"],"pid":1335,"message":"No living con

● evebox.service - EveBox Server
   Loaded: loaded (/lib/systemd/system/evebox.service; enabled; vendor preset: enabled)
   Active: active (running) since Tue 2019-03-26 13:50:16 CET; 17min ago
 Main PID: 599 (evebox)
    Tasks: 9 (limit: 4915)
   CGroup: /system.slice/evebox.service
           └─599 /usr/bin/evebox server

Mar 26 13:50:34 SELKS2 evebox[599]: "minimum_index_compatibility_version" : "5.0.0"
Mar 26 13:50:34 SELKS2 evebox[599]: },
Mar 26 13:50:34 SELKS2 evebox[599]: "tagline" : "You Know, for Search"
Mar 26 13:50:34 SELKS2 evebox[599]: }
Mar 26 13:50:37 SELKS2 evebox[599]: 2019-03-26 13:50:37 (server.go:353) -- Connected to Elastic Search (version: 6.6.2)
Mar 26 13:50:37 SELKS2 evebox[599]: 2019-03-26 13:50:37 (elasticsearch.go:199) -- Found templates [logstash]
Mar 26 13:50:37 SELKS2 evebox[599]: 2019-03-26 13:50:37 (elasticsearch.go:238) -- Found Elastic Search keyword suffix to be: keyword
Mar 26 13:50:37 SELKS2 evebox[599]: 2019-03-26 13:50:37 (server.go:131) -- Session reaper started
Mar 26 13:50:37 SELKS2 evebox[599]: 2019-03-26 13:50:37 (server.go:165) -- Authentication disabled.
Mar 26 13:50:37 SELKS2 evebox[599]: 2019-03-26 13:50:37 (server.go:276) -- Listening on 0.0.0.0:5636

● molochviewer-selks.service - Moloch Viewer
   Loaded: loaded (/etc/systemd/system/molochviewer-selks.service; enabled; vendor preset: enabled)
   Active: active (running) since Tue 2019-03-26 13:51:48 CET; 16min ago
 Main PID: 1414 (sh)
    Tasks: 11 (limit: 4915)
   CGroup: /system.slice/molochviewer-selks.service
           ├─1414 /bin/sh -c /data/moloch/bin/node viewer.js -c /data/moloch/etc/config.ini >> /data/moloch/logs/viewer.log 2>&1
           └─1415 /data/moloch/bin/node viewer.js -c /data/moloch/etc/config.ini

Mar 26 13:51:48 SELKS2 systemd[1]: Started Moloch Viewer.

● molochpcapread-selks.service - Moloch Pcap Read
   Loaded: loaded (/etc/systemd/system/molochpcapread-selks.service; enabled; vendor preset: enabled)
   Active: active (running) since Tue 2019-03-26 13:51:46 CET; 16min ago
 Main PID: 1408 (sh)
    Tasks: 6 (limit: 4915)
   CGroup: /system.slice/molochpcapread-selks.service
           ├─1408 /bin/sh -c /data/moloch/bin/moloch-capture -c /data/moloch/etc/config.ini -m --copy --delete -R /data/nsm/ >> /data/moloch/logs/capture.log 2>&1
           └─1409 /data/moloch/bin/moloch-capture -c /data/moloch/etc/config.ini -m --copy --delete -R /data/nsm/

Mar 26 13:51:46 SELKS2 systemd[1]: Started Moloch Pcap Read.

I found this in the Elasticsearch log:

2019-03-23T19:19:47.561+0100: 440417.456: [Full GC (Allocation Failure) 2019-03-23T19:19:47.561+0100: 440417.456: [CMS: 707840K->707840K(707840K), 0.9491304 secs] 1014527K->1014519K(1014528K), [Metaspace: 88813K->88813K(1134592K)], 0.9492004 secs] [Times: user=0.95 sys=0.00, real=0.95 secs]
2019-03-23T19:19:48.510+0100: 440418.405: Total time for which application threads were stopped: 0.9498420 seconds, Stopping threads took: 0.0001189 seconds
2019-03-23T19:19:48.511+0100: 440418.406: [Full GC (Allocation Failure) 2019-03-23T19:19:48.511+0100: 440418.406: [CMS: 707840K->707840K(707840K), 0.9474566 secs] 1014527K->1014514K(1014528K), [Metaspace: 88813K->88813K(1134592K)], 0.9475432 secs] [Times: user=0.95 sys=0.00, real=0.95 secs]
2019-03-23T19:19:49.458+0100: 440419.353: Total time for which application threads were stopped: 0.9481405 seconds, Stopping threads took: 0.0000998 seconds
2019-03-23T19:19:49.459+0100: 440419.354: [Full GC (Allocation Failure) 2019-03-23T19:19:49.459+0100: 440419.354: [CMS: 707840K->707840K(707840K), 0.9445289 secs] 1014528K->1014502K(1014528K), [Metaspace: 88813K->88813K(1134592K)], 0.9446219 secs] [Times: user=0.95 sys=0.00, real=0.94 secs]
2019-03-23T19:19:50.404+0100: 440420.299: Total time for which application threads were stopped: 0.9451441 seconds, Stopping threads took: 0.0000223 seconds
2019-03-23T19:19:50.405+0100: 440420.300: [Full GC (Allocation Failure) 2019-03-23T19:19:50.405+0100: 440420.300: [CMS: 707840K->707840K(707840K), 0.9606087 secs] 1014528K->1014493K(1014528K), [Metaspace: 88813K->88813K(1134592K)], 0.9606822 secs] [Times: user=0.96 sys=0.00, real=0.97 secs]
2019-03-23T19:19:51.366+0100: 440421.261: Total time for which application threads were stopped: 0.9612547 seconds, Stopping threads took: 0.0000983 seconds
2019-03-23T19:19:51.367+0100: 440421.261: [Full GC (Allocation Failure) 2019-03-23T19:19:51.367+0100: 440421.261: [CMS: 707840K->707840K(707840K), 0.9444368 secs] 1014528K->1014355K(1014528K), [Metaspace: 88815K->88815K(1134592K)], 0.9445107 secs] [Times: user=0.94 sys=0.00, real=0.94 secs]
2019-03-23T19:19:52.311+0100: 440422.206: Total time for which application threads were stopped: 0.9450178 seconds, Stopping threads took: 0.0000245 seconds
2019-03-23T19:19:52.313+0100: 440422.208: [Full GC (Allocation Failure) 2019-03-23T19:19:52.313+0100: 440422.208: [CMS: 707840K->707839K(707840K), 1.0560808 secs] 1014528K->1000096K(1014528K), [Metaspace: 88813K->88813K(1134592K)], 1.0561568 secs] [Times: user=1.05 sys=0.00, real=1.06 secs]
2019-03-23T19:19:53.369+0100: 440423.264: Total time for which application threads were stopped: 1.0568064 seconds, Stopping threads took: 0.0001814 seconds
Heap
 par new generation total 306688K, used 295047K [0x00000000c0000000, 0x00000000d4cc0000, 0x00000000d4cc0000)
  eden space 272640K, 100% used [0x00000000c0000000, 0x00000000d0a40000, 0x00000000d0a40000)
  from space 34048K, 65% used [0x00000000d0a40000, 0x00000000d2021cd0, 0x00000000d2b80000)
  to space 34048K, 0% used [0x00000000d2b80000, 0x00000000d2b80000, 0x00000000d4cc0000)
 concurrent mark-sweep generation total 707840K, used 707839K [0x00000000d4cc0000, 0x0000000100000000, 0x0000000100000000)
 Metaspace used 88822K, capacity 97314K, committed 97636K, reserved 1134592K
  class space used 11008K, capacity 13342K, committed 13468K, reserved 1048576K
2019-03-23T19:19:53.375+0100: 440423.270: [GC (CMS Initial Mark) [1 CMS-initial-mark: 707839K(707840K)] 1002887K(1014528K), 0.0603362 secs] [Times: user=0.07 sys=0.00, real=0.06 secs]

It looks like the SELKS/Elasticsearch crash happened on 23.3.2019.
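For readers who want to quantify what the GC log shows, here is a minimal Python sketch that extracts the CMS (old generation) figures from Full GC entries and sums how much memory each collection reclaimed. The two sample entries are copied from the log in this post; the regular expression and the function name `summarize_full_gcs` are illustrative assumptions, not part of SELKS or Elasticsearch.

```python
import re

# Matches the old-generation (CMS) part of a Full GC entry, e.g.
# "[CMS: 707840K->707840K(707840K), 0.9491304 secs]"
CMS_RE = re.compile(r"\[CMS: (\d+)K->(\d+)K\((\d+)K\), ([\d.]+) secs\]")


def summarize_full_gcs(log_text):
    """Return (count, reclaimed_kb, pause_secs) over all CMS Full GC entries."""
    count = 0
    reclaimed_kb = 0
    pause_secs = 0.0
    for before, after, _capacity, secs in CMS_RE.findall(log_text):
        count += 1
        reclaimed_kb += int(before) - int(after)  # old-gen memory freed
        pause_secs += float(secs)                 # time spent in this collection
    return count, reclaimed_kb, pause_secs


# Two entries taken from the log above (first and last Full GC shown).
sample = (
    "[Full GC (Allocation Failure) [CMS: 707840K->707840K(707840K), 0.9491304 secs] "
    "1014527K->1014519K(1014528K)]\n"
    "[Full GC (Allocation Failure) [CMS: 707840K->707839K(707840K), 1.0560808 secs] "
    "1014528K->1000096K(1014528K)]\n"
)

count, reclaimed_kb, pause_secs = summarize_full_gcs(sample)
print(f"{count} Full GCs, {reclaimed_kb} KB reclaimed, {pause_secs:.2f} s paused")
```

Back-to-back Full GCs that free almost nothing from a full old generation usually point to an exhausted Java heap, which is worth checking alongside disk space.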

Sonda2-errors.tar.gz

michal25 commented 5 years ago

systemctl-status.txt

pevma commented 5 years ago

A couple of questions: Is there disk space left? Are you running out of memory (OOM)?

-- Regards, Peter Manev

michal25 commented 5 years ago

~# df -h
Filesystem      Size  Used Avail Use% Mounted on
udev            7.8G     0  7.8G   0% /dev
tmpfs           1.6G   17M  1.6G   2% /run
/dev/md0        887G  796G   46G  95% /
tmpfs           7.8G     0  7.8G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           7.8G     0  7.8G   0% /sys/fs/cgroup
tmpfs           1.6G     0  1.6G   0% /run/user/1001
root@SELKS2:~# free
              total        used        free      shared  buff/cache   available
Mem:       16321404     3603452      159452       17388    12558500    12362200
Swap:      31249996           0    31249996

-- Ing. Michal Vymazal Senior Cyber Security Architect

Linux Services CEO

vymazal@linuxservices.cz www.linuxservices.cz Office Computer

This mail can't contain any virus. I'm using only Open Source software.

pevma commented 5 years ago

I think that is why: you only have about 5% disk space left, and ES stops because of that. By default, I believe ES stops allocating shards to a node once its disk is 85% full and makes indices read-only at 95% full (the flood-stage watermark).
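As a rough illustration of the disk check, the Python sketch below classifies a disk's usage against Elasticsearch's disk watermarks. The thresholds (85%, 90%, 95%) are the ES 6.x defaults as recalled here, not values taken from this thread, so treat them as assumptions and verify them against your own cluster settings. The function names are hypothetical.

```python
# Assumed Elasticsearch 6.x default disk watermarks (percent of disk used).
# These are recalled defaults, not values confirmed in this thread.
WATERMARKS = [("flood_stage", 95.0), ("high", 90.0), ("low", 85.0)]


def used_percent(total, available):
    """Percent of the disk in use, computed from total and available space."""
    return 100.0 * (total - available) / total


def watermark_level(used_pct):
    """Return the highest watermark that used_pct has reached, or 'ok'."""
    for name, threshold in WATERMARKS:
        if used_pct >= threshold:
            return name
    return "ok"


# Size and Avail of /dev/md0 from the df output earlier in the thread, in GiB.
pct = used_percent(total=887, available=46)
level = watermark_level(pct)
print(f"{pct:.1f}% used -> {level}")
```

Computed this way the root filesystem is slightly under the 95% shown in the `Use%` column, so it sits right at the boundary between the high and flood-stage marks. Either way, freeing space is the safe course.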

One idea, besides cleaning up the disk and rotating or purging logs more often (e.g. a 10-day cron job), would be to disable some verbose loggers like "fileinfo" (put a # comment in front of "fileinfo" in /etc/suricata/SELKS5-adding.yaml).
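The periodic clean-up could be sketched in Python as follows, assuming "10 days" means deleting files older than 10 days. The helper `remove_old_files` and the demo file names are hypothetical; which directory to clean on a real SELKS box depends on the installation and is not specified in this thread. The demo runs only against a temporary directory.

```python
import os
import tempfile
import time


def remove_old_files(directory, max_age_days, now=None):
    """Delete regular files in `directory` older than `max_age_days`; return their names."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    removed = []
    for entry in os.scandir(directory):
        # Only plain files directly in the directory; symlinks and subdirectories are skipped.
        if entry.is_file(follow_symlinks=False):
            if entry.stat(follow_symlinks=False).st_mtime < cutoff:
                os.remove(entry.path)
                removed.append(entry.name)
    return sorted(removed)


# Demo on a throwaway directory: one file aged 11 days, one fresh file.
with tempfile.TemporaryDirectory() as tmp:
    old_path = os.path.join(tmp, "old.log")
    new_path = os.path.join(tmp, "new.log")
    for path in (old_path, new_path):
        with open(path, "w") as fh:
            fh.write("x")
    eleven_days_ago = time.time() - 11 * 86400
    os.utime(old_path, (eleven_days_ago, eleven_days_ago))
    demo_removed = remove_old_files(tmp, max_age_days=10)
    demo_left = sorted(os.listdir(tmp))

print("removed:", demo_removed, "kept:", demo_left)
```

A script like this could be scheduled from cron. Run it first with the `os.remove` call commented out to confirm it only selects the files you expect.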

Thank you

-- Regards, Peter Manev
