StamusNetworks / SELKS

A Suricata based IDS/IPS/NSM distro
https://www.stamus-networks.com/open-source/#selks
GNU General Public License v3.0

The delete-old-logs does not work! #338

Closed Linn1 closed 2 years ago

Linn1 commented 2 years ago

I used the command "rm -rf" to delete the biggest index of ES, because the indices are too large and I don't have enough space for them. But something goes wrong after I delete the index. The log shows:

    INFO Instantiating client object
    INFO Testing client connectivity
    INFO Successfully created Elasticsearch client object with provided settings
    ERROR Singleton action failed due to empty index list

How can I fix this problem? Thanks soooo much!!
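For context, a safer way to see which indices exist and how much space each one uses, before deleting anything, is the standard Elasticsearch _cat API (a sketch assuming ES listens on localhost:9200, as in a default SELKS install):

    # List all indices sorted by on-disk size, largest first
    curl -s "localhost:9200/_cat/indices?v&s=store.size:desc"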

Linn1 commented 2 years ago

The log is the output of running "delete-old-logs.sh".

pevma commented 2 years ago

What ELK stack version do you have? (selks-health-check_stamus will output that information.)

Linn1 commented 2 years ago

I changed the day count in delete-old-logs.sh and it works! Maybe it is not an error; I think it is reminding me that there is nothing to delete. By the way, I use SELKS 6.0. There is another thing going wrong: I mounted a device on /data/nsm, and suricata.log shows "read the pcap file permission denied"!

Linn1 commented 2 years ago

I cleaned the dir /data/nsm, and now the dir stays empty all the time! Where are the new pcaps?

pevma commented 2 years ago

I think it may be related to permissions for writing the pcaps there, depending on which FPC mode you chose during setup: https://github.com/StamusNetworks/SELKS/wiki/First-time-setup#full-packet-capture-fpc
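For reference, a quick way to check whether the capture process can actually write to the mounted directory (a hedged sketch; the "suricata" user/group is an assumption and may differ per setup):

    # Inspect ownership and mount options of the capture directory
    ls -ld /data/nsm
    mount | grep /data/nsm
    # If the capture process cannot write there, adjust ownership
    # ("suricata" is an assumption; verify with: ps -o user= -C suricata)
    chown -R suricata:suricata /data/nsm
    chmod -R u+rwX /data/nsm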

Linn1 commented 2 years ago

I run "selks-first-setup_stamus" and chose FPC mode and it works! No more perimission deined!

Linn1 commented 2 years ago

But I have another problem. I found a large dir: /data/moloch/logs/. I checked it and found similar errors written to capture.log many times:

    suricata.c 289 suricata_process(): ERROR: Parse_error 1 > {"timestamp":"2021-09-30T10:40:01.773996+0800","flow_id":.....}
    suricata.c 289 suricata_process(): ERROR: Parse_error 1 > {"timestamp":"2021-09-30T10:40:42.691888+0800","flow_id":.....}

It seems that something is wrong with Suricata. I used "rm -rf" to delete the biggest index of ES, because the indices are too large and I don't have enough space for them. I don't know if that has something to do with this problem.
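If capture.log itself is what is filling /data/moloch/logs/, one stop-gap (a hedged sketch; molochcapture.service is the unit name shown later in this thread) is to truncate the log in place, since deleting a file that a running process holds open does not free the space:

    # Empty the oversized log without removing the open file handle
    truncate -s 0 /data/moloch/logs/capture.log
    # Or stop capture first, rotate the file, then start again
    systemctl stop molochcapture.service
    mv /data/moloch/logs/capture.log /data/moloch/logs/capture.log.old
    systemctl start molochcapture.service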

Linn1 commented 2 years ago

I run "selks-first-setup_stamus",and it failed.It failed when set up Scirius/Moloch proxy user. Here is the output: START of first time setup script - 2021年 09月 30日 星期四 14:48:32 CST

Setting up sniffing interface

Please supply network interface(s) to set up SELKS Suricata IDPS threat detection on
0: enp4s0f0
1: enp4s0f1
2: enp4s0f2
3: enp4s0f3
4: lo
Please type in an interface or space-delimited interfaces below and hit "Enter".
Example: eth1 OR Example: eth1 eth2 eth3

Configure threat detection for INTERFACE(S): enp4s0f1

The supplied network interface(s): enp4s0f1

DONE!
FPC - Full Packet Capture. Suricata will rotate and delete the captured pcap files.
FPC_Retain - Full Packet Capture with Moloch's pcap retention/rotation. Keeps the pcaps as long as there is space available.
None - disable packet capture

1) FPC
2) FPC_Retain
3) NONE
Please choose an option. Type in a number and hit "Enter"
1
Enable Full Packet Capture

Starting Moloch DB set up

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   404  100   404    0     0   1990      0 --:--:-- --:--:-- --:--:--  1980
{"cluster_name":"elasticsearch","status":"yellow","timed_out":false,"number_of_nodes":1,"number_of_data_nodes":1,"active_primary_shards":60,"active_shards":60,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":3,"delayed_unassigned_shards":0,"number_of_pending_tasks":0,"number_of_in_flight_fetch":0,"task_max_waiting_in_queue_millis":0,"active_shards_percent_as_number":95.23809523809523}

Setting up Moloch

WARNING elasticsearch health is 'yellow' instead of 'green', things may be broken

It is STRONGLY recommended that you stop ALL moloch captures and viewers before proceeding. Use 'db.pl http://localhost:9200 backup' to backup db first.

There is 1 elastic search data node, if you expect more please fix first before proceeding.

It appears this elastic search cluster already has moloch installed (version 64), this will delete ALL data in elastic search! (It does not delete the pcap files on disk.)

Type "INIT" to continue - do you want to erase everything?? Erasing Creating

Finished
Found interfaces: enp4s0f0;enp4s0f1;enp4s0f2;enp4s0f3;lo
Semicolon ';' separated list of interfaces to monitor [eth1]
Install Elasticsearch server locally for demo, must have at least 3G of memory, NOT recommended for production use (yes or no) [no]
Elasticsearch server URL [http://localhost:9200]
Password to encrypt S2S and other things [no-default]
Moloch - Creating configuration files
Not overwriting /data/moloch/etc/config.ini, delete and run again if update required (usually not), or edit by hand
Installing systemd start files, use systemctl
Download GEO files? (yes or no) [yes]
Moloch - Downloading GEO files
wget: unable to resolve host address "www.iana.org"
wget: unable to resolve host address "raw.githubusercontent.com"

Moloch - Configured - Now continue with step 4 in /data/moloch/README.txt

  /sbin/start elasticsearch # for upstart/Centos 6/Ubuntu 14.04
  systemctl start elasticsearch.service # for systemd/Centos 7/Ubuntu 16.04

5) Initialize/Upgrade Elasticsearch Moloch configuration
   a) If this is the first install, or want to delete all data
      /data/moloch/db/db.pl http://ESHOST:9200 init
   b) If this is an update to moloch package
      /data/moloch/db/db.pl http://ESHOST:9200 upgrade
6) Add an admin user if a new install or after an init
      /data/moloch/bin/moloch_add_user.sh admin "Admin User" THEPASSWORD --admin
7) Start everything
   a) If using upstart (Centos 6 or sometimes Ubuntu 14.04):
      /sbin/start molochcapture
      /sbin/start molochviewer
   b) If using systemd (Centos 7 or Ubuntu 16.04 or sometimes Ubuntu 14.04)
      systemctl start molochcapture.service
      systemctl start molochviewer.service
8) Look at log files for errors
      /data/moloch/logs/viewer.log
      /data/moloch/logs/capture.log
9) Visit http://MOLOCHHOST:8005 with your favorite browser.
      user: admin
      password: THEPASSWORD from step #6

If you want IP -> Geo/ASN to work, you need to setup a maxmind account and the geoipupdate program. See https://molo.ch/faq#maxmind

Any configuration changes can be made to /data/moloch/etc/config.ini See https://molo.ch/faq#moloch-is-not-working for issues

Additional information can be found at:

Setting up Moloch configs and services

Setting up and restarting services

Setting up Scirius/Moloch proxy user

Added
Traceback (most recent call last):
  File "bin/manage.py", line 10, in <module>
    execute_from_command_line(sys.argv)
  File "/usr/share/python/scirius/local/lib/python2.7/site-packages/django/core/management/__init__.py", line 364, in execute_from_command_line
    utility.execute()
  File "/usr/share/python/scirius/local/lib/python2.7/site-packages/django/core/management/__init__.py", line 356, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/share/python/scirius/local/lib/python2.7/site-packages/django/core/management/base.py", line 283, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/usr/share/python/scirius/local/lib/python2.7/site-packages/django/core/management/base.py", line 330, in execute
    output = self.handle(*args, **options)
  File "/usr/share/python/scirius/local/lib/python2.7/site-packages/rules/management/commands/kibana_reset.py", line 38, in handle
    self.kibana_reset()
  File "/usr/share/python/scirius/local/lib/python2.7/site-packages/rules/es_data.py", line 1983, in kibana_reset
    self._kibana_remove('index-pattern', {'query': {'query_string': {'query': ''}}})
  File "/usr/share/python/scirius/local/lib/python2.7/site-packages/rules/es_data.py", line 1759, in _kibana_remove
    res = self.client.search(index='.kibana', from_=i, doc_type=_type, body=body, request_cache=False)
  File "/usr/share/python/scirius/local/lib/python2.7/site-packages/elasticsearch/client/utils.py", line 84, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/usr/share/python/scirius/local/lib/python2.7/site-packages/elasticsearch/client/__init__.py", line 852, in search
    "GET", _make_path(index, doc_type, "_search"), params=params, body=body
  File "/usr/share/python/scirius/local/lib/python2.7/site-packages/elasticsearch/transport.py", line 358, in perform_request
    timeout=timeout,
  File "/usr/share/python/scirius/local/lib/python2.7/site-packages/elasticsearch/connection/http_urllib3.py", line 231, in perform_request
    self._raise_error(response.status, raw_data)
  File "/usr/share/python/scirius/local/lib/python2.7/site-packages/elasticsearch/connection/base.py", line 230, in _raise_error
    status_code, error_message, additional_info
elasticsearch.exceptions.TransportError: TransportError(429, u'circuit_breaking_exception', u'[parent] Data too large, data for [] would be [1062178446/1012.9mb], which is larger than the limit of [1020054732/972.7mb], real usage: [1062178328/1012.9mb], new bytes reserved: [118/118b], usages [request=312360/305kb, fielddata=45833516/43.7mb, in_flight_requests=12216422/11.6mb, accounting=20373568/19.4mb]')
Dashboards loading set up job failed...Exiting...

Exited with ERROR

FINISH of first time setup script - Thu Sep 30 14:54:32 CST 2021

Exited with FAILED
Full log located at - /opt/selks/log/selks-first-time-setup_stamus.log
Press enter to continue

I think the index that I removed caused this problem.
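Note that the TransportError(429, 'circuit_breaking_exception', ...) in the traceback above means the ES JVM heap was nearly exhausted when Scirius queried the .kibana index, not that data is missing. A hedged sketch of one common mitigation, assuming the Debian-package default paths (heap settings may instead live under jvm.options.d/ on newer ES versions), is to raise the JVM heap and restart:

    # Raise min/max heap in /etc/elasticsearch/jvm.options (example values;
    # size them to roughly half the host's RAM)
    sed -i 's/^-Xms.*/-Xms2g/; s/^-Xmx.*/-Xmx2g/' /etc/elasticsearch/jvm.options
    systemctl restart elasticsearch.service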

pevma commented 2 years ago

What SELKS version are you using? What is the output of selks-health-check_stamus (as requested earlier: https://github.com/StamusNetworks/SELKS/issues/338#issuecomment-929002073 :) )

Linn1 commented 2 years ago

I use SELKS 6.0, and it was fixed when I ran "reboot now". But delete-old-logs.sh still doesn't work. The output of running it:

    root@selks:~# /opt/selks/delete-old-logs.sh
    2021-10-08 10:26:57,502 INFO Instantiating client object
    2021-10-08 10:26:57,503 INFO Testing client connectivity
    2021-10-08 10:26:57,508 INFO Successfully created Elasticsearch client object with provided settings
    2021-10-08 10:28:57,532 CRITICAL Failed to complete action: delete_indices. <class 'curator.exceptions.FailedExecution'>: Failed to get indices. Error: TransportError(503, 'master_not_discovered_exception', None)

The space is 100% used. What can I delete to free some space?
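For reference, a quick way to see what is actually consuming the disk before deleting anything (standard coreutils; the paths are the SELKS defaults mentioned in this thread):

    # Overall filesystem usage
    df -h
    # Size of the usual suspects
    du -sh /data/nsm /data/moloch /var/lib/elasticsearch 2>/dev/null
    # Largest items one level down in the ES data directory
    du -h --max-depth=1 /var/lib/elasticsearch | sort -h | tail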

pevma commented 2 years ago

You can delete older logs; the default is 14 days retention, so try going down to 5-7? https://github.com/StamusNetworks/SELKS/wiki/Data-lifecycle
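If the script keeps failing, old indices can also be removed by hand through the ES REST API instead of rm (a sketch; the logstash-<type>-YYYY.MM.DD naming matches the output below, and the date shown is only an example):

    # Delete all logstash indices for one day via the API
    curl -X DELETE "localhost:9200/logstash-*-2021.10.08"
    curl -X DELETE "localhost:9200/logstash-2021.10.08"
    # Verify what remains
    curl -s "localhost:9200/_cat/indices?v&s=index"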

Linn1 commented 2 years ago

I tried to go down to 3 days and it still failed! The following is the output:

    root@selks:~# sh /opt/selks/delete-old-logs.sh
    2021-10-13 14:18:11,530 INFO Instantiating client object
    2021-10-13 14:18:11,531 INFO Testing client connectivity
    2021-10-13 14:18:11,537 INFO Successfully created Elasticsearch client object with provided settings
    2021-10-13 14:18:11,738 INFO Deleting 41 selected indices: ['logstash-smtp-2021.10.10', 'logstash-2021.10.08', 'logstash-tls-2021.10.09', 'logstash-smtp-2021.10.08', 'logstash-anomaly-2021.10.09', 'logstash-fileinfo-2021.10.09', 'logstash-alert-2021.10.08', 'logstash-smb-2021.10.09', 'logstash-alert-2021.10.10', 'logstash-tls-2021.10.10', 'logstash-tls-2021.10.08', 'logstash-2021.10.09', 'logstash-dns-2021.10.10', 'logstash-fileinfo-2021.10.10', 'logstash-dhcp-2021.10.08', 'logstash-flow-2021.10.08', 'logstash-sip-2021.10.10', 'logstash-snmp-2021.10.09', 'logstash-dns-2021.10.08', 'logstash-alert-2021.10.09', 'logstash-tftp-2021.10.08', 'logstash-dhcp-2021.10.09', 'logstash-sip-2021.10.08', 'logstash-http-2021.10.10', 'logstash-ikev2-2021.10.09', 'logstash-snmp-2021.10.10', 'logstash-ikev2-2021.10.08', 'logstash-flow-2021.10.10', 'logstash-anomaly-2021.10.08', 'logstash-fileinfo-2021.10.08', 'logstash-http-2021.10.08', 'logstash-snmp-2021.10.08', 'logstash-flow-2021.10.09', 'logstash-tftp-2021.10.10', 'logstash-dns-2021.10.09', 'logstash-sip-2021.10.09', 'logstash-http-2021.10.09', 'logstash-dhcp-2021.10.10', 'logstash-2021.10.10', 'logstash-anomaly-2021.10.10', 'logstash-tftp-2021.10.09']
    2021-10-13 14:18:11,738 INFO ---deleting index logstash-smtp-2021.10.10
    2021-10-13 14:18:11,738 INFO ---deleting index logstash-2021.10.08
    2021-10-13 14:18:11,738 INFO ---deleting index logstash-tls-2021.10.09
    2021-10-13 14:18:11,739 INFO ---deleting index logstash-smtp-2021.10.08
    2021-10-13 14:18:11,739 INFO ---deleting index logstash-anomaly-2021.10.09
    2021-10-13 14:18:11,739 INFO ---deleting index logstash-fileinfo-2021.10.09
    2021-10-13 14:18:11,739 INFO ---deleting index logstash-alert-2021.10.08
    2021-10-13 14:18:11,739 INFO ---deleting index logstash-smb-2021.10.09
    2021-10-13 14:18:11,739 INFO ---deleting index logstash-alert-2021.10.10
    2021-10-13 14:18:11,739 INFO ---deleting index logstash-tls-2021.10.10
    2021-10-13 14:18:11,739 INFO ---deleting index logstash-tls-2021.10.08
    2021-10-13 14:18:11,740 INFO ---deleting index logstash-2021.10.09
    2021-10-13 14:18:11,740 INFO ---deleting index logstash-dns-2021.10.10
    2021-10-13 14:18:11,740 INFO ---deleting index logstash-fileinfo-2021.10.10
    2021-10-13 14:18:11,740 INFO ---deleting index logstash-dhcp-2021.10.08
    2021-10-13 14:18:11,740 INFO ---deleting index logstash-flow-2021.10.08
    2021-10-13 14:18:11,740 INFO ---deleting index logstash-sip-2021.10.10
    2021-10-13 14:18:11,740 INFO ---deleting index logstash-snmp-2021.10.09
    2021-10-13 14:18:11,740 INFO ---deleting index logstash-dns-2021.10.08
    2021-10-13 14:18:11,741 INFO ---deleting index logstash-alert-2021.10.09
    2021-10-13 14:18:11,741 INFO ---deleting index logstash-tftp-2021.10.08
    2021-10-13 14:18:11,741 INFO ---deleting index logstash-dhcp-2021.10.09
    2021-10-13 14:18:11,741 INFO ---deleting index logstash-sip-2021.10.08
    2021-10-13 14:18:11,741 INFO ---deleting index logstash-http-2021.10.10
    2021-10-13 14:18:11,741 INFO ---deleting index logstash-ikev2-2021.10.09
    2021-10-13 14:18:11,741 INFO ---deleting index logstash-snmp-2021.10.10
    2021-10-13 14:18:11,742 INFO ---deleting index logstash-ikev2-2021.10.08
    2021-10-13 14:18:11,742 INFO ---deleting index logstash-flow-2021.10.10
    2021-10-13 14:18:11,742 INFO ---deleting index logstash-anomaly-2021.10.08
    2021-10-13 14:18:11,742 INFO ---deleting index logstash-fileinfo-2021.10.08
    2021-10-13 14:18:11,742 INFO ---deleting index logstash-http-2021.10.08
    2021-10-13 14:18:11,742 INFO ---deleting index logstash-snmp-2021.10.08
    2021-10-13 14:18:11,742 INFO ---deleting index logstash-flow-2021.10.09
    2021-10-13 14:18:11,742 INFO ---deleting index logstash-tftp-2021.10.10
    2021-10-13 14:18:11,743 INFO ---deleting index logstash-dns-2021.10.09
    2021-10-13 14:18:11,743 INFO ---deleting index logstash-sip-2021.10.09
    2021-10-13 14:18:11,743 INFO ---deleting index logstash-http-2021.10.09
    2021-10-13 14:18:11,743 INFO ---deleting index logstash-dhcp-2021.10.10
    2021-10-13 14:18:11,743 INFO ---deleting index logstash-2021.10.10
    2021-10-13 14:18:11,743 INFO ---deleting index logstash-anomaly-2021.10.10
    2021-10-13 14:18:11,743 INFO ---deleting index logstash-tftp-2021.10.09
    2021-10-13 14:20:11,758 CRITICAL Failed to complete action: delete_indices. <class 'curator.exceptions.FailedExecution'>: Exception encountered. Rerun with loglevel DEBUG and/or check Elasticsearch logs for more information. Exception: ConnectionTimeout caused by - ReadTimeout(HTTPConnectionPool(host='127.0.0.1', port=9200): Read timed out. (read timeout=30))

pevma commented 2 years ago

It seems it reached the default timeout for the ES read ("Read timed out"). Please run it again; there may have been a few indices left.

Linn1 commented 2 years ago

It shows:

    CRITICAL Failed to complete action: delete_indices. <class 'curator.exceptions.FailedExecution'>: Failed to get indices. Error: TransportError(503, 'master_not_discovered_exception', None)

Linn1 commented 2 years ago

I don't know if it is relevant to running rm -rf *** (*** is the biggest index of ES, located under /var/lib/elasticsearch/nodes/0/indices/). I just wanted to free some space, because the disk is 100% used.
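The master_not_discovered_exception is consistent with index data having been removed on disk underneath a running node. A few standard endpoints to see what the node itself reports (a diagnostic sketch only, assuming ES on localhost:9200):

    # Cluster status, master election, unassigned shards
    curl -s "localhost:9200/_cluster/health?pretty"
    curl -s "localhost:9200/_cat/nodes?v"
    # Why shards are unassigned, if any
    curl -s "localhost:9200/_cluster/allocation/explain?pretty"
    # The ES server log for the underlying error
    journalctl -u elasticsearch.service --since "1 hour ago" | tail -n 50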

pevma commented 2 years ago

Seems ES is not online; what does the status of the process return? rm is not a recommended way to delete ES indices, though it will delete them :)

Linn1 commented 2 years ago

The status of ES is active (running). Here is the output:

root@selks:~# systemctl status elasticsearch
● elasticsearch.service - Elasticsearch
   Loaded: loaded (/lib/systemd/system/elasticsearch.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2021-09-29 17:41:58 CST; 1 weeks 6 days ago
     Docs: https://www.elastic.co
 Main PID: 1352 (java)
    Tasks: 520 (limit: 9830)
   Memory: 2.6G
   CGroup: /system.slice/elasticsearch.service
           ├─1352 /usr/share/elasticsearch/jdk/bin/java -Xshare:auto -Des.networkaddress.cache.ttl=60 -Des.networkaddress.cache.nega
           └─1608 /usr/share/elasticsearch/modules/x-pack-ml/platform/linux-x86_64/bin/controller

Oct 13 15:44:38 selks systemd-entrypoint[1352]:         at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:286) ~[?:?]
Oct 13 15:44:38 selks systemd-entrypoint[1352]:         at java.nio.channels.Channels.writeFullyImpl(Channels.java:74) ~[?:?]
Oct 13 15:44:38 selks systemd-entrypoint[1352]:         at java.nio.channels.Channels.writeFully(Channels.java:97) ~[?:?]
Oct 13 15:44:38 selks systemd-entrypoint[1352]:         at java.nio.channels.Channels$1.write(Channels.java:172) ~[?:?]
Oct 13 15:44:38 selks systemd-entrypoint[1352]:         at org.apache.lucene.store.FSDirectory$FSIndexOutput$1.write(FSDirectory.ja
Oct 13 15:44:38 selks systemd-entrypoint[1352]:         at java.util.zip.CheckedOutputStream.write(CheckedOutputStream.java:73) ~[?
Oct 13 15:44:38 selks systemd-entrypoint[1352]:         at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:81) ~
Oct 13 15:44:38 selks systemd-entrypoint[1352]:         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:142) ~[?:?]
Oct 13 15:44:38 selks systemd-entrypoint[1352]:         at org.apache.lucene.store.OutputStreamIndexOutput.getChecksum(OutputStream
Oct 13 15:44:38 selks systemd-entrypoint[1352]:         at org.apache.lucene.codecs.CodecUtil.writeCRC(CodecUtil.java:548) ~[lucene

I see. I won't do it again. There was nothing else I could do except rm some files that seemed useless, to free some space. Thanks so much for your patience.

Linn1 commented 2 years ago

I ran selks-db-logs-clean-up and it didn't work. The space is 100% used, and Elasticsearch stopped. I tried to start it, but it didn't work. There is no space left for the application to run! I don't know how to fix it!

pevma commented 2 years ago

Ideally, it would be best to calculate your usage per day against the disk you have, and make sure you either clean up via the cronjob more often or add more space/disks.

Otherwise, it is probably best to:

1 - stop the Suricata process
2 - remove the older information or indices
3 - adjust ES to come back out of read-only mode:

    curl -X PUT "localhost:9200/_all/_settings" -H 'Content-Type: application/json' -d'{ "index.blocks.read_only_allow_delete" : null }'

4 - restart ES
5 - start Suricata

(See the sketch of the full sequence below.)
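Putting those steps together, a minimal sketch of the recovery sequence (the systemd unit names and the example index date are assumptions to adapt to your system):

    # 1 - stop Suricata so it stops writing new events
    systemctl stop suricata.service
    # 2 - remove the oldest indices through the API, not with rm
    curl -X DELETE "localhost:9200/logstash-*-2021.10.08"   # example date
    # 3 - clear the read-only block ES sets when the disk fills up
    curl -X PUT "localhost:9200/_all/_settings" \
      -H 'Content-Type: application/json' \
      -d'{ "index.blocks.read_only_allow_delete" : null }'
    # 4 - restart ES
    systemctl restart elasticsearch.service
    # 5 - start Suricata again
    systemctl start suricata.service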