stratosphereips / StratosphereLinuxIPS

Slips, a free software behavioral Python intrusion prevention system (IDS/IPS) that uses machine learning to detect malicious behaviors in the network traffic. Stratosphere Laboratory, AIC, FEL, CVUT in Prague.

Add port information for all alerts in alerts.json #977

Closed maldwg closed 2 months ago

maldwg commented 2 months ago

Describe the bug

When analyzing any file or network traffic, most entries in alerts.json are missing the source and destination port. As far as I can tell, ports are currently included only for entries of category "Recon.Scanning", and not for any other category. The destination port, at least, is available, because alerts.log prints it in the alert message. So the problem does not seem to be that the port information is unknown, but that it is not written to alerts.json.

Example logs:

alerts.json

{"Format": "IDEA0", "ID": "777a1b25-45a7-432d-955f-450147169cd3", "DetectTime": "2024-08-30T10:58:38.796007+00:00", "EventTime": "2024-08-30T10:58:38.796025+00:00", "Category": ["Recon.Scanning"], "Confidence": 1, "Source": [{"IP4": ["192.168.2.16"], "Port": ["23"], "Proto": ["TCP"], "Type": ["Recon"]}], "Attach": [{"Content": "Horizontal port scan to port  23/TCP. From 192.168.2.16 to 50 unique destination IPs. Total packets sent: 245. Confidence: 1. by Slips threat level: high.", "ContentType": "text/plain"}], "ConnCount": 245, "uids": ["C3shpb1Vv2DLZ1PdN7", "CNjUUQ3KmYQwt2w5W8", "CiKLPi2l4l5UVb1Jrj", "CR1izvLdzMb1Jgwfj", "CpaDej2CEpWw4edUY2", "CNzz982egm3RoxPXRh", "CepWRwJVue4DhNwC1", "Cb29KR1AJmQD8bgGaj", "CnK23c3wmWt1gCBYJ3", "CEWtZK31inyW6XHZI9", "Ce6NFD2X0ZkcmJL1a3", "CHKxGv2c97kcewBuU3", "Ceha1n321Jq1UDtCYg", "CCHfkZTc8xGnHBTKh", "CZVE544zV4zvHyKeFa", "CbNOSYcDpm4ht9Snc", "Ck7tEJ2FIG4vXCAK61", "CEdEbYf5O6bnUFyzi", "CPEGLh4sy8RTjZl694", "CgEXjQ31uILwQtBa54", "CSAGhi2VDkN5vnN48a", "CeOsAi4cDvT1gdLXV9", "CRZhPN3ksDvpy059G1", "Cm5vUv1nmIhCXrTDcf", "CYNBRq2ouDkPPoWxO3", "CAIZWp3jkW02jpsTHc", "CJjv9RGxWhdDd4hJk", "Cfub3WZ05I8xAe3i8", "CdWA4x3rHvg1zQ9Ofi", "CIo1qa4y7rehHly4Y4", "CIHxCI3ewOtD61evN3", "C9ncMn3yh7GLNjWt14", "Cm4bJ42jizhSEgfzs4", "CfjUHYXhj6U6uHeH3", "CezR4j2FcPeF9UNm3e", "CSRpVQdvWgSC2T911", "Cvez981EJbw0qCS1Ol", "CIedNp1KNMkc6Rnvai", "CHXJvW1aQEQQRNzmw2", "CO0pTVgorfAZkSZt2", "CEg7do1NujzvbUXbsf", "C5bzWv1IijF2pPxDFl", "C4uTip4u2JGzIhwvx5", "CCZ15x12beba3ePb4", "CHCIYa4LnJNLDm3pm5", "CH9MGe3lfKm4Qm2ED4", "CwNeHSUrIZYERi7U9", "CF0W5F3ZFBtUnVt2D3", "CCMOxY3CrtTTOUByXi", "CUXrgD4srWScrqH002"], "accumulated_threat_level": 7.999999999999999, "timewindow": 1}
{"Format": "IDEA0", "ID": "c695fc69-3743-4f87-8742-5a080dd603a5", "DetectTime": "2024-08-30T10:58:46.257302+00:00", "EventTime": "2024-08-30T10:58:46.257316+00:00", "Category": ["Recon"], "Confidence": 1.0, "Source": [{"IP4": ["192.168.2.16"]}], "Target": [{"IP4": ["192.168.2.1"]}], "Attach": [{"Content": "Connecting to private IP: 192.168.2.1 on destination port: 67 threat level: info.", "ContentType": "text/plain"}], "ConnCount": 1, "uids": ["C4r9z41lV0wYA6O0S8"], "accumulated_threat_level": 7.999999999999999, "timewindow": 1}
{"Format": "IDEA0", "ID": "46dcbd0b-83e0-4c24-b999-92f3590c6c35", "DetectTime": "2024-08-30T10:58:48.294156+00:00", "EventTime": "2024-08-30T10:58:48.294168+00:00", "Category": ["Recon"], "Confidence": 1.0, "Source": [{"IP4": ["192.168.2.12"]}], "Target": [{"IP4": ["192.168.2.1"]}], "Attach": [{"Content": "Connecting to private IP: 192.168.2.1 on destination port: 67 threat level: info.", "ContentType": "text/plain"}], "ConnCount": 1, "uids": ["Cr7O7m1vZHluSJ8ti6"], "accumulated_threat_level": 0.4, "timewindow": 1}
{"Format": "IDEA0", "ID": "81665f73-b48a-48fa-8654-79f81eb25d41", "DetectTime": "2024-08-30T10:58:48.322338+00:00", "EventTime": "2024-08-30T10:58:48.322367+00:00", "Category": ["Anomaly.Traffic"], "Confidence": 1.0, "Source": [{"IP4": ["169.254.242.182"]}], "Target": [{"IP4": ["224.0.0.251"]}], "Attach": [{"Content": "A connection from a private IP (169.254.242.182) outside of the used local network 192.168.0.0/16. To IP: 224.0.0.251  threat level: low.", "ContentType": "text/plain"}], "ConnCount": 1, "uids": ["CcXD4y30wyUvY7IVsb"], "accumulated_threat_level": 0.2, "timewindow": 1}
{"Format": "IDEA0", "ID": "d754b22b-c238-45a3-93e3-852c7378efa0", "DetectTime": "2024-08-30T10:58:49.112013+00:00", "EventTime": "2024-08-30T10:58:49.112025+00:00", "Category": ["Information"], "Confidence": 0.8, "Source": [{"IP4": ["192.168.2.1"]}], "Target": [{"IP4": ["192.168.2.16"]}], "Attach": [{"Content": "SSH successful to IP 192.168.2.16. . From IP 192.168.2.1. Sent bytes: 29203. Detection model Slips. Confidence 0.8 threat level: info.", "ContentType": "text/plain"}], "ConnCount": 1, "uids": ["CF5N7412C9psBcflSk"], "accumulated_threat_level": 0, "timewindow": 1}
{"Format": "IDEA0", "ID": "ed714ee9-6673-4361-a13e-ba86ac1066d6", "DetectTime": "2024-08-30T10:58:51.129873+00:00", "EventTime": "2024-08-30T10:58:51.129888+00:00", "Category": ["Anomaly.Connection"], "Confidence": 0.8, "Source": [{"IP4": ["192.168.2.16"], "Type": ["Malware"]}], "Attach": [{"Content": "A connection without DNS resolution to IP: 212.96.160.147  AS: ITSELF Network and internet service provider., CZ AS12570 rDNS: server.janbosko.cz threat level: info.", "ContentType": "text/plain"}], "ConnCount": 1, "uids": ["CXTY98kGXw8Wc5XLa"], "accumulated_threat_level": 7.999999999999999, "timewindow": 1}

alerts.log

2018-03-09T20:57:44.781449+00:00 (TW 1): Src IP 192.168.2.16              . Detected Horizontal port scan to port  23/TCP. From 192.168.2.16 to 50 unique destination IPs. Total packets sent: 245. Confidence: 1. by Slips threat level: high. 
2018-03-09T20:54:02.285409+00:00 (TW 1): Src IP 192.168.2.16              . Detected Connecting to private IP: 192.168.2.1 on destination port: 67 threat level: info. 
2018-03-09T20:53:45.569594+00:00 (TW 1): Src IP 192.168.2.12              . Detected Connecting to private IP: 192.168.2.1 on destination port: 67 threat level: info. 
2018-03-09T20:55:51.299831+00:00 (TW 1): Src IP 169.254.242.182           . Detected A connection from a private IP (169.254.242.182) outside of the used local network 192.168.0.0/16. To IP: 224.0.0.251  threat level: low. 
2018-03-09T20:53:15.490027+00:00 (TW 1): Src IP 192.168.2.1               . Detected SSH successful to IP 192.168.2.16. . From IP 192.168.2.1. Sent bytes: 29203. Detection model Slips. Confidence 0.8 threat level: info. 
2018-03-09T20:49:54.304700+00:00 (TW 1): Src IP 192.168.2.16              . Detected A connection without DNS resolution to IP: 212.96.160.147  AS: ITSELF Network and internet service provider., CZ AS12570 rDNS: server.janbosko.cz threat level: info. 

As you can see, alerts.json shows a port only in the first entry, which has the category Recon.Scanning. alerts.log shows port information for other entries as well, but alerts.json does not, which makes the file rather difficult to use.

To Reproduce

Steps to reproduce the behavior:

  1. Use the latest dockerized version of Slips
  2. Set up the container as described in the README using docker run --rm -it --net=host --cap-add=NET_ADMIN --name slips stratosphereips/slips:latest
  3. Run a test analysis on any given file, e.g.: ./slips.py -f dataset/test7-malicious.pcap -o ./output_dir

Expected behavior

alerts.json and alerts.log should contain the same information about each request, starting with a common timestamp and including request attributes such as IP addresses and ports.

Additional context

If the behavior described above is intentional, this issue should be treated as a feature request; if not, it should remain a bug report. Either way, knowing the source and destination ports is important, so I classified it as a bug.

AlyaGomaa commented 2 months ago

Hello @maldwg, I completely agree with you. We've been discussing this with @eldraco for a while now, and we agreed to use a new format and add more information to the logged evidence/alerts. See #839 for more details.

maldwg commented 2 months ago

@AlyaGomaa OK, great, thank you for the information :) As I can see from https://github.com/stratosphereips/StratosphereLinuxIPS/issues/839 , you are planning to switch away from the IDEA0 format in the future if a good alternative proves more effective. However, do you have a suggestion, or maybe even an unofficial workaround, that I can use for now to gather all available information for one request (preferably using only automatable CLI commands)? My goal is to use Slips to analyze test datasets, and I need to assign the Slips alerts to my labeled test data. The timestamp together with the source and (if available) destination IP has sadly proven insufficient for that.

AlyaGomaa commented 2 months ago

So, there are 2 options if your test data is Zeek logs:

  1. You can map the uids of each evidence in alerts.json to the same uid in your dataset. That can easily be done with a Python script.
  2. Slips has a flow.sqlite db in the output directory of each run. This database has 2 tables called flows and altflows, and each table has a column called "label", so each flow is labeled either malicious or benign.
    • flows are labeled malicious if they're part of an alert (alert, not evidence)
    • and they are labeled benign otherwise

maldwg commented 2 months ago

Thank you for your quick reply again :)

I tried your approaches, but came across the following problems:

1: My test data is external data such as the CICIDS 2017 dataset, and thus not Zeek logs. If you are referring to the zeek_files in the output directory, I tried mapping the requests logged there to alerts.json, but as far as I can tell the uids are built differently. Or is there a way to transform an identifier like "uid":"CHnKXz3560z0KTTbRa" from a Zeek log into the format used in alerts.json, e.g. "ID": "1f7ff3a2-391b-4613-8b7e-89f29dfdb20a"?

2: For the second approach, I would be totally fine with just using the DB entries. The problem, however, is that every entry in these 2 tables is labeled benign, even though that probably shouldn't be the case (I am using the provided datasets, such as test7-malicious.pcap, for quick tests). I tried mapping the DB entries to the alerts.json file anyway, in the hope that this might be just a labeling issue. Nevertheless, most of the alerts from alerts.json could not be assigned to a DB entry (the IDs of the DB entries again differ from those in alerts.json, and mapping by timestamp, src-ip, and dst-ip was not successful either). So is it intended that every dataset yields only benign-labeled DB entries? And is it possible to map the entries to the alerts in another way?

And one last thing: you wrote

each evidence from alerts.json

I thought the entries in alerts.json were alerts and not evidence, or am I mistaken? If they are only evidence, this would mean that not every entry in that file can be regarded as malicious, and that the entries would need to be compared to the DB entries, right?

AlyaGomaa commented 2 months ago

Hi, so to break this down:

  1. I just checked the CICIDS 2017 dataset and saw that the traffic is PCAP, right? When you give Slips a PCAP, the generated Zeek logs are stored in output_dir/zeek_files, as you said. The uid of each entry in those Zeek log files can be mapped to the "uids" field of the alerts.json entries. For example, after running Slips on dataset/test7-malicious.pcap, I searched the generated zeek_files/ of that PCAP for the uid of a random evidence (from alerts.json), and the uid was found in conn.log and dns.log.
  2. If you want to use the sqlite db, we are using the same Zeek uids there as well.

Now, about the labels: if all flows in the database are labeled benign, it means Slips didn't generate an alert. It may have generated many evidence, but they were not enough for an alert, and only flows that were part of an alert are marked as malicious.
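To check how the flows of a run were labeled, the label column can be counted per table. The following is a minimal sketch, assuming the tables flows and altflows and the label column exist as described in this thread. count_labels is a hypothetical helper name, not part of Slips, and the caller has to supply a connection to the database file from their own output directory.

```python
import sqlite3

# Table names are taken from the thread. Table names cannot be bound as SQL
# parameters, so anything outside this allowlist is rejected.
KNOWN_TABLES = ("flows", "altflows")


def count_labels(conn: sqlite3.Connection, table: str) -> dict:
    """Return {label: number_of_rows} for one of the flow tables."""
    if table not in KNOWN_TABLES:
        raise ValueError(f"unexpected table name: {table!r}")
    cursor = conn.execute(
        f"SELECT label, COUNT(*) FROM {table} GROUP BY label"
    )
    return dict(cursor.fetchall())
```

A result that contains only a benign count would be consistent with a run that produced evidence but no alert.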

Also, no, not everything in alerts.json is an alert. I agree the name of the file may be a little misleading, but the file contains both alerts and evidence, not just alerts. This is an example of an alert in alerts.json:

{"Format": "Json", "ID": "4edfc67a-dd8c-4fc6-a3be-ac294521c166", "DetectTime": "2024-08-30T22:42:13.328747+03:00", "EventTime": "2024-08-30T22:42:13.328756+03:00", "Category": "Alert", "Confidence": 1.0, "Source": [{"IP4": ["192.168.1.129"], "Type": ["BlacklistedIP"]}], "Target": [{"IP4": ["122.248.252.67"]}], "Attach": [{"Content": "2021-06-06T18:32:28.297488+03:00: Src IP 192.168.1.129             . Generated an alert given enough evidence on timewindow 1. (real time 2024/08/30 22:42:13.368528)", "ContentType": "text/plain"}], "ConnCount": 1, "uids": [], "accumulated_threat_level": 12.1, "timewindow": 1, "profileid": "profile_192.168.1.129", "threat_level": 12.1}

If you run cat alerts.json | grep -i alert, you'll get all alerts, if there were any.
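When the entries need further processing, the same separation can be done in Python. This is a minimal sketch that assumes, based only on the example entry in this thread, that alerts carry the plain string "Alert" in "Category" while evidence carries a list of categories. split_alerts_and_evidence is a hypothetical helper name, not part of Slips.

```python
import json


def split_alerts_and_evidence(lines):
    """Split JSON-lines entries from alerts.json into (alerts, evidence)."""
    alerts, evidence = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        # Assumption: an alert has "Category": "Alert" (a string), while
        # evidence has a list such as ["Recon.Scanning"].
        if entry.get("Category") == "Alert":
            alerts.append(entry)
        else:
            evidence.append(entry)
    return alerts, evidence
```

It accepts any iterable of lines, so an open file handle for alerts.json can be passed directly.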

You can completely ignore the "ID" field of each evidence/alert in alerts.json. These are random identifiers generated by Slips, as required by the IDEA0 format, and you can't map them to Zeek logs.

maldwg commented 2 months ago

OK, sorry, now I know which uid you meant. Thank you so much for your extensive explanation! I was looking in totally the wrong place 😅
Now I can successfully assign any flow to an alert/evidence from alerts.json and get all the metadata. Thank you so much @AlyaGomaa!
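For readers who want the missing ports, the uid mapping discussed in this thread can be sketched roughly as follows. This is a sketch under assumptions, not the script used here: it assumes the Zeek conn.log in zeek_files is written as one JSON object per line and uses the standard Zeek conn.log field names uid, id.orig_p and id.resp_p. The function names are hypothetical.

```python
import json


def index_conn_log(lines):
    """Map each Zeek uid to (src_port, dst_port) from JSON conn.log lines."""
    ports = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and '#' header lines; TSV-formatted logs are not
        # handled by this sketch.
        if not line or line.startswith("#"):
            continue
        flow = json.loads(line)
        ports[flow["uid"]] = (flow.get("id.orig_p"), flow.get("id.resp_p"))
    return ports


def ports_for_entries(alert_lines, ports_by_uid):
    """Yield (entry ID, uid, src_port, dst_port) for uids found in conn.log."""
    for line in alert_lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        # The "uids" field holds the Zeek uids; the "ID" field is a random
        # identifier and is only carried along for reference.
        for uid in entry.get("uids", []):
            if uid in ports_by_uid:
                src_port, dst_port = ports_by_uid[uid]
                yield entry["ID"], uid, src_port, dst_port
```

Both functions take iterables of lines, so open file handles for zeek_files/conn.log and alerts.json in the output directory can be passed in. Uids that only appear in other logs such as dns.log are silently skipped.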

AlyaGomaa commented 2 months ago

Hey, no problem, glad you figured it out :))