StamusNetworks / SELKS

A Suricata based IDS/IPS/NSM distro
https://www.stamus-networks.com/open-source/#selks
GNU General Public License v3.0

Suricata DATAREP engine #295

Open ManuelFFF opened 3 years ago

ManuelFFF commented 3 years ago

Hi,

After checking the official documentation, I have a few questions about the DATAREP engine:

  1. Do I need to define a datarep file the same way I do for a dataset file, in the Suricata config file or inside the rules, or can I just create an empty file and fill it with one item to block per line?
  2. Do I need to encrypt the whole file to base64, or just each item, or does a datarep file not need encoding at all? I will be using the string type for my lists.
  3. Where should I place the datarep file dns_string, or how do I tell the rule where to find it, for use in a rule like the following example:

alert dns any any -> any any (dns.query; datarep:dns_string, >, 200, load dns_string.rep, type string; sid:3;)

Thank you in advance

pevma commented 3 years ago

You can do it on the fly inside the rule, as explained here: https://suricata.readthedocs.io/en/suricata-6.0.1/rules/datasets.html?highlight=dns%20rep#datarep

If you only need to match on domains, you can simply use datasets as well.

ManuelFFF commented 3 years ago

I tried to set up a dataset a couple of weeks ago and it did not go well. That's why I gave up on datasets, but with some help I am willing to try any number of times. After checking the official documentation, I still have a few questions about the DATAREP engine. If you could help me understand it better, I should be able to move forward.

  1. Does datarep rely on datasets?
  2. Can't I use datarep if I don't create a dataset first?
  3. Do I need to encrypt the whole file to base64, or just each item, or does a datarep file not need encoding at all? I will be using the string type for my lists.

Thank you in advance

ManuelFFF commented 3 years ago

With your help I am willing to try any number of times with DATASET or DATAREP. Just need a small push.

pevma commented 3 years ago

1 - No, they are different and can function independently.
3 - The purpose of base64 is not encryption but the handling of special characters in URLs. Yes, for datasets it needs to be one entry per line in base64 format. So if you have a list of domains, one domain per line, convert each of them to base64, one per line (no white space, new lines, etc.), and that should do it.

ManuelFFF commented 3 years ago

I should have used "encoding" instead of "encryption", but you got the idea ... :P

Thank you for the info. I will try one more time

ManuelFFF commented 3 years ago

  1. Do I have to use only one datarep.rep file, or can I use several datarep files, like the iprep files below?

# IP Reputation
reputation-categories-file: /etc/suricata/rules/scirius-categories.txt
default-reputation-path: /etc/suricata/rules/iprep/
reputation-files:
 - scirius-iprep.list
# - test-iprep.list
 - test-iprepv4.list
 - test-iprepv6.list

  2. Where should I put these files so Suricata can see and load them? Anywhere within /etc/suricata/rules/? Can I use one folder for iprep files and a different folder for datarep files?
  3. Is there a section or namespace in /etc/suricata/selks6-addin.yaml to declare such a location, or do I not need that?

ManuelFFF commented 3 years ago

Update:

I have found the answers to some of the questions by myself. Please correct me if I'm wrong, as I am trying to understand this tool better.

  1. I can use several datarep.rep files, although there is no specific section to place or define the datarep, like I do with iprep in /etc/suricata/selks6-addin.yaml
  2. I can put the datarep files wherever I want and specify the location within the rule body

alert dns any any -> any any (msg:"TEST Bad Reputation Domain"; dns.query; datarep:test-datarep64,>,99, load /etc/suricata/rules/iprep/test-datarep64.rep, type string; sid:16; rev:1;)

Notes:

[9859] 26/2/2021 -- 14:07:11 - (datasets.c:298) <Config> (DatasetLoadString) -- dataset: test-datarep64 loading from '/etc/suricata/rules/iprep/test-datarep64.rep'
[9859] 26/2/2021 -- 14:07:11 - (datasets.c:365) <Config> (DatasetLoadString) -- dataset: test-datarep64 loaded 4 records
[9859] 26/2/2021 -- 14:07:11 - (detect-engine-loader.c:355) <Info> (SigLoadSignatures) -- 1 rule files processed. 23988 rules successfully loaded, 0 rules failed

Original file

maxmind.com,100
www.maxmind.com,100
debian.org,100
www.debian.org,100

Encoded file (the file itself is not encoded to base64 as a whole; each line of its content is encoded, one at a time)

bWF4bWluZC5jb20sMTAwCg==
d3d3Lm1heG1pbmQuY29tLDEwMAo=
ZGViaWFuLm9yZywxMDAK
d3d3LmRlYmlhbi5vcmcsMTAwCg==

Result: The sites are not being blocked when trying from a machine behind Suricata.

Please help to understand and find what could be failing here.

Thank you
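A possible cause, sketched under an assumption: if the datarep file follows the `<data>,<value>` layout described in the Suricata datasets documentation, with only the string data base64-encoded, then the lines above encode too much, because the domain, the comma, the reputation value and a trailing newline were all encoded together. The helper name `datarep_line` below is hypothetical; it encodes only the domain and appends the reputation in plain text.

```shell
#!/bin/sh
# Hypothetical helper: build one datarep line as "<base64 of domain>,<reputation>".
# Assumption: only the string value is base64-encoded; the reputation stays plain.
datarep_line() {
    domain=$1
    reputation=$2
    # printf '%s' writes the domain without a trailing newline.
    encoded=$(printf '%s' "$domain" | base64)
    printf '%s,%s\n' "$encoded" "$reputation"
}

datarep_line "maxmind.com" 100
datarep_line "www.maxmind.com" 100
```

Whether Suricata accepts this layout for `type string` should be confirmed against the documentation for the installed version.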

pevma commented 3 years ago

Your dataset file should be just (then convert it to base64 of course)

maxmind.com   
www.maxmind.com   
debian.org   
www.debian.org   

ManuelFFF commented 3 years ago

The file content is already encoded to base64, as shown in my previous post:

maxmind.com,100   <=>   bWF4bWluZC5jb20sMTAwCg==
www.maxmind.com,100   <=>   d3d3Lm1heG1pbmQuY29tLDEwMAo=
debian.org,100   <=>   ZGViaWFuLm9yZywxMDAK
www.debian.org,100 <=>   d3d3LmRlYmlhbi5vcmcsMTAwCg==

To check, I decoded each line one by one, and each returned the readable domain.

The procedure was to encode each line to base64, then add it to the file test-datarep64.rep. After that Suricata detects 4 records, as shown in the log excerpt in my previous post ("... dataset: test-datarep64 loaded 4 records ...")

Just in case, I also tried a different procedure: added all lines to the file (without base64 encoding), then encoded the whole file as a unit.

base64 /etc/suricata/rules/iprep/test-datarep.rep >> /etc/suricata/rules/iprep/test-datarep64.rep

After that Suricata only detects 2 records when loading datarep file test-datarep64.rep
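The two records are consistent with line wrapping rather than with two entries: GNU `base64` wraps its output at 76 characters by default, and the four plain-text lines total 70 bytes, which encode to 96 characters. A minimal sketch of this, assuming GNU coreutils and using an in-memory copy of the file content instead of the real path:

```shell
#!/bin/sh
# The four test entries as one block of text (70 bytes in total).
content='maxmind.com,100
www.maxmind.com,100
debian.org,100
www.debian.org,100
'
# Encoding everything as a unit yields a single base64 stream,
# which base64 then wraps at 76 columns.
encoded=$(printf '%s' "$content" | base64)
line_count=$(printf '%s\n' "$encoded" | wc -l | tr -d ' ')
echo "lines in encoded output: $line_count"
```

Neither of the resulting lines corresponds to a single domain, so this method cannot produce a usable list.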

ManuelFFF commented 3 years ago

Checking your response again, I need to double-check a few things with you:

maxmind.com,100

Then the rule compares:

datarep:tss-datarep64,>,99

and drops, because the value assigned to the line is above 99 and therefore matches.

If I just enter lines without a numeric reputation value (100) in the datarep file, how is the rule going to perform the comparison later?

pevma commented 3 years ago

I think you are mixing things up a bit. If you use datasets, you only need to specify the domain in the dataset file (just like that - https://github.com/StamusNetworks/SELKS/issues/295#issuecomment-786906518 ).

You only need to convert to base64 if you are using datasets.

I would suggest we start with dataset domain matching and leave the datarep aside, just until you set up the datasets and have them up and running.

ManuelFFF commented 3 years ago

I apologize for not responding sooner. I have been very busy working on several projects at once.

I was afraid of that. I could be misunderstanding the documentation. I don't have any problem with trying datasets first. In fact, if I can get datasets working properly, I could use the same engine for both goals: blocking large numbers of IPs and domains from several blacklists (or datasets). I would not need IPREP anymore. It is working great, but if I can have just one engine instead of two doing all the work...

So, starting with datasets, here are some notes to consider from my previous attempt:

This (encoded whole file as a unit):

bWF4bWluZC5jb20gICAKd3d3Lm1heG1pbmQuY29tICAgCmRlYmlhbi5vcmcgICAKd3d3LmRlYmlhbi5vcmcg

instead of this (encoded individual lines):

bWF4bWluZC5jb20gICA=
d3d3Lm1heG1pbmQuY29tICAg
ZGViaWFuLm9yZyAgIA==
d3d3LmRlYmlhbi5vcmcg

I'll be using a rule like this (please correct me if you see something wrong):

drop dns any any -> any any (msg:"TEST Known Bad Domain"; dns.query; dataset:isset, test-datasetDNS64, type string, load /etc/suricata/rules/dataset/test-datasetDNS64.lst; sid:17; rev:1;)

Question: Do I need to include both versions maxmind.com and also www.maxmind.com to ensure the domain is blocked or just maxmind.com would be sufficient?

ManuelFFF commented 3 years ago

Update: (please check previous post first)

maxmind.com   
www.maxmind.com   
debian.org   
www.debian.org

$ base64 /etc/suricata/rules/dataset/test-datasetDNS.lst >> /etc/suricata/rules/dataset/test-datasetDNS64.lst

drop dns any any -> any any (msg:"TEST Known Bad Domain"; dns.query; dataset:isset, test-datasetDNS64, type string, load /etc/suricata/rules/dataset/test-datasetDNS64.lst; sid:17; rev:1;)

Source test failure:

    SC_ERR_FATAL: bad base64 encoding test-datarep//etc/suricata/rules/datarep/test-datarep.rep

ManuelFFF commented 3 years ago

Based on my two previous posts, do you think there is an issue with Suricata decoding base64 data?

pevma commented 3 years ago

OK, let's start with datasets first.

Which domains failed to block, and what are their corresponding base64 lines from the dataset file?

ManuelFFF commented 3 years ago

Well, before we start, please tell me if I am encoding properly. Which is the right method to use with datasets?

  1. Do I have to encode the whole file, as a unit?

$ base64 /etc/suricata/rules/dataset/test-datasetDNS.lst >> /etc/suricata/rules/dataset/test-datasetDNS64.lst

  2. Or do I have to encode each individual line within the file, but not the file itself?

for i in $(cat /var/local/Suricata-Feeds/DNS/test_dns); do
  echo "$i" | base64 >> /etc/suricata/rules/dataset/test-datasetDNS64v2.lst
done

As for your previous question, I have been trying (with no success) with the four domains within the test dataset

maxmind.com   
www.maxmind.com   
debian.org   
www.debian.org

pevma commented 3 years ago

Can you please share the rules file and the dataset file?

ManuelFFF commented 3 years ago

rules.txt

I had to change the extension of the next file from .lst to .txt because of upload restrictions: test-datasetDNS64.txt

ManuelFFF commented 3 years ago

Were you able to find anything wrong with my rules or dataset file?

pevma commented 3 years ago

The base64 contents currently are:

bWF4bWluZC5jb20Kd3d3Lm1heG1pbmQuY29tCmRlYmlhbi5vcmcKd3d3LmRlYmlhbi5vcmcK

They should instead be:

bWF4bWluZC5jb20=
d3d3Lm1heG1pbmQuY29t
ZGViaWFuLm9yZw==
d3d3LmRlYmlhbi5vcmc=

ManuelFFF commented 3 years ago

That's what I thought in the first place. That way makes more sense to me. So it is clear now that I should be using the second method to encode the dataset. I need to encode the individual lines within the file instead of the file itself:

  2. Do I have to encode each individual line within the file, but not the file itself

for i in $(cat /var/local/Suricata-Feeds/DNS/test_dns); do
  echo "$i" | base64 >> /etc/suricata/rules/dataset/test-datasetDNS64v2.lst
done

It's good to know the way to go. Now, what should I try next? Neither of the encoding methods is currently working. I mean, I have been encoding each line (as described here) and yet Suricata is not blocking access.

ManuelFFF commented 3 years ago

I am ready to keep trying and troubleshooting. Thanks

pevma commented 3 years ago

Yes, just each individual line (make sure there are no new lines encoded and the result is clean)

ManuelFFF commented 3 years ago

I already went through that path, but let's try one more time:

Content of original test file /var/local/Suricata-Feeds/DNS/test_dns:

maxmind.com
www.maxmind.com
debian.org
www.debian.org

How I encoded the above file to base64:

for i in $(cat /var/local/Suricata-Feeds/DNS/test_dns); do
  echo "$i" | base64 >> /etc/suricata/rules/dataset/test-datasetDNS64v2.lst
done

Content of dataset file /etc/suricata/rules/dataset/test-datasetDNS64v2.lst after encoding:

bWF4bWluZC5jb20K
d3d3Lm1heG1pbmQuY29tCg==
ZGViaWFuLm9yZwo=
d3d3LmRlYmlhbi5vcmcK

Rule matching dataset:

alert dns any any -> any any (msg:"TEST Known Bad Domain"; dns.query; dataset:isset, test-datasetDNS64v2, type string, load /etc/suricata/rules/dataset/test-datasetDNS64v2.lst; sid:17; rev:1;)

The same rules file also contains the other IPREP rules, and the same .tar.gz file contains the categories file. Of course, the categories file is used only by the IPREP rules.

What should I try next?

ManuelFFF commented 3 years ago

Do you think there may be a bug in here, or perhaps I'm missing or doing something wrong?

ManuelFFF commented 3 years ago

Hi. I am ready to try new things to narrow this issue. Thanks

pevma commented 3 years ago

Something is off with the encoding. This is how the content from https://github.com/StamusNetworks/SELKS/issues/295#issuecomment-799420540 decodes:

maxmind.com
www.maxmind.com
FV&&pܹɜ
ManuelFFF commented 3 years ago

I suspected that might be happening. I have tested the encoding and decoding with some local and online tools, and I can confirm the system is encoding properly. The issue seems to be Suricata or the dataset engine not decoding properly. I hope you can locate the root cause and fix it.

Please let me know if there is a patch or anything else you want me to try.

Thank you

pevma commented 3 years ago

I think the base64 conversion is not producing the expected result. Suricata simply base64-decodes the provided file, and this:

bWF4bWluZC5jb20K
d3d3Lm1heG1pbmQuY29tCg==
ZGViaWFuLm9yZwo=
d3d3LmRlYmlhbi5vcmcK

translates to that

maxmind.com
www.maxmind.com
�FV&�&pܹɜ

before Suricata starts even matching.

ManuelFFF commented 3 years ago

I see. Well, as I shared before, this is how I'm encoding the data to base64:

for i in $(cat /var/local/Suricata-Feeds/DNS/test_dns); do
  echo "$i" | base64 >> /etc/suricata/rules/dataset/test-datasetDNS64v2.lst
done

Result:

bWF4bWluZC5jb20K
d3d3Lm1heG1pbmQuY29tCg==
ZGViaWFuLm9yZwo=
d3d3LmRlYmlhbi5vcmcK

I tested the results and this is what I found:

  1. If I decode each line individually, the result is correct:

- bWF4bWluZC5jb20K -> maxmind.com
- d3d3Lm1heG1pbmQuY29tCg== -> www.maxmind.com
- ZGViaWFuLm9yZwo= -> debian.org
- d3d3LmRlYmlhbi5vcmcK -> www.debian.org

  2. If I try to decode all four lines at once, I get the error "Malformed input..."

With my script I am reading one line at a time, then encoding the string, then writing the string to the file. Each individual line seems to be encoded properly. I am not sure what is going on here.

Any advice?
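One caveat about the check above: a decoded entry printed to a terminal looks the same whether or not it ends in a newline or a space, so each line can appear correct while still carrying extra bytes. A minimal sketch that counts those bytes per entry follows. The helper name `check_dataset` and the temporary file are hypothetical, and `base64 -d` assumes GNU coreutils.

```shell
#!/bin/sh
# Hypothetical helper: decode each line of a dataset file separately and
# report entries whose decoded bytes contain a space, tab, CR or LF.
check_dataset() {
    rc=0
    while IFS= read -r line || [ -n "$line" ]; do
        [ -z "$line" ] && continue
        # Keep only whitespace bytes from the decoded entry and count them.
        extra=$(printf '%s' "$line" | base64 -d | tr -cd ' \t\r\n' | wc -c | tr -d ' ')
        if [ "$extra" -ne 0 ]; then
            echo "unclean entry: $line ($extra whitespace byte(s))"
            rc=1
        fi
    done < "$1"
    return $rc
}

# Demo with one entry that includes a newline and one that does not.
tmp=$(mktemp)
printf '%s\n' 'ZGViaWFuLm9yZwo=' 'ZGViaWFuLm9yZw==' > "$tmp"
check_dataset "$tmp" || echo "dataset needs cleaning"
rm -f "$tmp"
```

Pointing the helper at /etc/suricata/rules/dataset/test-datasetDNS64v2.lst would show whether the entries decode to the bare domain only.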

ManuelFFF commented 3 years ago

Update

I tried something different. I did not use the script this time.

for i in $(cat /etc/suricata/rules/dataset/test-datasetDNS64v2.lst); do
  echo "$i" | base64 -d >> /etc/suricata/rules/dataset/test-datasetDNSdecoded.lst
done

Result:

maxmind.comwww.maxmind.comdebian.orgwww.debian.org

Note: I am trying to have a base64 encoded file, where each line is a domain to block by Suricata. This is how I understood that needs to be done. If I misinterpreted it and I am not doing it the right way, please correct me or tell me how you would do it.

Thanks

ManuelFFF commented 3 years ago

I have been doing more research on this. Does Suricata decode each line of the base64 file separately, or does it attempt to read the whole file as a single string?

The way I'm encoding now, if I decode the file one line at a time, I get back exactly the original file, with the same structure of one domain per line. But if I try to decode the whole file at once, I get the error "Malformed input...".

By default base64 wraps its output every 76 characters. I could try to force a number of characters per line, but my lines will have many different lengths, and perhaps that affects the structure of the file in a way base64 decoders cannot read?

I tried to keep it simple, but maybe I'm not using the best way to encode the file while keeping one encoded line per domain.

Here is another combination:

echo -n $(cat /var/local/Suricata-Feeds/DNS/test_dns) | base64 >> /etc/suricata/rules/dataset/test-datasetDNS64v2.lst

Result:

bWF4bWluZC5jb20gd3d3Lm1heG1pbmQuY29tIGRlYmlhbi5vcmcgd3d3LmRlYmlhbi5vcmc=

Decoding the above string:

maxmind.com www.maxmind.com debian.org www.debian.org

Could you share how you would do it?

Thanks
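On the wrapping concern: 76 output characters correspond to 57 input bytes, so with per-entry encoding only a domain longer than that would be split across two lines. With GNU coreutils, `base64 -w 0` disables wrapping. A minimal sketch, using a made-up hostname that is long enough to trigger the wrap:

```shell
#!/bin/sh
# Made-up hostname (75 bytes), used only to trigger wrapping.
long_name="a-very-long-subdomain-label-used-for-illustration.another-label.example.com"

# Default behaviour: output is wrapped at 76 columns.
wrapped=$(printf '%s' "$long_name" | base64)
# GNU coreutils only: -w 0 keeps the whole entry on one line.
single=$(printf '%s' "$long_name" | base64 -w 0)

echo "default output lines: $(printf '%s\n' "$wrapped" | wc -l | tr -d ' ')"
echo "with -w 0:            $(printf '%s\n' "$single" | wc -l | tr -d ' ')"
```

The four test domains in this thread are all shorter than 57 bytes, so wrapping does not affect them.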

ManuelFFF commented 3 years ago

I think I resolved it. I have found a working combination:

for i in $(cat /var/local/Suricata-Feeds/DNS/tss_dns); do echo -n "$i" | base64 >> /etc/suricata/rules/dataset/tss-datasetDNS64v2.lst; done

What changed?

I added the -n option so that echo no longer appends a trailing newline. Only the domain itself is then passed to the base64 command and encoded.

When I attempt to decode the whole file at once, it keeps saying "Malformed input...", and if I decode each line separately, I get back exactly the original file, with the same structure of one domain per line. Nothing else changed, except that Suricata is now able to decode properly: it detects the domains in the dataset and blocks the traffic.

What do you think?
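A likely explanation for why this combination works: by default `echo` appends a newline, so `echo "$i" | base64` encodes the domain plus a trailing `\n`, while `-n` suppresses that newline. A minimal sketch comparing the two encodings, using `printf '%s'` as a portable way to write a string without a newline:

```shell
#!/bin/sh
domain="debian.org"

# echo appends "\n", so the newline becomes part of the encoded value.
with_newline=$(echo "$domain" | base64)
# printf '%s' writes the string only, with no trailing newline.
without_newline=$(printf '%s' "$domain" | base64)

echo "echo:   $with_newline"
echo "printf: $without_newline"
```

The first value matches the entry in the earlier, non-working dataset file; the second matches the entry pevma posted as the expected content.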

pevma commented 3 years ago

I think you should strip any spaces and new lines first, then write it to the file. Would that work?

ManuelFFF commented 3 years ago

I thought you told me the file structure should be one domain per line and each line encoded to base 64. If I have 50 domains, that would be 50 lines, and each line encoded individually before added to the dataset. If I got it wrong, please show me the right way.

Thanks

pevma commented 3 years ago

Yes, that is what I meant:
1 - take domain.com ...hidden spaces new line...
2 - strip spaces and new lines to get domain.com
3 - convert to b64
4 - write to the dataset file.
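The four steps above can be sketched as follows. The function name `encode_dataset` and the temporary files are hypothetical. The sketch strips spaces, tabs and carriage returns from each entry, encodes it without a trailing newline, and writes the output with `>` so that repeated runs do not append duplicate entries the way `>>` would.

```shell
#!/bin/sh
# Hypothetical helper: read plain entries from $1, write base64 entries to $2.
encode_dataset() {
    while IFS= read -r entry || [ -n "$entry" ]; do
        # Step 2: strip spaces, tabs and carriage returns.
        clean=$(printf '%s' "$entry" | tr -d ' \t\r')
        [ -z "$clean" ] && continue
        # Steps 3 and 4: encode without a trailing newline, undo any line
        # wrapping, and emit exactly one line per entry.
        printf '%s' "$clean" | base64 | tr -d '\n'
        printf '\n'
    done < "$1" > "$2"
}

# Demo with temporary files; the first entry has trailing spaces.
src=$(mktemp)
dst=$(mktemp)
printf 'maxmind.com   \nwww.maxmind.com\n' > "$src"
encode_dataset "$src" "$dst"
cat "$dst"
rm -f "$src" "$dst"
```

Reading with `while read` instead of `for i in $(cat ...)` keeps each input line as one entry even if it contains unexpected characters.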

ManuelFFF commented 3 years ago

OK, it's good to know hehehe.... I think that's what I have been doing with the code below, being careful that there are no hidden spaces in test_dns before encoding:

for i in $(cat /var/local/Suricata-Feeds/DNS/test_dns); do
  echo "$i" | base64 >> /etc/suricata/rules/dataset/test-datasetDNS64v2.lst
done

pevma commented 3 years ago

Your dataset file should look like that:

bWF4bWluZC5jb20=
d3d3Lm1heG1pbmQuY29t
d3d3LmRlYmlhbi5vcmc=
ZGViaWFuLm9yZw==

Can you try it?

ManuelFFF commented 3 years ago

It looks like it works, but how can I automate the encoding process to have it the same way? Using my script below is also encoding in a way that Suricata can use it.

for i in $(cat /var/local/Suricata-Feeds/DNS/tss_dns); do echo -n "$i" | base64 >> /etc/suricata/rules/dataset/tss-datasetDNS64v2.lst; done

pevma commented 3 years ago

Try removing -n ?