Yamato-Security / hayabusa

Hayabusa (隼) is a sigma-based threat hunting and fast forensics timeline generator for Windows event logs.
GNU Affero General Public License v3.0

Other: print unique detections by rule ID instead of file path #1111

Closed YamatoSecurity closed 1 year ago

YamatoSecurity commented 1 year ago

@hitenkoku Can you take a look at this?

It seems that the number of unique detections is not correct. How to reproduce:

  1. Add RuleID: "%RuleID%" to config/default_profile.yaml
  2. Scan hayabusa-sample-evtx to a JSONL file:
     ./target/release/hayabusa json-timeline -d ../hayabusa-sample-evtx -L -u -D -o hayabusa-sample.jsonl -C -H hayabusa-sample.html

These are my results:

Total | Unique detections: 33,081 | 685
Total | Unique critical detections: 58 (0.18%) | 23 (3.36%)
Total | Unique high detections: 6,176 (18.67%) | 297 (43.36%)
Total | Unique medium detections: 2,153 (6.51%) | 222 (32.41%)
Total | Unique low detections: 6,531 (19.74%) | 87 (12.70%)
Total | Unique informational detections: 18,163 (54.90%) | 56 (8.18%)

However, when I check the total number of unique rule IDs with:

cat hayabusa-sample.jsonl | jq '.RuleID' -r | sort | uniq > unique-detections.txt
cat unique-detections.txt | wc -l

the result is 605, not 685.

I also checked the HTML report. For example, for critical alerts in the HTML report, there are only 21 unique rules but the Results Summary says there should be 23.

I checked the number of total detections, and those seem to be correct.

The total number of critical alerts is correct:

cat hayabusa-sample.jsonl | jq 'select (.Level == "crit") | .RuleID' | wc -l
      58

I checked the total detections for high, med, low, and info in the same way, and all of them are correct.
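
The per-level checks above can also be done in a single pass rather than one jq pipeline per level. A minimal Python sketch, assuming every JSONL line carries the `RuleID` and `Level` fields used in the commands above. The function name `summarize` and the sample records are illustrative only and are not part of Hayabusa:

```python
import json
from collections import Counter, defaultdict


def summarize(lines):
    """Return ({level: total detections}, {level: unique rule IDs}) for JSONL records."""
    totals = Counter()
    unique_ids = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        level = record["Level"]
        totals[level] += 1
        unique_ids[level].add(record["RuleID"])
    return dict(totals), {level: len(ids) for level, ids in unique_ids.items()}


# Made-up records for illustration; real input would be the lines of hayabusa-sample.jsonl.
sample = [
    '{"Level": "crit", "RuleID": "id-a"}',
    '{"Level": "crit", "RuleID": "id-a"}',
    '{"Level": "high", "RuleID": "id-b"}',
]
totals, uniques = summarize(sample)
print(totals, uniques)
```

To run it against the real output, pass an open file handle such as `open("hayabusa-sample.jsonl", encoding="utf-8")` as `lines`.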

hitenkoku commented 1 year ago

@YamatoSecurity The "Unique xxx detections" number is calculated from the file path of the rule file, not from the ID of the rule.

I will try to support aggregation based on the ID of the rule.
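
To illustrate why the two counts can diverge, here is a small Python sketch. It is not Hayabusa's actual code (which is written in Rust). It assumes one Sigma rule that exists as two rule files sharing a single ID; the paths and the ID below are made up:

```python
# Each detection is recorded as (rule file path, rule ID).
# The paths and the ID are hypothetical placeholders.
detections = [
    ("rules/builtin/example_proc_creation.yml", "00000000-0000-0000-0000-000000000001"),
    ("rules/sysmon/example_proc_creation.yml", "00000000-0000-0000-0000-000000000001"),
    ("rules/builtin/example_proc_creation.yml", "00000000-0000-0000-0000-000000000001"),
]

# Deduplicate on the path versus on the ID.
unique_by_path = len({path for path, _ in detections})
unique_by_id = len({rule_id for _, rule_id in detections})

print(unique_by_path)  # 2: each file is counted separately
print(unique_by_id)    # 1: both files share one rule ID
```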

YamatoSecurity commented 1 year ago

@hitenkoku Ah, I see. That makes sense. In a sense, process_creation and similar rules are two rules in one, so I thought it was better to count unique detections by rule path. However, it may be confusing when the numbers do not match the number of unique rule IDs in the results, and Sigma treats each of these as a single detection. So it is probably better to count unique detections by rule ID instead of rule path.

hitenkoku commented 1 year ago

I understand. In that case, the number of entries under "All xxx alerts" in the HTML file and the "Unique detections" number in the Results Summary will no longer match. Is this a problem?

I think it would be better to leave the HTML file as it is, since each entry links to its rule file so that the rule can be viewed.

YamatoSecurity commented 1 year ago

@hitenkoku Yes, I think the HTML report should be left as it is now, with links to the rule paths. However, it looks like the Results Summary unique detection number in the HTML is counted by rule path instead of ID? If so, can you change that number in the HTML to match what is now displayed on screen?

YamatoSecurity commented 1 year ago

@hitenkoku We should also count the following by rule ID instead of path as well:

Excluded rules: 27
Noisy rules: 12 (Disabled)

Deprecated rules: 169 (4.54%) (Disabled)
Experimental rules: 2001 (53.70%)
Stable rules: 230 (6.17%)
Test rules: 1495 (40.12%)
Unsupported rules: 46 (1.23%) (Disabled)

Hayabusa rules: 157
Sigma rules: 3569
Total enabled detection rules: 3726

This is a difficult decision, as more rules will actually be loaded than are shown here. However, most people treat process_creation rules, etc. as one rule even though they look at two different data sources (Security 4688 and Sysmon 1), so we should probably display the totals based on unique rule IDs.
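
As a rough illustration of counting loaded rules by unique ID, the Python sketch below tallies rules per status while counting each ID only once. The record layout (`path`, `id`, `status`), the function name, and the sample entries are all assumptions made for this example, not Hayabusa's internal data structures:

```python
from collections import Counter


def count_by_status(rules):
    """Count rules per status, counting each rule ID only once (first occurrence wins)."""
    seen = set()
    counts = Counter()
    for rule in rules:
        if rule["id"] in seen:
            continue  # same rule ID already counted from another file
        seen.add(rule["id"])
        counts[rule["status"]] += 1
    return counts


# Hypothetical loaded rules: one Sigma rule present as two files, plus one other rule.
loaded = [
    {"path": "builtin/example_a.yml", "id": "id-1", "status": "test"},
    {"path": "sysmon/example_a.yml", "id": "id-1", "status": "test"},
    {"path": "builtin/example_b.yml", "id": "id-2", "status": "stable"},
]
counts = count_by_status(loaded)
print(len(loaded), sum(counts.values()))  # 3 files loaded, 2 unique rule IDs
```

With this approach the displayed total is smaller than the number of rule files actually loaded, which is the trade-off described above.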

hitenkoku commented 1 year ago

@YamatoSecurity I have created a separate issue for counting the number of rules because that count uses different logic.

Could I address this issue in a separate pull request? #1113

YamatoSecurity commented 1 year ago

@hitenkoku Making a separate issue is a better idea. Thank you!