Closed KRUXLEX closed 1 year ago
It should be storing as little information as possible to keep RAM usage to a minimum but there must be some oversight somewhere. I'll profile it when I get some time and fix the issue if I can replicate it.
Quick one, do you get insane RAM usage if you output in json (-j)?
Just checked. Yup, with json it's similar.
Okay, thanks for checking. I know exactly what the issue is. I'll try and get a solution in place this coming week.
Thanks. Actually, a workaround is a bash "for loop" :D
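For anyone else hitting this, a hedged sketch of that per-file loop (all paths are placeholders and the flags are the ones used elsewhere in this thread):

```sh
# Hedged sketch: invoke Chainsaw once per .evtx file so hits for only a single file
# are ever held in memory at one time. All paths below are placeholders.
for f in ./evtx/*.evtx; do
    ./chainsaw hunt "$f" -s ./sigma/rules --mapping ./mappings/sigma-event-logs-all.yml -j \
        > "out/$(basename "$f" .evtx).json"
done
```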
@KRUXLEX, if you are able to test out the fix/memory branch that would be great (you will need to build with cargo build --release). Unfortunately this fix is always going to trade some speed for space, but the slowdown should not be too bad. There are some CPU optimisations that can be done to get closer to the original performance, which I will try to do when I have time.
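For reference, testing that branch from an existing clone of the repository might look roughly like this (a hedged sketch; it assumes a Rust toolchain is installed and that the branch is available on the remote):

```sh
# Rough sketch: build and run the fix/memory branch from an existing checkout.
git fetch origin
git checkout fix/memory
cargo build --release
# Cargo places the optimised binary under target/release/
./target/release/chainsaw --help
```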
This is now in master.
It's still eating my RAM :)
Can you please give me an example of what you are running Chainsaw on? The only case I can think of where it will eat that much RAM is on a very noisy ruleset (default sigma and the all mapping file) on a very large set of data. This is because it currently stores all hits in memory for presentation at the end.
What I could do to handle these cases is add an option that asks Chainsaw to page the hits to disk. This will result in a slower run but will use far less memory.
A temporary workaround is to output the data as jsonl, as this will stop the caching for terminal presentation.
To summarise, lob me your use case and I will see what I can do.
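A hedged example of that workaround (all paths below are placeholders):

```sh
# Hedged example: jsonl output skips the cache used for terminal presentation.
./chainsaw hunt ./evtx -s ./sigma/rules --mapping ./mappings/sigma-event-logs-all.yml \
    --jsonl > hits.jsonl
```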
I think I've hit the same issue. Hunting on an old domain controller, 251 evtx files and a 3.9GB security event log. Default Sigma and all mapping file. Output to csv or jsonl doesn't make a difference.
That will be the issue: a 3.9GB file will explode in size when loaded into RAM, especially with the default sigma ruleset. And currently --jsonl only works at the per-file boundary rather than at each entry in a file.
I'll go back to the drawing board and see what I can come up with. The current thinking is to output --jsonl at the entry boundary, but again this would not solve the problem for normal output.
Note that you can work around the issue by using --from and --to to only analyze a subset of events.
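A hedged illustration of that (the exact timestamp format accepted by --from/--to is an assumption here, so check chainsaw hunt --help on your build):

```sh
# Hedged example: restrict the hunt to a time window so far fewer hits are held in memory.
# The timestamp format is an assumption; confirm it against `chainsaw hunt --help`.
./chainsaw hunt ./evtx -s ./sigma/rules --mapping ./mappings/sigma-event-logs-all.yml \
    --from "2023-01-01T00:00:00" --to "2023-01-02T00:00:00"
```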
Hello,
It is using 50% CPU and more than 1GB of RAM. I am running it at 5-minute intervals, matching events against 1302 Sigma rules. Is it possible to optimize the program so that it does not consume that many resources?
Right, so there are things we can do here, but they depend on users' use cases. My current thoughts on potential solutions:
- --page, which would store results to disk if a certain amount of RAM is exceeded; the caveat here would be speed, as paging to disk will slow Chainsaw down.
- Streaming output for json & jsonl.

I am happy to implement either or maybe both, but only if they match people's use cases, as it will take me a bit of time to get these features in.
Both seem like great solutions. The use case is that the program reads event logs from the past 5 minutes, matches them against Sigma rules every 5 minutes, and outputs the hits in JSON format. We would prefer stability of resources over speed.
Okay, let me see what I can do.
Right peeps, please give 2.7.0 a try using a combo of --cache-to-disk --jsonl. During my tests, RAM usage for a 350MB file went from 490MB down to 314MB. Base RAM usage is 240MB for the default sigma ruleset due to how pre-processing of that data is being done for now, as again speed was prioritised over space.
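A hedged example of that combination (paths are placeholders):

```sh
# Hedged example of the 2.7.0 flags: page hits to disk and stream jsonl output.
./chainsaw hunt ./evtx -s ./sigma/rules --mapping ./mappings/sigma-event-logs-all.yml \
    --cache-to-disk --jsonl > hits.jsonl
```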
Hello bro,
I cannot download the package. It is being flagged as a trojan. I can download older versions without issue.
This will be the case if you download the all zipped bundle, as it contains Sigma rules and poor AV engines will flag on them. The best way around this is to download the rules from Sigma and just get the binary for Chainsaw from the releases.
I did a test and I noticed that RAM usage is still high and CPU usage fluctuates. It reaches as high as 95%.
Thanks for testing, just some follow up questions. What command line arguments did you use? And what size 'evtx' did you run it on?
I launched the following command:
.\chainsaw.exe hunt C:\Windows\System32\winevt -s C:\rules --mapping C:\chainsaw\mappings\sigma-event-logs-all.yml --json --cache-to-disk
The total evtx size is 210MB.
Then I ran the command specifying a 5-minute timeframe and the CPU usage was around 50% with 1+GB of RAM.
I think it would also be nice if we were able to select a specific evtx file instead of a folder path, e.g. selecting only the 'Microsoft-Windows-Sysmon%4Operational.evtx' file, because many Sigma rules depend on Sysmon.
You need to use --jsonl, not --json, but Chainsaw should be preventing the invalid combinations anyway, so that is a bug that needs fixing.
Chainsaw already supports selecting individual files, are you sure the path you used there is valid?
I get the same results using jsonl; CPU and RAM usage are still high. I confirm the path I used is valid. If I remove the file name, it loads all evtx files within the folder, but specifying a specific evtx file within the folder does not work. I have not tried Chainsaw on Linux; my use case is mostly Windows.
Did you use single quotes? PowerShell is notoriously bad with string escapes; there are examples of this in other closed issues.
CPU usage will be high, it's a multi-threaded application (you can set it to single-threaded in the args), but with --cache-to-disk the RAM usage should be dropping down to around 500MB (when I tested on a 300MB sample). There is a chance that Windows is handling its memory differently to macOS and Linux. Were you still getting 1GB+ even with -c and --jsonl?
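For the single-file case, a hedged sketch based on the command shared earlier might look like this (the Logs subdirectory in the path is an assumption to verify on your host):

```powershell
# Hedged sketch: single-quote the path so PowerShell does not mangle it, and point
# Chainsaw at one .evtx file instead of the whole winevt directory.
# The exact log path is an assumption; verify where the Sysmon log lives on your system.
.\chainsaw.exe hunt 'C:\Windows\System32\winevt\Logs\Microsoft-Windows-Sysmon%4Operational.evtx' -s C:\rules --mapping C:\chainsaw\mappings\sigma-event-logs-all.yml --jsonl --cache-to-disk
```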
I do ok the error (i.e. it is silently swallowed), so there is also a chance that Windows is failing to create a temporary file. I will improve the debugging output for that too so that can be ruled out.
Using --num-threads reduced CPU usage by 50%. However, even with -c --jsonl the RAM usage is still the same (1GB+).
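For reference, the combination being tested here would look something like the following (hedged; the --num-threads value syntax is an assumption to confirm against chainsaw hunt --help):

```powershell
# Hedged sketch: single worker thread plus the low-memory flags.
# The `--num-threads 1` value syntax is an assumption; confirm it with `chainsaw hunt --help`.
.\chainsaw.exe hunt C:\Windows\System32\winevt -s C:\rules --mapping C:\chainsaw\mappings\sigma-event-logs-all.yml --num-threads 1 -c --jsonl
```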
Okay, so I have managed to optimise it a little further; that is present in v2.7.1. There is probably some more I can do with it, but I am waiting on a really large evtx to make optimisation easier.
RAM usage has decreased by 50%. It seems good.
In version 2.x I'm experiencing an out-of-memory issue. When I analyze more than 3-4 files it starts eating RAM. It's so hungry:
Issue reproduction:
Terminal monitoring during the hunt:
I think the problem is with detection reporting. It keeps the hits in memory and only pushes them to the file after all the analysis has finished. With one or 2-3 files this is fine, but if we analyze multiple files an out-of-memory condition is possible. I think it should push results to the file after each confirmed hunt.