adobe / obfuscation-detection

Apache License 2.0

Considerations for base64 arguments #1

Open pooki3bear opened 2 years ago

pooki3bear commented 2 years ago

Hi, I just saw this awesome project!

My first thought: "How will this handle expected base64 arguments for programs like Chrome or Nvidia?"

Suggestion: Base64-decode arguments before vectorization. Where the decoded value is binary, it could be represented as a string of its valid ASCII bytes. This would most likely also improve model accuracy for base64-encoded commands.
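A minimal sketch of the suggested preprocessing step, written independently of this project's code. The names `B64_TOKEN`, `decode_token`, and `expand_base64` are hypothetical, and the 16-character minimum token length is an assumption chosen for illustration. Each base64-looking token in a command line is decoded, and only its printable ASCII bytes are kept.

```python
import base64
import binascii
import re
import string

# Assumption: a run of 16+ base64-alphabet characters is a decode candidate.
B64_TOKEN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")
# Byte values considered "valid ASCII" for the decoded representation.
PRINTABLE = set(string.printable.encode("ascii"))


def decode_token(token):
    """Return the printable-ASCII view of a base64 token, or None if it does not decode."""
    if len(token) % 4 != 0:
        return None
    try:
        raw = base64.b64decode(token, validate=True)
    except (binascii.Error, ValueError):
        return None
    # Drop non-printable bytes (this also removes the NUL bytes of UTF-16LE text).
    text = bytes(b for b in raw if b in PRINTABLE).decode("ascii")
    return text or None


def expand_base64(command):
    """Replace each decodable base64 token in a command line with its decoded text."""
    return B64_TOKEN.sub(lambda m: decode_token(m.group(0)) or m.group(0), command)


if __name__ == "__main__":
    token = base64.b64encode(b"echo hello world").decode("ascii")
    print(expand_base64("sh -c " + token))
```

Because PowerShell's -EncodedCommand takes base64 of UTF-16LE text, dropping non-printable bytes also recovers the ASCII portion of such commands. A pattern this loose can also match long paths or identifiers that happen to be valid base64, so a real pipeline would likely need a stricter candidate test.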

Expected Behaviour

Google Chrome Helper (GPU) output label 0

Actual Behaviour

Google Chrome Helper (GPU) output label 1

Reproduce Scenario (including but not limited to)

Steps to Reproduce

1. Find the Chrome Helper (GPU) process and its arguments: ps -e | grep Chrome | grep GPU
2. Add the process string to the example code from the project.
3. Run the demo script; it outputs 1.

If I remove the base64 string from the submitted command, the model returns 0 as expected.

tiberiu44 commented 2 years ago

Hi @pooki3bear,

BASE64 is still a type of obfuscation, so the result should be 1. This is why we cannot return 0 for BASE64 strings. I'm guessing that Google Chrome is an exception for you and you don't want it highlighted; in that case, you should ignore it by whitelisting. I'm sure that if a PowerShell script runs a BASE64-encoded script, you will want to see it flagged.
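One way the whitelisting approach could look in practice is a thin wrapper around whatever classifier is in use. This is only a sketch: `ALLOWLIST_PATTERNS` and `label_with_allowlist` are hypothetical names, and `classify` stands for any callable that maps a command string to 0 or 1, not this project's actual API.

```python
import re

# Hypothetical allowlist: command lines matching any pattern are not highlighted.
ALLOWLIST_PATTERNS = [
    re.compile(r"Google Chrome Helper \(GPU\)"),
]


def label_with_allowlist(command, classify, patterns=ALLOWLIST_PATTERNS):
    """Return (label, suppressed).

    The classifier still runs, so the raw label stays available for auditing;
    `suppressed` tells the caller to skip alerting for known-good processes.
    """
    label = classify(command)
    suppressed = label == 1 and any(p.search(command) for p in patterns)
    return label, suppressed


if __name__ == "__main__":
    always_obfuscated = lambda command: 1  # stand-in classifier for the demo
    print(label_with_allowlist("Google Chrome Helper (GPU) --type=gpu-process", always_obfuscated))
```

Keeping the model's label separate from the suppression decision means the detector's answer ("this is obfuscated") is unchanged, while the alerting policy ("this process is expected") lives in the allowlist.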

Hope this helps

pooki3bear commented 2 years ago

Hi @tiberiu44,

In my limited experience, there are legitimate reasons for Windows admins to submit base64-encoded PowerShell scripts (for example, with WinRM-based tooling).

In the case of Invoke-Obfuscation, the CaP1TaLiZatIoN and character-frequency artifacts might be more valuable for detecting something that was intended to be hidden.
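The two artifacts mentioned here can be approximated with simple statistics. The sketch below is illustrative only and does not describe the features this project's model uses; `case_flip_ratio` and `char_entropy` are hypothetical helper names.

```python
import math
from collections import Counter


def case_flip_ratio(text):
    """Fraction of adjacent letter pairs whose case differs (non-letters are ignored)."""
    letters = [c for c in text if c.isalpha()]
    if len(letters) < 2:
        return 0.0
    flips = sum(a.isupper() != b.isupper() for a, b in zip(letters, letters[1:]))
    return flips / (len(letters) - 1)


def char_entropy(text):
    """Shannon entropy, in bits per character, of the character distribution."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    for sample in ("Capitalization", "CaP1TaLiZatIoN"):
        print(sample, round(case_flip_ratio(sample), 3), round(char_entropy(sample), 3))
```

Randomized capitalization pushes the case-flip ratio well above what ordinary identifiers produce, while the entropy measure captures an unusual character-frequency profile. Neither is a verdict on its own; they would be inputs to a model or a threshold chosen on real data.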

tiberiu44 commented 2 years ago

I agree with what you said. However, BASE64 also has illegitimate uses. In fact, obfuscation mechanisms in general serve one of two purposes: hiding malicious intent, or protecting intellectual property and sensitive data. This tool detects obfuscation, not obfuscation with malicious intent. If you are looking for malicious activity, you should focus on other indicators:

For this type of operation, we do provide other tools: