horsicq / die_script

MIT License

This engine could be expanded to be a malware scanner engine #1

Open dmknght opened 2 years ago

dmknght commented 2 years ago

I've been researching malware scanner engines a lot recently, and I found that this engine could be expanded into one. Some points in its favor:

  1. The only open-source malware scanner engine is ClamAV, which uses several text database formats (plus one bytecode format, basically compiled objects, which is not widely used). The problem with the text formats is slow database loading: everything is loaded into memory, which takes a long time and a lot of RAM (> 1 GB). ClamAV's bytecode signatures could solve the RAM issue, but their API is limited. This engine, on the other hand, already has quite a few useful APIs, similar to YARA's modules. The QtScript engine may be slower than the compiled COFF file approach, but it has many other advantages.
  2. The engine has a useful API for checking binary file metadata. It could be extended with things like section hashing and imphash (already in Detect It Easy). With the current API, I think I can demo my idea with this script: [image]
  3. To be a good malware scanner engine, I think pattern matching is required. Both YARA and ClamAV have a custom Aho-Corasick implementation and a custom signature syntax with regex support. This is a huge topic, so I'm only mentioning it here lul.
  4. Of course it also needs unpackers, parsers for other file formats, and more. Say die_script had a docx parser: we could write a simple script with calls like docx.hasMacro() and docx.findMacroStr("exec"). Sounds cool, right?
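
To make point 3 more concrete, here is a minimal sketch of Aho-Corasick multi-pattern matching over raw bytes. It is only an illustration of the general algorithm, not YARA's or ClamAV's implementation, and the class and method names (`AhoCorasick`, `search`) are made up for this example.

```python
from collections import deque


class AhoCorasick:
    """Illustrative multi-pattern byte matcher (hypothetical helper, not a real engine's API)."""

    def __init__(self, patterns):
        # Node 0 is the root. Each node has a transition table, a failure link,
        # and the ids of all patterns that end at this node.
        self.goto = [{}]
        self.fail = [0]
        self.out = [[]]
        self.patterns = [bytes(p) for p in patterns]
        for idx, pat in enumerate(self.patterns):
            self._insert(pat, idx)
        self._build_failure_links()

    def _insert(self, pat, idx):
        node = 0
        for b in pat:
            nxt = self.goto[node].get(b)
            if nxt is None:
                nxt = len(self.goto)
                self.goto.append({})
                self.fail.append(0)
                self.out.append([])
                self.goto[node][b] = nxt
            node = nxt
        self.out[node].append(idx)

    def _build_failure_links(self):
        # Breadth-first walk: depth-1 nodes keep the root as their failure link.
        queue = deque(self.goto[0].values())
        while queue:
            node = queue.popleft()
            for b, child in self.goto[node].items():
                f = self.fail[node]
                while f and b not in self.goto[f]:
                    f = self.fail[f]
                self.fail[child] = self.goto[f].get(b, 0)
                # A match at the failure target is also a match here.
                self.out[child] = self.out[child] + self.out[self.fail[child]]
                queue.append(child)

    def search(self, data):
        """Return (start_offset, pattern) for every occurrence in one pass over data."""
        node = 0
        hits = []
        for pos, b in enumerate(data):
            while node and b not in self.goto[node]:
                node = self.fail[node]
            node = self.goto[node].get(b, 0)
            for idx in self.out[node]:
                pat = self.patterns[idx]
                hits.append((pos - len(pat) + 1, pat))
        return hits
```

The point of the automaton is that scan time depends on the input size and the number of hits, not on the number of signatures, which is why scanners with large signature sets tend to use it as a first-pass filter.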

Two examples of malware signatures similar to my idea:

  1. A decompiled old Kaspersky signature (image taken from The Antivirus Hacker's Handbook): [image]
  2. Two signatures from Windows Defender. Research here: https://github.com/commial/experiments/tree/master/windows-defender/VDM [image]

So that's my "little" idea. I can try to fork this project, add some APIs, and try a standalone script engine. What do you think about my idea? Is it doable?
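
The docx.hasMacro() idea from point 4 can be sketched with nothing but a ZIP reader, because macro-enabled Word documents store their VBA project as a `vbaProject.bin` part inside the OOXML container. The function name `has_macro` below is hypothetical and only mirrors the proposed script call; it is not an existing die_script API.

```python
import zipfile


def has_macro(source):
    """Return True if the OOXML (ZIP) container holds a VBA project part.

    `source` may be a file path or a binary file-like object.
    Hypothetical helper mirroring the proposed docx.hasMacro() call.
    """
    if not zipfile.is_zipfile(source):
        return False
    with zipfile.ZipFile(source) as zf:
        # Word stores the project as word/vbaProject.bin; match case-insensitively.
        return any(
            name.lower().endswith("vbaproject.bin") for name in zf.namelist()
        )
```

A docx.findMacroStr("exec") equivalent would need more than this: the macro source inside `vbaProject.bin` is stored compressed in an OLE container, so a plain byte search over that part would likely miss strings.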

dmknght commented 2 years ago

First successful signature using section hash calculation :D And I found out PE already has imphash in its API. Seems like this is a doable idea. However, the execution speed is kind of slow imo. [image] [image]
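
For readers who cannot see the screenshots, here is a rough sketch of the two checks mentioned above. It assumes the section's raw offset and size and the import list have already been extracted by a PE parser. The imphash part follows the commonly used convention (lowercase `dll.function` pairs joined with commas, then MD5) and handles named imports only; whether die_script's PE API computes it exactly this way is not confirmed here. All function names and the signature table are hypothetical.

```python
import hashlib


def section_md5(data, raw_offset, raw_size):
    """MD5 of one section's raw bytes, given its file offset and size."""
    return hashlib.md5(data[raw_offset:raw_offset + raw_size]).hexdigest()


def imphash(imports):
    """Import hash over (dll_name, function_name) pairs in import-table order.

    Named imports only; imports by ordinal are not handled in this sketch.
    """
    parts = []
    for dll, func in imports:
        lib = dll.lower()
        for ext in (".dll", ".ocx", ".sys"):
            if lib.endswith(ext):
                lib = lib[: -len(ext)]
                break
        parts.append(f"{lib}.{func.lower()}")
    return hashlib.md5(",".join(parts).encode()).hexdigest()


def match_section(data, raw_offset, raw_size, signatures):
    """Look up a section hash in a {md5_hex: detection_name} table (hypothetical format)."""
    return signatures.get(section_md5(data, raw_offset, raw_size))
```

Hashing a whole section on every scan is one plausible reason for the slowness noted above, so a real engine would probably cache the digest per file rather than recompute it per signature.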

horsicq commented 2 years ago

Ideas seem great! I will take a look at ClamAV engine. I think it will be doable.

dmknght commented 2 years ago

> Ideas seem great! I will take a look at ClamAV engine. I think it will be doable.

Thank you :D I'm forking this project and will try adding some minor changes to make it easier to use. Is there any way to list all API functions (for scripting)? I think it'd be great both for learning DIE scripting and for modifying the engine.

horsicq commented 2 years ago

// Is there any way to list all API functions (for scripting)?

All "public slots" functions can be used for scripting.

For example, for "PE" it is all the "public slots" in:
https://github.com/horsicq/die_script/blob/master/pe_script.h
https://github.com/horsicq/die_script/blob/master/msdos_script.h
https://github.com/horsicq/die_script/blob/master/binary_script.h

This is because PE is a subclass of MSDOS, and MSDOS is a subclass of Binary.

Feel free to add new public slots if you need.
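
As a loose analogy for how the inherited slots add up, the sketch below models the same Binary -> MSDOS -> PE chain in Python and lists every public method visible on the most derived class. It is not the Qt mechanism, and the method names are placeholders rather than real die_script slots. In Qt itself, a similar listing can be obtained from an object's QMetaObject (`methodCount()` and `method(i)`).

```python
import inspect


class Binary:
    def get_size(self):  # placeholder name, not a real slot
        return 0


class MSDOS(Binary):
    def is_dos_stub_present(self):  # placeholder name, not a real slot
        return False


class PE(MSDOS):
    def get_number_of_sections(self):  # placeholder name, not a real slot
        return 0


def list_script_api(cls):
    """Collect public method names from cls and all of its base classes."""
    return sorted(
        name
        for name, _ in inspect.getmembers(cls, inspect.isfunction)
        if not name.startswith("_")
    )
```

Calling `list_script_api(PE)` returns the methods defined on PE together with those inherited from MSDOS and Binary, which is the same reason a PE script can call functions declared in all three headers.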

dmknght commented 2 years ago

Thank you! I'm getting the basic idea. It seems die_script.cpp and die_scriptengine.cpp are the two script controller modules. I think I'm going to write some "malware signatures" with the DIE scripting engine to understand it better before actually modifying the forked module :D

dmknght commented 2 years ago

While researching scripting engines, I found this: https://duktape.org/. I'm giving it a try. I think it could be faster than the Qt engine.

horsicq commented 2 years ago

Nice! Thank you. If it is faster than qtengine we could use it.

dmknght commented 2 years ago

> If it is faster than qtengine we could use it.

Well, I don't really know if it's faster, but it's very small. I checked the Qt script engine, and its page says it is going to be replaced by QJSEngine in the future, so I think you will have to replace it anyway. Duktape could be a good move, but since you are using Qt for the GUI, using a scripting engine from the Qt framework is not bad at all. If I go with Duktape, I'll try Nim (https://nim-lang.org/) for the engine controller. So the structure would be:

  1. File parser (mostly a binary parser for now) exposing an API, written in C
  2. Duktape engine to execute scripts
  3. Script manager (or DB manager) written in Nim

But if I use this structure, it'd be hard for you and me to do co-op on this engine :D p/s: this is very new to me. I'm starting from scratch so the research time will be long :D
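
The three-layer structure proposed above can be sketched roughly as follows. Python stands in for all three layers purely for illustration: the real proposal uses C for the parser, Duktape for script execution, and Nim for the manager. Every class and method name here is hypothetical, and the "scripts" are plain Python callables rather than JavaScript.

```python
class FileParser:
    """Layer 1: parses a file and exposes a small API (stand-in for the C parser)."""

    def __init__(self, data):
        self.data = data

    def size(self):
        return len(self.data)

    def starts_with(self, magic):
        return self.data.startswith(magic)

    def contains(self, needle):
        return needle in self.data


class ScriptEngine:
    """Layer 2: runs one script against the parser API (stand-in for Duktape)."""

    def run(self, script, parser):
        return bool(script(parser))


class ScriptManager:
    """Layer 3: owns the signature set and drives the scan (stand-in for the Nim manager)."""

    def __init__(self, engine):
        self.engine = engine
        self.scripts = {}

    def register(self, name, script):
        self.scripts[name] = script

    def scan(self, data):
        parser = FileParser(data)  # parse once, share across all scripts
        return [
            name
            for name, script in self.scripts.items()
            if self.engine.run(script, parser)
        ]


manager = ScriptManager(ScriptEngine())
manager.register("Demo.MZWithMarker", lambda f: f.starts_with(b"MZ") and f.contains(b"MARKER"))
manager.register("Demo.Tiny", lambda f: f.size() < 4)
```

Keeping the parser API as the only thing scripts can touch is what would let the engine layer be swapped (QtScript, QJSEngine, Duktape) without rewriting signatures.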

horsicq commented 2 years ago

// But if I use this structure, it'd be hard for you and me to do co-op on this engine :D p/s: this is very new to me. I'm starting from scratch so the research time will be long :D

No problem. If something interesting comes up, we'll find a way to use it. :) In extreme cases, it will be possible to make a completely new engine for scanning.

dmknght commented 2 years ago

I found this project while googling for Duktape alternatives. It seems like a very straightforward engine for building a custom scripting language, but the documentation is very poor: https://www.angelcode.com/angelscript/sdk/docs/manual/doc_hello_world.html

p/s: I also found this performance comparison between Duktape and other JS engines. It seems Duktape is not the fastest engine. IDK if I should give AngelScript a try. https://bellard.org/quickjs/bench.html

p/s2: Found a simple AngelScript benchmark and it looks promising: https://discourse.urho3d.io/t/angelscript-vs-lua-benchmark/4310/6

p/s3: I tried the AngelScript samples and got a segmentation fault. What a disappointment, I have to say... Maybe Duktape is the most stable solution for now.

horsicq commented 2 years ago

Thanks for the information! I will try to compile AngelScript too.

dmknght commented 2 years ago

Oh, sorry for the very long delay ;D I had to complete another project. Finally I can go back to this research. I used duktape-nim to generate the latest bindings for Duktape 2.7.0. An example can be found here: https://github.com/manguluka/duktape-nim/blob/master/tests/basic_eval.nim

I don't really know whether Duktape and the Qt engine execute standalone scripts in a similar way. I must dig deeper into this.

p/s: I don't know where to start. Should I try to convert all the cpp modules to be compatible with Duktape, or should I rewrite the functions from scratch (in C?)

horsicq commented 2 years ago

Thanks a lot! I will take a look.

modz2014 commented 8 months ago

I started to do something similar.