pralab / secml_malware

Create adversarial attacks against machine learning Windows malware detectors
https://secml-malware.readthedocs.io/
GNU General Public License v3.0

DOS extension attack debugging #18

Closed haoliutj closed 2 years ago

haoliutj commented 2 years ago

Describe the bug

Hi,

I am still using your tool for my project, and it is an amazing tool!

Recently, I tried to validate whether functionality is really preserved after applying your adversarial attacks (partial/full DOS, DOS extend, and content shift). I used a binary editor (HxD) to change the bytes. Following your algorithms, the partial/full DOS attacks preserve functionality, while the DOS extend and content shift attacks do not.

In theory, these attacks should be functionality-preserving, but I am not sure whether I missed some steps when applying the DOS extend and content shift attacks. Have you ever verified that these attacks work in practice, and not only in theory? If so, could you share how you validated them? If not, could you give me some advice on the procedure I followed?

Here are my steps for the two attacks that fail to preserve functionality.

DOS extend:
1) modify the PE entry point in the DOS header by the number of extension bytes;
2) shift the PE header start offset by the number of extension bytes and initialize the newly created space with 0 values;
3) shift each section header offset by the number of extension bytes.

(These steps are based on your code. One step from Algorithm 3, line 5 of your paper is not included in the steps above; could this be the reason?)

Content shift:
1) shift each section header offset by the number of shifting bytes;
2) insert 0 values between the PE header and the first section, as many as the number of shifting bytes.

All these steps are based on your code, and I applied the corresponding manipulations in a binary editor (HxD) to craft the program. However, the modified file is not functional.
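For reference, the two procedures described above can be sketched in a few lines of Python using only the `struct` module. This is a minimal sketch and not the secml_malware implementation: the function names are hypothetical, "PE entry point in the DOS header" is assumed to mean the `e_lfanew` field at offset 0x3C, and updating `SizeOfHeaders` during the DOS extension is an assumption exposed as a flag. The field offsets come from the PE format layout.

```python
import struct

E_LFANEW = 0x3C              # DOS header field: file offset of the PE header
COFF_SIZE = 20               # size of the COFF file header
FILE_ALIGNMENT_OFFSET = 36   # FileAlignment, relative to the optional header
SIZE_OF_HEADERS_OFFSET = 60  # SizeOfHeaders, relative to the optional header
SECTION_ENTRY_SIZE = 40      # size of one section table entry
RAW_POINTER_OFFSET = 20      # PointerToRawData, relative to a section entry


def _u32(buf, offset):
    return struct.unpack_from("<I", buf, offset)[0]


def _layout(buf, amount):
    """Locate the headers and check that `amount` respects FileAlignment."""
    pe = _u32(buf, E_LFANEW)
    if bytes(buf[pe:pe + 4]) != b"PE\0\0":
        raise ValueError("PE signature not found")
    coff = pe + 4
    n_sections, = struct.unpack_from("<H", buf, coff + 2)
    opt_size, = struct.unpack_from("<H", buf, coff + 16)
    opt = coff + COFF_SIZE
    alignment = _u32(buf, opt + FILE_ALIGNMENT_OFFSET)
    if alignment == 0 or amount <= 0 or amount % alignment:
        raise ValueError("amount must be a positive multiple of FileAlignment")
    return pe, opt, opt + opt_size, n_sections


def _shift_raw_pointers(buf, table, n_sections, amount):
    """Add `amount` to every non-zero PointerToRawData; return the old values."""
    old = []
    for i in range(n_sections):
        field = table + i * SECTION_ENTRY_SIZE + RAW_POINTER_OFFSET
        pointer = _u32(buf, field)
        if pointer:  # assumption: sections without raw data keep pointer 0
            struct.pack_into("<I", buf, field, pointer + amount)
            old.append(pointer)
    return old


def content_shift(data, amount):
    """Insert `amount` zero bytes in front of the first section's raw data."""
    buf = bytearray(data)
    _, _, table, n_sections = _layout(buf, amount)
    old = _shift_raw_pointers(buf, table, n_sections, amount)
    first_raw = min(old)                  # where section content used to start
    buf[first_raw:first_raw] = bytes(amount)
    return bytes(buf)                     # SizeOfHeaders is left untouched


def dos_extend(data, amount, update_size_of_headers=True):
    """Insert `amount` zero bytes between the DOS header and the PE header."""
    buf = bytearray(data)
    pe, opt, table, n_sections = _layout(buf, amount)
    _shift_raw_pointers(buf, table, n_sections, amount)
    if update_size_of_headers:            # assumption, see the note above
        field = opt + SIZE_OF_HEADERS_OFFSET
        struct.pack_into("<I", buf, field, _u32(buf, field) + amount)
    struct.pack_into("<I", buf, E_LFANEW, pe + amount)
    buf[pe:pe] = bytes(amount)            # everything from the PE header moves
    return bytes(buf)
```

All fields are patched before the zero bytes are inserted, so every offset refers to the original layout. The sketch does not adjust other file offsets a binary may contain, such as the certificate table or debug data pointers.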

Looking forward to your reply!

Thanks,

Hao

zangobot commented 2 years ago

Hello! First of all... yes, I tried them, of course :D These manipulations are not trivial to create, and even less so to debug. Mmmh, so the DOS header extension might also need to change the size of headers defined in the format. Also, if the sample is packed, you might have problems, as the packer / unpacker routine might do something "fancy" with the file. The DOS extension might also fail because the result no longer fits into a memory page (I discovered later that the header has a maximum size when mapped into memory, but I'm still investigating this).

Try the manipulations with a non-packed file: for example, compile a hello world yourself (or use malware that you know is not packed).

Let me know, maybe I have missed something in the code or in the procedure, thank you for opening this!

haoliutj commented 2 years ago

Update: thanks for the suggestions! I ran these attacks on unpacked samples. The content shift attack works, but the size of headers must stay the same as in the original; otherwise the file modified by the content shift attack is corrupted. No luck with the DOS extension attack.

zangobot commented 2 years ago

Mmmmmh. How much content are you adding with the DOS extension? Btw, Content Shift does not require changing the size of headers, as you are not changing the size of any header.

haoliutj commented 2 years ago

I tried 512 and 1024 bytes for the DOS extension. These amounts should be multiples of FileAlignment, since they will be rounded up to a multiple of FileAlignment no matter how much content we specify in advance, right?
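The rounding described here can be written as a small helper. This is a minimal sketch; the name `round_up_to_alignment` is hypothetical and not taken from the library.

```python
def round_up_to_alignment(amount, file_alignment):
    """Smallest multiple of `file_alignment` that is >= `amount`."""
    if file_alignment <= 0:
        raise ValueError("file_alignment must be positive")
    # Ceiling division with integers only, then scale back up.
    return -(-amount // file_alignment) * file_alignment
```

With a FileAlignment of 512, both 512 and 1024 are already aligned and stay unchanged, while a request of 513 bytes would become 1024.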

You are right, the size of headers should stay unchanged for content shift. I did not change the size of headers.

zangobot commented 2 years ago

Mmmh. Which sample is that? I would like to investigate a bit!

haoliutj commented 2 years ago

I attached the samples I tested below. Thank you ~!

putty: https://www.chiark.greenend.org.uk/~sgtatham/putty/releases/0.66.html (the one with 'the SSH and Telnet client itself' under the 'Alternative binary files' section)
PEviewer: download from GitHub, https://github.com/eastmountyxz/SystemSecurity-ReverseAnalysis
010editor: download from GitHub, https://github.com/eastmountyxz/SystemSecurity-ReverseAnalysis

zangobot commented 2 years ago

Ok, I'll take some time in the future to apply manipulations on them, thank you for the help!

zangobot commented 2 years ago

First debugging session: I am using calc as a test (unpacked sample). Everything works smoothly (both extend and shift). I am now trying the PEView exe you mentioned. Since I'm paranoid, I've uploaded it to VT just to be sure (the link you sent me points to a GitHub repo that is not the official source of PEView). One AV flagged it as packed, so I'll start investigating (and hence one of the comments I already posted may apply).

zangobot commented 2 years ago

Shift attack on PEView is working smoothly. I am using my library to compute it, not doing it by hand. Extend attack on PEView is working smoothly. Same as before. I used 512 as payload size. Did you try to apply the adversarial manipulations using the library?

haoliutj commented 2 years ago

Awesome, great to hear you verified extend! Thank you so much for updating!

I did not use the library to apply the adversarial manipulations. I used 010 Editor to apply them by hand.

The reason is that, based on my understanding (which may be wrong), loading an exe as bytes and saving it back to an exe corrupts the executability of the original file. Please correct me if I am wrong. (If I am wrong and this is possible, that would be great news for me, since I am trying to use AV to scan malware modified by the extend or shift attacks. Obviously, doing this by hand is infeasible for hundreds of samples.)
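Whether a byte-level round trip changes a file can be checked directly by comparing hashes. The sketch below assumes the file is opened in binary mode (text mode could translate newlines or fail on decoding); the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Hex SHA-256 digest of a file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def copy_via_bytes(src, dst):
    """Load `src` into memory as bytes, write it to `dst`, compare digests."""
    data = Path(src).read_bytes()   # binary read: no newline or encoding translation
    Path(dst).write_bytes(data)     # binary write: bytes go to disk unchanged
    return sha256_of(src) == sha256_of(dst)
```

If `copy_via_bytes` returns `True`, the two files are byte-identical, so any loss of executability would have to come from the manipulation applied in between and not from loading and saving.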

I am very curious about how to verify the extend attack and want to reproduce it myself. Could you share how you applied the extend attack to PEView or calc?

As for the library you mentioned, could you tell me which library it is and which function you used? (I am not an expert like you in this area and may ask some easy questions. I hope you don't mind, and thank you in advance!)

zangobot commented 2 years ago

Well, I used my library to do so! This library! There are some tutorials that I wrote, and I think you can pick it up from there. Also, check that you computed the indexes correctly (the PE header pointer, plus 4 for the length of PE00, plus 20 for the size of the COFF header... and so on). I'm closing the issue for now.
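The index arithmetic mentioned above can be spelled out as follows. This is a minimal sketch based on the PE format layout, not code from the library; the function name and dictionary keys are hypothetical.

```python
import struct

PE_SIGNATURE_SIZE = 4    # length of b"PE\0\0"
COFF_HEADER_SIZE = 20    # size of the COFF file header
SECTION_ENTRY_SIZE = 40  # size of one section table entry
RAW_POINTER_OFFSET = 20  # PointerToRawData inside a section entry


def pe_offsets(data):
    """Compute the file offsets of the main PE header structures."""
    # The PE header pointer lives at 0x3C in the DOS header.
    pe_offset, = struct.unpack_from("<I", data, 0x3C)
    coff_offset = pe_offset + PE_SIGNATURE_SIZE
    optional_offset = coff_offset + COFF_HEADER_SIZE
    n_sections, = struct.unpack_from("<H", data, coff_offset + 2)
    optional_size, = struct.unpack_from("<H", data, coff_offset + 16)
    section_table = optional_offset + optional_size
    return {
        "pe": pe_offset,
        "coff": coff_offset,
        "optional": optional_offset,
        "section_table": section_table,
        # one entry per section: where its PointerToRawData field is stored
        "raw_pointer_fields": [
            section_table + i * SECTION_ENTRY_SIZE + RAW_POINTER_OFFSET
            for i in range(n_sections)
        ],
    }
```

Comparing these offsets with the positions edited by hand in a hex editor is a quick way to spot an off-by-a-field mistake.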