Squiblydoo / debloat

A GUI and CLI tool for removing bloat from executables
BSD 3-Clause "New" or "Revised" License

Reduce memory usage #18

Closed gdesmar closed 10 months ago

gdesmar commented 11 months ago

Hi, I tried to reduce the memory footprint of the library without sacrificing speed. I had other improvements that were complete but reverted them because they caused the library to take longer to execute, mostly related to the trim_junk function. I may revisit them later, but I wanted to get these in first. My first goal was to remove all instances of pe.write() and the pe_data that is duplicated at the start of the process_pe function. In place of the pe_data, I keep a list of offset tuples (from, to) that we wish to delete and then, assuming we wish to create the resulting file, take only the wanted bytes from the original data.
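
As an illustration of the idea only (the helper name write_trimmed and its exact shape are not debloat's API, just a sketch of the bookkeeping described above):

```python
# Illustrative sketch: track regions to delete as (start, end) offsets instead
# of mutating a duplicated copy of the PE data, then emit the kept bytes in a
# single pass when (and only when) an output file is requested.

def write_trimmed(original_data: bytes,
                  removal_ranges: list[tuple[int, int]],
                  out_path: str) -> None:
    """Write original_data minus the half-open [start, end) ranges to out_path."""
    keep_from = 0
    with open(out_path, "wb") as out:
        for start, end in sorted(removal_ranges):
            out.write(original_data[keep_from:start])  # bytes before the deleted region
            keep_from = end                            # skip the deleted region
        out.write(original_data[keep_from:])           # remaining tail
```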

One thing that may cause a problem is the addition of another parameter to the process_pe function. If whoever calls process_pe can provide the length of the file, that saves a whole pe.write() (and therefore another full load in memory). I made it optional and placed it at the end of the arguments to stay backward-compatible.
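
Roughly along these lines (the surrounding parameters are placeholders, not the real process_pe signature; only the trailing optional argument reflects the described change):

```python
def process_pe(pe, out_path, log_message=print, beginning_file_size=None):
    # Placeholder signature: only the last parameter illustrates the change.
    if beginning_file_size is None:
        # Fallback keeps the old behavior, at the cost of one extra full copy in memory.
        beginning_file_size = len(pe.write())
    log_message(f"Input size: {beginning_file_size} bytes")
    # ... rest of the debloating logic ...
```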

I do not have samples to test all code paths. I don't mind running all samples with 1.5.0 and this new branch to compare the results if you can share some, or you can do it yourself. You can download the raw test.py (in a zip file for GitHub) that I used. It is not well documented, but it should give insight into how I got my results and how anyone could try to reproduce them.
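
For anyone without the zip handy, a rough stdlib-only sketch of the kind of measurement test.py performs (the actual script uses memray; the debloat import path and the process_pe call below are assumptions):

```python
import time
import tracemalloc
from pathlib import Path

import pefile
from debloat.processor import process_pe  # import path is an assumption

def benchmark(sample_path: str, out_path: str) -> None:
    """Time one debloat run and report the peak Python heap usage."""
    data = Path(sample_path).read_bytes()
    tracemalloc.start()
    start = time.perf_counter()
    pe = pefile.PE(data=data, fast_load=True)
    process_pe(pe, out_path)  # argument list is illustrative, not the real signature
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    print(f"{sample_path}: {elapsed:.2f}s, peak ~{peak / 2**20:.1f} MiB")
```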

Here are my results (passing in the beginning_file_size):

| File hash | File size | Peak memory (1.5.0) | Execution time, 3 runs (1.5.0) | Peak memory (new) | Execution time (new) |
| --- | --- | --- | --- | --- | --- |
| 67b54f709895fa88b5153b568e62df5fb866237a1b3050502e7bee95a5a41738 | 482.577MB | 1.884GB | [5.21, 5.26, 5.19] | 485.421MB | [3.26, 3.25, 3.32] |
| 49c95279a836da84ad244a05817ab6fa1d8f6cb40a6c7fee4634e345c0e4a5b4 | 100.000MB | 595.951MB | [0.73, 0.72, 0.72] | 495.951MB | [0.52, 0.53, 0.51] |
| f9bf1e19763fd30242be3f80495518da5aa4604dd5085ce260dbb12d5dd67488 | 43.152MB | 239.493MB | [0.31, 0.33, 0.32] | 196.341MB | [0.24, 0.23, 0.24] |
| 65669e873a3732f1617c9c80667a1c3efda5f72538b5abd475e80a25efc0e5e2 | 313.823MB | 641.474MB | [0.45, 0.44, 0.42] | 341.446MB | [0.05, 0.04, 0.04] |
| 76f7f979a7af7f69eea4ab32e232d2c89dfbf7d0468736582b46a87c855a2422 | 80.198MB | 468.657MB | [0.57, 0.6, 0.57] | 388.460MB | [0.43, 0.42, 0.42] |
| 6a6f3488fa5927539aa37ad12a668f77ce8725534f3e30168fa2d92dde9add89 | 31.964MB | 187.968MB | [0.19, 0.18, 0.19] | 156.004MB | [0.18, 0.17, 0.16] |
| 90ffb9eade13d75f95e25c0b0aaa9a1f9171849cb81f1e2e9494c1fa801deee1 | 353.281MB | 2.066GB | [2.73, 2.65, 4.76] | 1.721GB | [1.73, 1.71, 1.74] |
| 36c32162148bf6fe8785020d68300d10f223ad59b47e6f4fedf7bf78f992f014 | 400.000MB | 2.343GB | [3.17, 3.12, 3.13] | 1.952GB | [2.06, 1.91, 1.94] |
| c3bbcf49833323978f3df6a3ae4d27cd278930ca78c5b178d6c7558c0b6210a2 | 500.000MB | 2.926GB | [4.17, 4.03, 4.03] | 2.438GB | [2.57, 2.48, 2.49] |
| 9892c1e9c834cf5f2c580baa34ba27d9f9d024cbac89bd2f226b1f723582bbb8 | 302.261MB | 1.179GB | [5.14, 5.12, 5.24] | 306.829MB | [4.31, 4.26, 4.27] |
| c6fda8a049ebd7872358acfa2505f226e931e0f71090c19412e7b6d0a1c6e129 | 302.368MB | 1.179GB | [5.15, 5.28, 5.26] | 307.127MB | [4.2, 4.27, 4.21] |
| 9900f584d89ef25cdae93a64eb5243df98fc787b006f846f11582a8b150353fc | 84.705MB | 491.547MB | [1.95, 1.9, 1.89] | 406.842MB | [1.61, 1.58, 1.59] |
| 347248cacef4596adbcddb5dbba62e050ddf223548834e8646bf43f96552a328 | 300.000MB | 1.756GB | [2.22, 2.21, 2.2] | 1.463GB | [1.46, 1.47, 1.42] |
| 9adeeeb9e86d4fa02ac88515131b89b2e912a79e9c0481e0e1254a6a70fd3512 | 118.460MB | 709.343MB | [3.21, 3.02, 3.18] | 590.883MB | [2.78, 2.69, 2.63] |
| bc1f3d36f8bb9afa4a1a2dfee41fd592b2896865329ce75d78b7fdada774ba8e | 35.340MB | 206.358MB | [0.31, 0.28, 0.28] | 171.018MB | [0.22, 0.22, 0.24] |
| b4bd0f04813e92852bb4344b2ce9d15259e628c850bfe3b5e0536977fe6a523d | 300.000MB | 1.755GB | [2.3, 2.19, 2.19] | 1.462GB | [1.58, 1.66, 1.69] |
| 158d07ab617c101fe9bda772225e07451b06399e1bc240d657c5b5f2f3fc03be | 39.323MB | 182.218MB | [1.75, 1.68, 1.7] | 39.683MB | [1.63, 1.65, 1.66] |

I am looking forward to any feedback!

Squiblydoo commented 10 months ago

Awesome. Thanks for all the information and the Pull Request. I am in the process of reviewing it and will likely complete review within a few days.

Squiblydoo commented 10 months ago

I'm going to go ahead and merge the changes. Everything makes sense from my review. Do you mind if I include the test.py within the Debloat repository? It addresses some needs I had not previously solved.

Regarding further improvements:

I'm experimenting with using the refinery_strip function in place of the dynamic trim. Practically, if you replace line 390 with delta_last_non_junk = refinery_strip(pe, biggest_section_data) and line 513 with end_of_real_data = pe.get_overlay_data_start_offset() + refinery_strip(pe, overlay), it will be fully implemented (spelled out in the snippet below).
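
For readability, the two replacement lines quoted above as a snippet (biggest_section_data and overlay are variables from the surrounding debloat code, not anything new):

```python
# Line 390: trim the largest section with refinery_strip instead of the dynamic trim
delta_last_non_junk = refinery_strip(pe, biggest_section_data)

# Line 513: trim the overlay the same way
end_of_real_data = pe.get_overlay_data_start_offset() + refinery_strip(pe, overlay)
```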

You may want to review the original method in Binary Refinery, pestrip.py. That will provide more context on its original use.

A factor regarding refinery_strip is line 237, the threshold. Debloat defaults it to 1, which is maximum aggressiveness. With this setting, I found that memory usage was reduced by two-thirds. I believe the default for refinery is actually 0.05; 0 will only remove repeated bytes, and 1 is most aggressive. (The code for modifying the threshold is already in place within debloat.) With thresholds lower than 1, I observed that the trimming usually failed and processing time often increased to 30 seconds.* There could be a problem with my implementation, though.

Update: Apparently a threshold of 1 will remove the whole thing. So for sections, or in cases where the malware needs some bytes in the overlay, it is not an acceptable setting.
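
This is not refinery's actual pestrip logic, but a minimal sketch of how a compression-ratio threshold with the semantics described above could behave; it also shows why a threshold of 1 can strip everything, since real code usually compresses at least a little and so falls below the cutoff:

```python
import zlib

def trim_tail(data: bytes, threshold: float = 0.05, chunk_size: int = 0x10000) -> bytes:
    """Illustrative sketch: drop trailing chunks whose zlib compression ratio is
    below `threshold`. Repeated-byte padding compresses to ~0, random data to ~1,
    so a threshold near 0 removes only padding while 1 removes anything compressible."""
    end = len(data)
    while end > 0:
        start = max(0, end - chunk_size)
        chunk = data[start:end]
        ratio = len(zlib.compress(chunk)) / len(chunk)
        if ratio >= threshold:
            break        # chunk looks meaningful enough; stop trimming
        end = start      # chunk looks like junk; keep walking backwards
    return data[:end]
```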

*NOTE: With the current test.py script, failure to debloat can be observed when the results output the same hash as the previous analysis.
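
A trivial helper for that check (the name is made up), comparing a fresh output against the file produced by the previous analysis:

```python
import hashlib
from pathlib import Path

def same_output(new_output: str, previous_output: str) -> bool:
    """True when the two debloated files hash identically, i.e. this run
    removed nothing compared to the previous analysis."""
    new_hash = hashlib.sha256(Path(new_output).read_bytes()).hexdigest()
    old_hash = hashlib.sha256(Path(previous_output).read_bytes()).hexdigest()
    return new_hash == old_hash
```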

The hashes in the following tables aren't very important, since they differ due to the different methods of removing data. However, in some cases the two methods do remove the same amount of data.

What I haven't tested is confirming, manually or automatically, that no critical data has been removed.

Failed cases below are instances that debloat is unable to handle.

Memory Improvements + Debloat's dynamic Trim:

| Filename | Size | Output SHA256 hash | Output size (bytes) | Memory peak | Execution time |
| --- | --- | --- | --- | --- | --- |
| Overlay-NullBytes3.malz | 762.939MB | 46aeb0... | 14699520 | 3.672GB | [2.72, 2.79, 2.85] |
| NON-Null-Overlay1.malz | 815.940MB | 8442e6... | 11912192 | 3.940GB | [13.65, 14.67, 15.6] |
| NON-Null-Overlay-Random.malz | 674.557MB | 97fed9... | 566784 | 3.292GB | [12.93, 12.02, 13.05] |
| Themida-Overlay-UnknownFamily.malz | 439.726MB | 441ab8... | 2695168 | 2.144GB | [7.95, 8.04, 7.86] |
| Resource4.malz | 300.388MB | 150f5e... | 2480030 | 305.179MB | [3.39, 3.59, 3.46] |
| Resource.malz | 300.348MB | abf3d2... | 1403904 | 303.209MB | [3.26, 3.24, 3.27] |
| Section-UnknownPacker.malz | 705.287MB | abf3d2... | 1403904 | 3.423GB | [7.37, 7.35, 8.03] |
| OverlayHighCompression.malz | 734.451MB | abf3d2... | 1403904 | 2.148GB | [1.14, 1.16, 1.12] |
| Overlay-AfterSignature2.malz | 738.980MB | 04d8ac... | 868352 | 740.673MB | [0.0, 0.0, 0.0] |
| Packed.malz | 726.000MB | 125a8a... | 1499136 | 2.127GB | [4.9, 4.61, 4.78] |
| DotNetResource.malz | 307.925MB | ---Processing Failed--- | --- | --- | [2.37, 2.33, 2.33] |
| Section3.malz | 325.880MB | 7d54aa... | 7214592 | 1.568GB | [3.42, 3.41, 3.46] |
| Overlay-Random-Chunks.malz | 604.650MB | f8973d... | 328192 | 2.951GB | [12.39, 12.87, 13.03] |
| Overlay-NullBytes2.malz | 762.939MB | 160ea0... | 16570880 | 3.664GB | [2.86, 2.7, 2.71] |
| Section2.malz | 709.919MB | 0c5c73... | 952320 | 3.463GB | [7.41, 7.32, 7.18] |
| Section1.malz | 1.264GB | f9a08f... | 82432 | 3.793GB | [16.13, 16.09, 15.85] |
| Bloat_After_SignatureExample.malz | 636.277MB | e1816c... | 1575424 | 639.319MB | [0.0, 0.0, 0.0] |

Memory Improvements + Refinery Strip

| Filename | Size | Output SHA256 hash | Output size (bytes) | Memory peak | Execution time |
| --- | --- | --- | --- | --- | --- |
| Overlay-NullBytes3.malz | 762.939MB | 46aeb0... | 14699520 | 1.505GB | [0.48, 0.48, 0.46] |
| NON-Null-Overlay1.malz | 815.940MB | 2cabe1... | 11896610 | 1.605GB | [0.47, 0.49, 0.48] |
| NON-Null-Overlay-Random.malz | 674.557MB | 82c42d... | 539498 | 1.318GB | [0.42, 0.41, 0.42] |
| Themida-Overlay-UnknownFamily.malz | 439.726MB | c23ffc... | 2679856 | 883.933MB | [0.32, 0.32, 0.32] |
| Resource4.malz | 300.388MB | 150f5e... | 2480030 | 305.179MB | [3.9, 3.76, 3.42] |
| Resource.malz | 300.348MB | abf3d2... | 1403904 | 303.208MB | [3.3, 3.3, 3.71] |
| Section-UnknownPacker.malz | 705.287MB | 85fd15... | 5539328 | 1.372GB | [5.94, 5.91, 5.81] |
| OverlayHighCompression.malz | 734.451MB | a72adb... | 2127360 | 1.436GB | [0.47, 0.47, 0.49] |
| Overlay-AfterSignature2.malz | 738.980MB | 04d8ac... | 868352 | 740.673MB | [0.0, 0.0, 0.0] |
| Packed.malz | 726.000MB | 7ff8cf... | 185346 | 1.418GB | [0.47, 0.5, 0.5] |
| DotNetResource.malz | 307.925MB | ---Processing Failed--- | --- | --- | [2.4, 2.41, 2.34] |
| Section3.malz | 325.880MB | 28e62e... | 6166016 | 645.925MB | [2.48, 2.46, 2.47] |
| Overlay-Random-Chunks.malz | 604.650MB | 0a40ea... | 300724 | 1.181GB | [0.4, 0.4, 0.4] |
| Overlay-NullBytes2.malz | 762.939MB | 160ea0... | 16570880 | 1.506GB | [0.51, 0.49, 0.49] |
| Section2.malz | 709.919MB | 0c5c73... | 952320 | 1.386GB | [4.61, 4.64, 4.62] |
| Section1.malz | 1.264GB | 89cd0e... | 72192 | 2.529GB | [8.59, 8.46, 8.51] |
| Bloat_After_SignatureExample.malz | 636.277MB | e1816c... | 1575424 | 639.319MB | [0.01, 0.0, 0.0] |

Squiblydoo commented 10 months ago

I have somewhat more confidence in the Refinery_Strip method than in my own Dynamic_trim due to the author's skill; I just haven't confirmed that it works consistently as expected when set to the most aggressive setting.

I plan to review this today; I can write all the patched binaries to a directory and manually inspect them against their originals to determine what information was removed. There may be faster or smarter methods, but this is one that I believe I can complete easily enough.
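
One rough way to drive that comparison (the helper is only a sketch; the .malz naming follows the tables above, and the directory layout is assumed):

```python
import hashlib
from pathlib import Path

def summarize(original_dir: str, patched_dir: str) -> None:
    """For every patched binary, report how much was removed relative to its
    original so the trimmed regions can then be inspected by hand."""
    for patched in Path(patched_dir).glob("*.malz"):
        original = Path(original_dir) / patched.name
        removed = original.stat().st_size - patched.stat().st_size
        digest = hashlib.sha256(patched.read_bytes()).hexdigest()
        print(f"{patched.name}: removed {removed} bytes, output sha256 {digest[:12]}...")
```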

Some samples, like the Packed examples (or this Emotet sample), contain important bytes in the overlay. I'm fairly sure that Refinery_Strip won't remove them, but it is something I'd like to be certain of. (In my dynamic trim, I often erred on the side of caution and left extra bytes.)

Update: Talking with Jesko, we identified that pestrip as implemented in Binary Refinery was unable to handle the important bytes in the overlay. This commit to Binary Refinery introduced a new capability to handle them.

gdesmar commented 10 months ago

More things make sense with the refinery link. I did see your comments at the top of the file, but I had not investigated them before. Regarding the threshold in refinery_strip, I tried to keep my changes to a minimum, but the whole if 0 < threshold < 1: code block is unreachable. I see now that it's an artefact from refinery, and that you may re-enable it at some point.

Regarding test.py, I would be glad if it were added directly to the repository (albeit with a better name). If it is to be used more often, it would probably be best to make it a bit nicer, for example using tempfile for the memray.bin and out files, or at least cleaning those up after the script runs, and exiting with a warning before overwriting/deleting them.
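
A sketch of what that cleanup could look like, assuming the script keeps its intermediate memray.bin capture and the debloated out file in a scratch directory (the function name and paths are only illustrative):

```python
import shutil
import tempfile
from pathlib import Path

def run_one(sample: Path) -> None:
    """Keep memray.bin and the debloated output in a throwaway directory so
    nothing in the working directory is overwritten or left behind."""
    with tempfile.TemporaryDirectory(prefix="debloat-bench-") as scratch:
        capture = Path(scratch) / "memray.bin"   # memray capture file
        out_file = Path(scratch) / "out"         # debloated binary
        # ... run the benchmark here, writing to `capture` and `out_file` ...
        if out_file.exists():
            # copy anything worth keeping before the scratch directory is removed
            shutil.copy(out_file, Path.cwd() / f"{sample.name}.debloated")
```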