gdesmar closed this 10 months ago
Awesome. Thanks for all the information and the Pull Request. I am in the process of reviewing it and will likely complete the review within a few days.
I'm going to go ahead and merge the changes. Everything makes sense from my review.
Do you mind if I include `test.py` within the Debloat repository? It fits some needs that I had not previously solved for.
I'm experimenting with using the `refinery_strip` function in place of the dynamic trim. Practically, if you replace line 390 with `delta_last_non_junk = refinery_strip(pe, biggest_section_data)` and line 513 with `end_of_real_data = pe.get_overlay_data_start_offset() + refinery_strip(pe, overlay)`, it will be fully implemented.
You may want to review the original method in Binary Refinery in `pestrip.py`; that will provide more context on its original use.
One factor regarding `refinery_strip` is the threshold on line 237. Debloat defaults it to 1, which is maximum aggressiveness; with this setting, I found that memory usage was reduced by two-thirds. I believe the default in Refinery is actually 0.05, where 0 will only remove repeated bytes and 1 is most aggressive. (The code for modifying the threshold is already in place within Debloat.) With thresholds lower than 1, I observed that the trimming usually failed and processing time often increased to 30 seconds.* There could be a problem with my implementation, though. Update: apparently a threshold of 1 will remove the whole thing, so in cases where the malware needs some bytes in a section or the overlay, it is not an acceptable setting.
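To make the threshold behavior concrete, here is a minimal, self-contained sketch of trailing-junk trimming controlled by a threshold. This is an illustration of the idea only, not the actual Binary Refinery or Debloat implementation: the function name, the fixed block size, and the zlib-compression-ratio test are all assumptions.

```python
import zlib

def strip_trailing_junk(data: bytes, threshold: float = 0.05, block: int = 1024) -> bytes:
    """Illustrative sketch only, not the real refinery_strip.

    Walk backwards over the data in fixed-size blocks and drop a
    trailing block when its zlib compression ratio is at or below the
    threshold (highly repetitive, low-entropy "junk"). A threshold near
    0 strips only near-constant padding; a threshold of 1 treats almost
    any compressible tail as junk, which mirrors the over-aggressive
    behavior described above.
    """
    end = len(data)
    while end >= block:
        chunk = data[end - block:end]
        ratio = len(zlib.compress(chunk)) / block
        if ratio > threshold:
            break  # tail looks like real (high-entropy) data; stop trimming
        end -= block
    return data[:end]
```

With a low threshold, only null padding is removed; with a threshold of 1, even a moderately compressible (but possibly meaningful) tail is stripped, which is why 1 can remove bytes the malware still needs.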
*NOTE: With the current `test.py` script, failure to debloat can be observed when the results output the same hash as the previous analysis.
The hashes in the following tables aren't very important on their own, since they differ due to the different methods of removing data. However, in some cases the two methods remove the same amount of data. What I haven't tested is confirming, manually or automatically, that no critical data has been removed. The failed cases below are instances that Debloat is unable to handle.
Filename | Original size | Output SHA256 hash | Output size (bytes) | Memory peak | Execution times (s) |
---|---|---|---|---|---|
Overlay-NullBytes3.malz | 762.939MB | 46aeb0... | 14699520 | 3.672GB | [2.72, 2.79, 2.85] |
NON-Null-Overlay1.malz | 815.940MB | 8442e6... | 11912192 | 3.940GB | [13.65, 14.67, 15.6] |
NON-Null-Overlay-Random.malz | 674.557MB | 97fed9... | 566784 | 3.292GB | [12.93, 12.02, 13.05] |
Themida-Overlay-UnknownFamily.malz | 439.726MB | 441ab8... | 2695168 | 2.144GB | [7.95, 8.04, 7.86] |
Resource4.malz | 300.388MB | 150f5e... | 2480030 | 305.179MB | [3.39, 3.59, 3.46] |
Resource.malz | 300.348MB | abf3d2... | 1403904 | 303.209MB | [3.26, 3.24, 3.27] |
Section-UnknownPacker.malz | 705.287MB | abf3d2... | 1403904 | 3.423GB | [7.37, 7.35, 8.03] |
OverlayHighCompression.malz | 734.451MB | abf3d2... | 1403904 | 2.148GB | [1.14, 1.16, 1.12] |
Overlay-AfterSignature2.malz | 738.980MB | 04d8ac... | 868352 | 740.673MB | [0.0, 0.0, 0.0] |
Packed.malz | 726.000MB | 125a8a... | 1499136 | 2.127GB | [4.9, 4.61, 4.78] |
DotNetResource.malz | 307.925MB | ---Processing Failed---- | ------ | ---- | [2.37, 2.33, 2.33] |
Section3.malz | 325.880MB | 7d54aa... | 7214592 | 1.568GB | [3.42, 3.41, 3.46] |
Overlay-Random-Chunks.malz | 604.650MB | f8973d... | 328192 | 2.951GB | [12.39, 12.87, 13.03] |
Overlay-NullBytes2.malz | 762.939MB | 160ea0... | 16570880 | 3.664GB | [2.86, 2.7, 2.71] |
Section2.malz | 709.919MB | 0c5c73... | 952320 | 3.463GB | [7.41, 7.32, 7.18] |
Section1.malz | 1.264GB | f9a08f... | 82432 | 3.793GB | [16.13, 16.09, 15.85] |
Bloat_After_SignatureExample.malz | 636.277MB | e1816c... | 1575424 | 639.319MB | [0.0, 0.0, 0.0] |
Filename | Original size | Output SHA256 hash | Output size (bytes) | Memory peak | Execution times (s) |
---|---|---|---|---|---|
Overlay-NullBytes3.malz | 762.939MB | 46aeb0... | 14699520 | 1.505GB | [0.48, 0.48, 0.46] |
NON-Null-Overlay1.malz | 815.940MB | 2cabe1... | 11896610 | 1.605GB | [0.47, 0.49, 0.48] |
NON-Null-Overlay-Random.malz | 674.557MB | 82c42d... | 539498 | 1.318GB | [0.42, 0.41, 0.42] |
Themida-Overlay-UnknownFamily.malz | 439.726MB | c23ffc... | 2679856 | 883.933MB | [0.32, 0.32, 0.32] |
Resource4.malz | 300.388MB | 150f5e... | 2480030 | 305.179MB | [3.9, 3.76, 3.42] |
Resource.malz | 300.348MB | abf3d2... | 1403904 | 303.208MB | [3.3, 3.3, 3.71] |
Section-UnknownPacker.malz | 705.287MB | 85fd15... | 5539328 | 1.372GB | [5.94, 5.91, 5.81] |
OverlayHighCompression.malz | 734.451MB | a72adb... | 2127360 | 1.436GB | [0.47, 0.47, 0.49] |
Overlay-AfterSignature2.malz | 738.980MB | 04d8ac... | 868352 | 740.673MB | [0.0, 0.0, 0.0] |
Packed.malz | 726.000MB | 7ff8cf... | 185346 | 1.418GB | [0.47, 0.5, 0.5] |
DotNetResource.malz | 307.925MB | ---Processing Failed---- | ---- | ----- | [2.4, 2.41, 2.34] |
Section3.malz | 325.880MB | 28e62e... | 6166016 | 645.925MB | [2.48, 2.46, 2.47] |
Overlay-Random-Chunks.malz | 604.650MB | 0a40ea... | 300724 | 1.181GB | [0.4, 0.4, 0.4] |
Overlay-NullBytes2.malz | 762.939MB | 160ea0... | 16570880 | 1.506GB | [0.51, 0.49, 0.49] |
Section2.malz | 709.919MB | 0c5c73... | 952320 | 1.386GB | [4.61, 4.64, 4.62] |
Section1.malz | 1.264GB | 89cd0e... | 72192 | 2.529GB | [8.59, 8.46, 8.51] |
Bloat_After_SignatureExample.malz | 636.277MB | e1816c... | 1575424 | 639.319MB | [0.01, 0.0, 0.0] |
I have somewhat more confidence in the `refinery_strip` method than in my own dynamic trim, due to the author's skill; I just haven't confirmed that it works consistently as expected at the most aggressive setting.
I plan to review this today: I can write all the patched binaries to a directory and manually inspect them against their originals to determine what information was removed. There may be faster or smarter methods, but this is one I believe I can complete easily enough.
Some samples, like the Packed examples (or this Emotet sample), contain important bytes in the overlay. I'm fairly sure that `refinery_strip` won't remove them, but it is something I'd like to be certain of. (In my dynamic trim, I often erred on the side of caution and left extra bytes.)
Update: Talking with Jesko, we identified that `pestrip` as implemented in Binary Refinery was unable to handle the important bytes in the overlay. This commit to Binary Refinery introduced a new capability to handle them.
More things make sense with the Refinery link. I did see your comments at the top of the file, but had not investigated them before.
Regarding the threshold in `refinery_strip`, I tried to keep my changes to a minimum, but the whole `if 0 < threshold < 1:` code block is unreachable. I see now that it's an artefact from Refinery, and that you may re-enable it at some point.
Regarding `test.py`, I would be glad for it to be added directly to the repository (albeit with a better name). If it were to be used more often, it would probably be best to polish it a bit, for example using `tempfile` for the `memray.bin` and `out` files, or at least cleaning those up after the script runs, and exiting with a warning before overwriting or deleting them.
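The `tempfile` suggestion could look something like the sketch below. The helper name is hypothetical; only the `memray.bin` and `out` artifact names come from the discussion above, and the point is simply that a temporary directory removes both the clobbering risk and the manual cleanup.

```python
import pathlib
import tempfile

def scratch_workspace():
    """Hypothetical helper for the test script: keep the profiler
    output ('memray.bin') and the patched-binary directory ('out/')
    inside a throwaway temporary directory instead of the current
    directory, so nothing of the user's is overwritten and cleanup
    is a single call."""
    tmp = tempfile.TemporaryDirectory(prefix="debloat_test_")
    root = pathlib.Path(tmp.name)
    out_dir = root / "out"
    out_dir.mkdir()
    return tmp, root / "memray.bin", out_dir

# Usage: create the workspace, run the profiled analysis, then clean up.
tmp, memray_bin, out_dir = scratch_workspace()
# ... run memray / debloat here, writing only under these paths ...
tmp.cleanup()  # removes memray.bin and out/ in one step
```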
Hi, I tried to reduce the memory footprint of the library without sacrificing speed. I had other improvements that were completed but reverted because they caused the library to take longer to execute, mostly related to the `trim_junk` function. I may revisit them later, but I wanted to get these changes in. My first goal was to remove all instances of `pe.write()` and the `pe_data` that is duplicated at the start of the `process_pe` function. In place of `pe_data`, I keep a list of offset tuples (from, to) that we wish to delete, and then, assuming we wish to create the resulting file, extract only the wanted bytes from the original data.
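The deletion-list idea can be sketched as follows. This is a minimal illustration of the approach described above, not Debloat's actual code; the function name and the half-open range convention are assumptions.

```python
def apply_deletions(data: bytes, deletions) -> bytes:
    """Minimal sketch of the (from, to) deletion-list idea:
    instead of duplicating the whole file up front, record half-open
    [start, end) ranges to drop and copy only the kept bytes once,
    when producing the output."""
    out = bytearray()
    cursor = 0
    for start, end in sorted(deletions):
        if start > cursor:
            out += data[cursor:start]  # keep the bytes before this hole
        cursor = max(cursor, end)      # skip the deleted (possibly overlapping) range
    out += data[cursor:]               # keep everything after the last hole
    return bytes(out)
```

Since the original buffer is only read, never copied wholesale, peak memory stays close to one copy of the input plus the (smaller) output.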
One thing that may cause problems is the addition of another parameter to the `process_pe` function. If whoever calls `process_pe` can provide the length of the file, that saves a whole `pe.write()` (and therefore another full load into memory). I made it optional and placed it at the end of the arguments, to remain backward-compatible.
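The backward-compatible pattern can be shown in isolation. This is a hypothetical sketch, not `process_pe`'s real signature, and `FakePE` is a stand-in for a `pefile.PE` object used here only so the example is self-contained.

```python
class FakePE:
    """Tiny stand-in for pefile.PE, for illustration only."""
    def __init__(self, data: bytes):
        self._data = data

    def write(self) -> bytes:
        # In real pefile, this serializes the whole PE back to bytes,
        # which is the expensive full in-memory copy we want to avoid.
        return self._data

def effective_file_size(pe, file_size=None):
    # Hypothetical helper showing the pattern: the new argument is
    # optional and last, so existing call sites are untouched, while
    # callers that already know the on-disk size can pass it and skip
    # a full serialization.
    if file_size is None:
        file_size = len(pe.write())  # costly fallback
    return file_size
```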
I do not have samples to test all code paths. I don't mind running all samples with 1.5.0 and this new branch to compare the results if you can share some, or you can do it yourself: you can download the raw `test.py` (in a zip file, for GitHub) that I used. It is not well documented, but it should give insight into how I got my results and how anyone could try to reproduce them.
I am looking forward to any feedback!