Squiblydoo / debloat

A GUI and CLI tool for removing bloat from executables
BSD 3-Clause "New" or "Revised" License
301 stars 25 forks source link

Improve NSItem object equality check #43

Closed gdesmar closed 3 days ago

gdesmar commented 3 days ago

Modification of the NSItem equality function that can cover the case raised in #42. Two objects are found with the same offset 1785955/0x001b4063 and path $PLUGINSDIR\installer-helper.dll. All of their item in the NSItem class are identical except the name and the prefix. There is name:$PLUGINSDIR\installer-helper.dll and no prefix (None) for one, and name:installer-helper.dll with prefix:$PLUGINSDIR for the other, which explains the identical path and key of (item.path, item.offset) in the items directory. The two items have different instruction.arguments ([83886224, 97, 1785955, 2478026752, 31114788, 4294967259] vs [83886225, 94, 1785955, 4294967295, 4294967295, 4294967259].

I used the ntpath library to generate combination of prefix and name before realizing I could use self.path == other.path. I can change it quickly and remove the import if we want to use this solution.

Squiblydoo commented 3 days ago

I'm going to go with the self.path == other.path option in order to avoid the additional import as you suggested and for better symmetry in that function definition.

However, it is unclear to me, is it working as you expect? I was expecting it to output two files, but it still seems to output one. I feel like the NSIS installer is gaslighting me. I stepped through the whole item dictionary creation and didn't see a duplicate now, only some empty items that are skipped over.

I added a bugfix and the updated meta-data to the PR. If the NSItem behavior makes sense to you, I'll merge it.

gdesmar commented 3 days ago

With this code, there won't be two files for this sample because both NSItem in the NSHeader are going to be equal.

In the NSHeader's __init__(), it adds the first NSItem to the items dictionary with a path of $PLUGINSDIR\installer-helper.dll and the offset. When it tries to add the second item, it does items.setdefault((item.path, item.offset), item) != item that will return False. Both items have the same item.path and item.offset so they share the same key for the items dictionary, but were different python object, so were showing as different for the conditional. Now the dictionary is going to keep the first object that it had, and we'll use that one for the list since they have the same value for every other element.

It does look great to me! Thank you!

Squiblydoo commented 3 days ago

Awesome. Thanks for the explanation. Will try to get it merged and released later today.

Squiblydoo commented 2 days ago

Epilogue: I submitted this improvement up stream to Binary Refinery, since the NSIS parser is based off of the parser in that project: https://github.com/binref/refinery/pull/52