DexPatcher / multidexlib2

Multi-dex extensions for dexlib2
https://dexpatcher.github.io/
GNU General Public License v3.0

[QUESTION] Efficient way of finding all method references #6

Closed Unbrick closed 3 years ago

Unbrick commented 3 years ago

Hey there,

I'm currently trying to build a simple deobfuscator that can rename a method and all of its references.

I have already looked into dexlib2 and mdexlib2 and succeeded in patching the method name itself. Now I'm a little stuck trying to find an efficient way of finding all references to the renamed method. I already found your way of patching method references, but finding all calls to one single class method seems more difficult. Do you have a simple approach to it?

Thank you in advance!

Lanchon commented 3 years ago

latest dexpatcher beta. read the release notes. use one of the mapping transforms.

Lanchon commented 3 years ago

if what u want is programmatically doing this with dexlib2 or multidexlib2, i can point u in a different direction

Unbrick commented 3 years ago

I'm currently working on my thesis on static deobfuscation of Android apps. This includes sanitising method and class names that contain non-ASCII characters and simplifying control-flow obfuscation.

Therefore I'm trying to create a deobfuscator that does not break any bytecode references or similar. If you have any documentation or tips on how to work efficiently with mdexlib2, that would be awesome!

I already have a partially working deobfuscator for non-ASCII method names. It iterates through all instructions of an APK file, checks whether each one is a ReferenceInstruction (or DexBackedReferenceInstruction), and rewrites its reference. But it seems there has to be a more efficient way of dealing with all ReferenceInstructions of a DexFile.

Lanchon commented 3 years ago

hey, sorry for the delay.

first: layout deobfuscation WILL NOT produce, in general, working code. but it will produce analyzable code.

why?

what i do is deobf the code, analyze and patch, then reobf the code. the patching process produces logs in the deobf naming world.

for distributing the patch, i simply obf the patch and have people apply it directly.
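The deobf, patch, reobf round trip described above only works if the name mapping can be inverted. A minimal sketch of such a mapping follows, in plain Java with no dexlib2 or DexPatcher involved. The class `NameMapping` and its methods are hypothetical names made up for this illustration, and the one-to-one check is an assumption about what the round trip needs rather than a description of how DexPatcher's mapping transforms work.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical two-way name map; not part of dexlib2, multidexlib2 or DexPatcher. */
final class NameMapping {
    private final Map<String, String> forward = new HashMap<>();  // obfuscated -> readable
    private final Map<String, String> backward = new HashMap<>(); // readable -> obfuscated

    /** Registers one pair and rejects collisions so the mapping stays invertible. */
    void put(String obfuscated, String readable) {
        if (forward.containsKey(obfuscated) || backward.containsKey(readable)) {
            throw new IllegalArgumentException(
                    "mapping must be one-to-one: " + obfuscated + " -> " + readable);
        }
        forward.put(obfuscated, readable);
        backward.put(readable, obfuscated);
    }

    /** Names without an entry pass through unchanged. */
    String deobfuscate(String name) {
        return forward.getOrDefault(name, name);
    }

    /** Inverse direction, used to move a patch back into the obfuscated naming world. */
    String reobfuscate(String name) {
        return backward.getOrDefault(name, name);
    }
}
```

With a pair such as `put("\u00e4", "method_1")`, `reobfuscate(deobfuscate("\u00e4"))` gives back the original name. The sketch does not guard against a readable name that happens to equal an unmapped name already present in the app; a real tool would have to check for that as well.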

I'm currently working on my thesis regarding static deobfuscation of android apps.

great. are you using machine learning to automatically recover names? this has been done with very good results. unfortunately, the prototype system was not free software last time i checked.

I already have a partially working deobfuscator for non-ASCII method names. It iterates through all instructions of an APK file, checks whether each one is a ReferenceInstruction (or DexBackedReferenceInstruction), and rewrites its reference. But it seems there has to be a more efficient way of dealing with all ReferenceInstructions of a DexFile.

well, in dexlib2, you are mostly supposed to create DexFile subobjects on demand when traversing the tree and have them garbage collected as soon as they are no longer referenced. you typically create a transformed view, rather than copying the tree and modifying it. for example, the DexBackedDexFile you use to read dex files is a DexFile object that creates subobjects on demand from a backing byte array containing the dex file contents. creating the objects eagerly would consume lots of memory.
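To make the view-rather-than-copy idea concrete, here is a minimal sketch in plain Java. It deliberately does not use dexlib2: `MethodRef`, `SimpleMethodRef`, `RenamedMethodRef` and `RenamedRefList` are toy types invented for this illustration, standing in for the real reference objects. The point it tries to show is that the renaming wrappers are created on demand when an element is read, and the underlying data is never copied or modified.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy stand-in for a method reference (hypothetical, not dexlib2's MethodReference). */
interface MethodRef {
    String definingClass();
    String name();
}

/** Eager implementation, playing the role of data backed by the dex file. */
final class SimpleMethodRef implements MethodRef {
    private final String definingClass;
    private final String name;

    SimpleMethodRef(String definingClass, String name) {
        this.definingClass = definingClass;
        this.name = name;
    }

    @Override public String definingClass() { return definingClass; }
    @Override public String name() { return name; }
}

/** View over one reference: reports the new name only if class and old name match. */
final class RenamedMethodRef implements MethodRef {
    private final MethodRef delegate;
    private final String targetClass, oldName, newName;

    RenamedMethodRef(MethodRef delegate, String targetClass, String oldName, String newName) {
        this.delegate = delegate;
        this.targetClass = targetClass;
        this.oldName = oldName;
        this.newName = newName;
    }

    @Override public String definingClass() { return delegate.definingClass(); }

    @Override public String name() {
        boolean match = delegate.definingClass().equals(targetClass)
                && delegate.name().equals(oldName);
        return match ? newName : delegate.name();
    }
}

/** View over a list of references: a wrapper is created per get() call, nothing is copied. */
final class RenamedRefList extends AbstractList<MethodRef> {
    private final List<MethodRef> delegate;
    private final String targetClass, oldName, newName;

    RenamedRefList(List<MethodRef> delegate, String targetClass, String oldName, String newName) {
        this.delegate = delegate;
        this.targetClass = targetClass;
        this.oldName = oldName;
        this.newName = newName;
    }

    @Override public MethodRef get(int index) {
        return new RenamedMethodRef(delegate.get(index), targetClass, oldName, newName);
    }

    @Override public int size() { return delegate.size(); }
}

final class RenameViewDemo {
    /** Formats each reference as "class->name" so the view and the original can be compared. */
    static List<String> describe(List<MethodRef> refs) {
        List<String> out = new ArrayList<>();
        for (MethodRef r : refs) out.add(r.definingClass() + "->" + r.name());
        return out;
    }

    public static void main(String[] args) {
        List<MethodRef> original = Arrays.asList(
                new SimpleMethodRef("La/b;", "\u00e4"),
                new SimpleMethodRef("La/c;", "\u00e4"),
                new SimpleMethodRef("La/b;", "run"));
        List<MethodRef> view = new RenamedRefList(original, "La/b;", "\u00e4", "method_1");
        System.out.println(describe(view));     // renamed where class and name match
        System.out.println(describe(original)); // backing data is untouched
    }
}
```

Only the reference whose defining class and name both match is reported under the new name; the same-named method in the other class is left alone. In real dexlib2 code the same pattern would wrap the library's own interfaces instead of these toy types.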

it seems there has to be a more efficient way of dealing with all ReferenceInstructions of a DexFile

if you do the create-view-rather-than-copy thing, then this is the most efficient way of achieving your goal. but... most of what needs to be done is already done for you in dexlib2!

take a look at this: https://github.com/JesusFreke/smali/blob/master/dexlib2/src/main/java/org/jf/dexlib2/rewriter/DexRewriter.java#L43-L67

CAVEAT: that sample used to have a bug that did not process arrays of the renamed type. i reported this in a github issue, look for it. i know code was committed to the project to fix the issue as a result of my report, but i don't know if the sample was updated to use the new code. look for my issue and read the thread.

Unbrick commented 3 years ago

Thank you for your detailed response! I'll work through it and report back as soon as i get something to work!

My main goal is to build a graph based on smali instructions and try to figure out which paths are redundant, unnecessary, or can be computed before runtime, in order to remove stub instructions placed by DexGuard, for example. This should help with analyzing obfuscated code to figure out what the code does.

Additionally, I want to thank you for your awesome work on DexPatcher. I used that project to create JodelPatched, but the build chain broke due to obfuscation. I hope I can get it back up and running using your new beta update!

Lanchon commented 3 years ago

I hope I can get it back up and running using your new beta update

the groundwork is there. i need to write the new gradle plugin and docs and samples. :(