DexPatcher / dexpatcher-tool

Android Dalvik bytecode patcher.
https://dexpatcher.github.io/
GNU General Public License v3.0
426 stars 79 forks source link

Unable to map class interface declarations #35

Open andrewleech opened 4 years ago

andrewleech commented 4 years ago

Hi there, I've been working on automating the mapping.txt generation for one of my projects lately and have run into an issue with some of the --map-source --map mapping.txt usage. Not sure if it's a bug or a setting I'm missing.

In a situation like

import com.foo.iface;
[noimplements.zip](https://github.com/DexPatcher/dexpatcher-tool/files/4997762/noimplements.zip)

public final class bar extends par implements iface {
    public iface instance$;
}

with com.foo.iface

package com.foo;

public interface iface<P1, P2, R> {
    ...
}

I'm trying to map iface to myface, eg mapping.txt

com.foo.iface -> com.foo.myface
java -jar dexpatcher-1.8.0-beta1.jar --map-source --map mapping.txt --output patched.dex out.dex

Now in patched.dex com.foo.iface class is correctly renamed to com.foo.myface

package com.foo;

public interface myface<P1, P2, R> {
    ...
}

and in the original class, the instance variable type is changed, however the implements details haven't:

import com.foo.iface;
import com.foo.myface;

public final class bar extends par implements iface {
    public myface instance$;
}

Is there a --pre-transform or similar setting I'm missing?
I've attached the reproduction case files listed here if it is a bug / missing feature.

Thanks! noimplements.zip

Lanchon commented 4 years ago

hi, thanks for reporting.

implements iface

as described, this is a bug.

Is there a --pre-transform or similar setting I'm missing?

no.

strange... as there is no patch file(s) specified in the command line, dxp is just applying transforms. the strange thing is that class name rewriting is mostly a responsibility of dexlib2. my code rewrites class names/references generally without knowing what the references are (a class definition, an extends clause, an implement clause, part of a member reference, etc). so generally speaking, it is odd (but not impossible) that some names/refs are rewritten and other are not. (yes, this being a dexlib2 bug would explain things, but i consider it unlikely.)

i do not remember the first thing about the code that i wrote a few months ago. i will have to look into this further...

andrewleech commented 4 years ago

Cheers, yeah my current "process" as such is running dexpatcher in a script with the mapping file like this against the original app dex files to create a remapped copy of the dex files. I then decompile the remapped dex files to create a full set of example source files to use as reference for exploring and creating patches. The patches are applied with dexpatcher-gradle the normal way.

Lanchon commented 4 years ago

my current "process"

this is similar to one workflow planed for the next dxp-gradle... when it happens

Lanchon commented 4 years ago

ok, so sorry for taking this long.

so today i got to take a look at this. after a quick look dexlib2 didn't seem to be causing this. next i looked at my code... man! what the heck, it took me a long while to understand what was going on there after a few months of not looking. way too involved, i don't think i'll be able to write another transform if i ever needed to.

anyway, i looked at the code for two whole hours and couldn't find the bug. then i turned to your sample and... there is no bug! everything works as intended. the bug is in the decompiler you used.

take a look at your own patched.dex... image

CFR outputs implements iface<ge7, s67<? super v47> but Krakatau gets it right and outputs implements com.foo.myface. why this? as you can see in the smali source at the far right, the class indeed implements myface but the signature refers to iface.

the signature is (the equivalent of) a string attribute. in java bytecode it encodes a generic type using a defined syntax before type erasure generates the raw type that is actually used in bytecode. AFAIK it is ignored by the JVM and is only used by the reflection API to provide info on generic types.

in dalvik bytecode it is even less: a string of unspecified content.

this is why:

decompilers need signatures in order to better decompile generic code. but they should know that their content could be missing, incorrect, or doctored to attack the decompiler.

this is a CFR bug (and also a Procyon bug). this bug MAY have been fixed in current CFR, as i remember a discussion relative to bad signatures.

i have no plans to map signatures in dxp. i expect properly configured obfuscators to strip them: there seems to be (almost) no reason to leave all that valuable info in bytecode if it it's (mostly) ignored by the VM. i would support them if dexlib2 did, but it doesn't.

obfuscators could fake signatures instead of just strip them. it could change ArrayList<HashMap> into ArrayList<Map> or List<HashMap> (if those types were obfuscated) in order to create subtle decompilaton errors. the only defense for this is to have the decompiler ignore the signatures altogether. yes, you loose generic decompilation (until such time when decompilers can infer genericity based on bytecode alone), but at least decompilation will be correct. and again, i ASUME most obfuscators should strip sigs anyway.

what to do:

(but even with sanity checks, decompilation of mapped code will be incorrect of the mapped type is a type parameter. the only solution to get correct code would be to disable signature processing.)

andrewleech commented 4 years ago

Ah, thanks so much for delving down into it, and sorry for asking about a bug that wasn't really from your codebase!

The obfuscator being used here clearly writes signatures that match the new obfuscated names. If the mapping functions in dexlib /dexpatcher aren't going to be able to re-write them to match the new names I should look for a way to disable their use entirely in the decompilers, or pre-process the dex/jar to manually strip them out.

I was already using the latest CFR with signature verification updates, though I might not have been using the new signature fixup switch properly in this test here - I was initially using jadx and getting the same results so hadn't thought it was the decompiler again.

Thanks again, you've been a big help!

andrewleech commented 4 years ago

Actually, I just re-tried a build from the current CFR master with the new switch to turn off signatures and it still gives the incorrect implements iface output: java -jar ..\cfr-0.151-5569f067.jar --usesignatures false .\patched-dex2jar.jar

Hi @leibnitz27 I think I've found another weird signature issue with re-mapping and CFR. I would assume it's related to https://github.com/DexPatcher/dexpatcher-tool/issues/33#issuecomment-610169079 but the solution you added for that one doesn't seem to fix this.

Lanchon commented 4 years ago

CFR with sanity checks should reject the siganture i think as there is no way for type erasure to go from the sig to the raw type. also --usesignatures false should fix all this... weird

Lanchon commented 4 years ago

hi @leibnitz27,

it seems --usesignatures false does not work as intended. i did not file a bug because i did not test myself, but @andrewleech did. he says he tested "a build from the current CFR master with the new switch to turn off signatures". do you want an issue in your tracker?

basically there is a zip in the OP here which has a patched.dex containing a bar class having an implements clause that mismatches its signature, and CFR does 2 things wrong:

here is a view of the problem bytecode...

image

thanks!