facebook / redex

A bytecode optimizer for Android apps
https://fbredex.com/
MIT License
6.03k stars 653 forks source link

Inconsistency issue: expected type REFERENCE, but found INT instead #424

Closed parind closed 2 years ago

parind commented 5 years ago

I am testing redex on an apk which is obfuscated with dead code. This tool adds dead code at binary dalvik bytecode level. The sample I am using can be found on virustotal using this SHA256 :- 9613a12ccb7cefd14370cbc5d23ee795d5e83a744639fcaa16ed4de35257e2f9 .

Here' the stdout

WARNING: Unable to find RequiresApi annotation. It's either unused (okay) or been deleted (not okay)
WARNING: No inliner config
ABORT! Inconsistency found in Dex code for Lc/e/a/p/a/AudioPlayerService$MediaButtonReceiver;.b:(Lc/e/a/p/a/AudioPlayerService;)V.
 at instruction 'SPUT_OBJECT v4, Lc/e/a/p/a/AudioPlayerService$MediaButtonReceiver;.a:Lc/e/a/p/a/AudioPlayerService;' @ 0x564e5f551580 for register v4: expected type REFERENCE, but found INT instead

Code:
[0x560cac047320] OPCODE: IOPCODE_LOAD_PARAM_OBJECT v4
[0x560cac047340] OPCODE: CONST v3, 1
[0x560cac047360] OPCODE: CONST_STRING "audio"
[0x560cac0475e0] OPCODE: IOPCODE_MOVE_RESULT_PSEUDO_OBJECT v0
[0x560cac047380] OPCODE: CONST v3, 1
[0x560cac0473a0] OPCODE: INVOKE_VIRTUAL v4, v0, Lc/e/a/p/a/AudioPlayerService;.getSystemService:(Ljava/lang/String;)Ljava/lang/Object;
[0x560cac0473c0] OPCODE: MOVE_RESULT_OBJECT v0
[0x560cac0473e0] OPCODE: CHECK_CAST v0, Landroid/media/AudioManager;
[0x560cac047600] OPCODE: IOPCODE_MOVE_RESULT_PSEUDO_OBJECT v0
[0x560cac047400] OPCODE: IF_NEZ v0
[0x560cac047420] OPCODE: RETURN_VOID 
[0x560cac047040] TARGET: SIMPLE 0x560cac047400
[0x560cac047440] OPCODE: NEW_INSTANCE Landroid/content/ComponentName;
[0x560cac047620] OPCODE: IOPCODE_MOVE_RESULT_PSEUDO_OBJECT v1
[0x560cac047460] OPCODE: CONST v3, 5
[0x560cac047480] OPCODE: CONST_CLASS Lc/e/a/p/a/AudioPlayerService$MediaButtonReceiver;
[0x560cac047640] OPCODE: IOPCODE_MOVE_RESULT_PSEUDO_OBJECT v2
[0x560cac0474a0] OPCODE: CONST v3, 7
[0x560cac0474c0] OPCODE: INVOKE_DIRECT v1, v4, v2, Landroid/content/ComponentName;.<init>:(Landroid/content/Context;Ljava/lang/Class;)V
[0x560cac0474e0] OPCODE: CONST v3, 5
[0x560cac047500] OPCODE: INVOKE_VIRTUAL v0, v1, Landroid/media/AudioManager;.unregisterMediaButtonEventReceiver:(Landroid/content/ComponentName;)V
[0x560cac047520] OPCODE: CONST v4, 0
[0x560cac047540] OPCODE: OR_INT v3, v3, v4
[0x560cac047560] OPCODE: SPUT_OBJECT v4, Lc/e/a/p/a/AudioPlayerService$MediaButtonReceiver;.a:Lc/e/a/p/a/AudioPlayerService;
[0x560cac047580] OPCODE: RETURN_VOID 

The opcode at 0x564e5f551560 is what seems to me causing the error.

The I have tried using LocalDcePass and RegAllocPass; also tried only LocalDcePass and polymorphic_constants = true, under ir_type_checker, but no help, So I guess needs some expertise fix. Also theres a lot of similar code everywhere, like or-int, div-int, mul-int.

justinjhendrick commented 5 years ago

I know you saw it already @parind, but for future readers, see #391 because it's similar and may be helpful.

justinjhendrick commented 5 years ago

If the code truly is dead, I wonder why LocalDce isn't removing it. Could you paste the entire method that redex is finding a type error inside of?

Also could you turn on logging with this bash environment variable, run redex again and show us the logs, please?

$ export TRACE=1
$ redex ...
parind commented 5 years ago

I have updated the OP, added the entire method. After using TRACE=1, these are the results. After this everything is same as in OP.

Using config /content/dead.config
Using binary (default)
Extracting apk...
Detected dex mode Api21DexMode
Unpacking dex files
Detecting Application Modules
Unpacking APK finished in 0.66 seconds
Running redex-all on 2 dex files 
Running redex binary at /tmp/redex.LABPyU/redex-all
Trace settings:
TRACEFILE=
SHOW_TIMESTAMPS=
SHOW_TRACEMODULE=
TRACE_METHOD_FILTER=
justinjhendrick commented 5 years ago

Thanks for the full code segment. It seems that all the write to v3 are indeed dead writes and LocalDce should be able to remove them.

Something is missing from these TRACE logs... It seems that only the logs from redex.py are printed here, but none of the logs from the c++ binary (named redex-all). Could you try debug and make sure that the environment variable TRACE is reaching redex-all? The issue could also be with printing to the terminal, you could try setting TRACEFILE to a filepath and see if the logs reach that file

What's in /content/dead.config? What are your full command-line arguments to redex? Are you passing a proguard config to redex?

parind commented 5 years ago

I tried using Tracefile too. It shows the same text from the TRACE=1, I tried before. Dead.config consists the LocalDcePass pass to redex, and polymorphic_constants set to true under ir_type_checker. I am only passing "-c" for the config and "-o" for setting an output file, as command line arguments. And No, I am not passing a proguard config. However, I am using a redex-all binary built by cmake (if that helps). Apart from that, I noticed a big file size difference between the redex-all binary built on cmake (16MB) and make (100mb+), Is this normal ?

justinjhendrick commented 5 years ago

LocalDce will refuse to run if redex isn't passed any proguard config: https://github.com/facebook/redex/blob/d605561bce42ef361e95a34019dea7fa98038f29/opt/local-dce/LocalDce.cpp#L324-L329

A while back, we added this check to a bunch of passes in order to make them safer. In some cases (like RemoveUnreachablePass) it's unsafe to run the pass without proguard rules (RemoveUnreachable deletes classes/methods/fields that aren't statically reachable, however Redex can't see reflection or JNI entry points). However, LocalDce is not like RemoveUnreachable in that way. I think LocalDce should be safe without a proguard config because it will do less optimizing without any proguard rules.

All this to say. You have two paths forward: 1) pass a proguard config into redex. Though the danger here is that you may enable other optimizations that you don't intend to (which may delete code that's reachable via ways that redex cannot see) 2) Delete this check in LocalDce on your copy of redex locally. I'm thinking about deleting that check in master too. Feel free to submit a pull request if you're interested in contributing

parind commented 5 years ago

Well, I will try them both. Is the "proguard-android.txt" a config ?

justinjhendrick commented 5 years ago

Yes, I think that's a proguard config, assuming you're referring to this file: https://android.googlesource.com/platform/sdk/+/android-4.1.2_r2/files/proguard-android.txt

or something similar

parind commented 5 years ago

I have tried both the methods. Works fine ! But redex jumbles the code in some classes, is this supposed to happen ? I will submit a pull request too, and also while using cmake, the build fails, so SET(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -pthread") has to be added to CMakeLists.txt.

justinjhendrick commented 4 years ago

Sorry for the long delay. I forgot to answer your questions:

But redex jumbles the code in some classes, is this supposed to happen ?

What exactly do you mean by jumble? Redex does rewrite the code to a more efficient form, which may seem jumbled to a human. Though the code should be semantically equivalent.

parind commented 4 years ago

I came to know about it later. Even though it optimizes the code in smali, it decompiles in Java for me, which is okay. I was confused then when I found some of the code in a method smali was placed 6 to 10 lines above / below.

parind commented 4 years ago

@justinjhendrick I am working on another project now, the project seems to be obfuscated by unknown tool, which basically implements a lot of dead code, and this tool also increments the registers/locals. Does redex have any passes which could bring them to their original register.

So far, I have tried only LocalDcePass, alongwith RegAllocPass.

justinjhendrick commented 4 years ago

Does redex have any passes which could bring them to their original register?

No, I don't think so. The information about the original registers is lost after that tool has obfuscated the program.

... basically implements a lot of dead code

LocalDce should handle this. Are you seeing LocalDce remove it?

this tool also increments the registers/locals

RegAlloc should re-number all of the registers but it doesn't not touch the debug info for the locals.

parind commented 4 years ago

LocalDce removes the dead code, so thats not a Problem. But RegAlloc replaces the register with another number, instead of subtracting 1 from it.

yuxuanchen1997 commented 4 years ago

Redex wasn't designed to be used as a deobfuscator. Also almost certainly, there's not one single deobfuscation strategy that fits all obfuscators. Redex's reg alloc is designed for maximized efficiency. You can write a pass to renumber the registers yourself, although I don't see why this is necessary.

NTillmann commented 2 years ago

The original problem seems to have been solved.