jrgifford / androguard

Automatically exported from code.google.com/p/androguard
Apache License 2.0

wrong integer on dad decompiler #136

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. When a resource is loaded, its id is stored as an integer in the Dalvik 
file and loaded via a const vX, n opcode.
2. pretty_show() displays the correct value.
3. When using source() with decompiler='dad', the output is 
1.76787404569e+38 instead of 2131034116.
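
The wrong value is what you get when the 32 bits of the integer are reinterpreted as an IEEE-754 single-precision float. A minimal sketch that reproduces the number outside androguard (the helper name int_bits_as_float is made up for illustration and is not part of androguard):

```python
import struct

def int_bits_as_float(value):
    """Reinterpret the 32 bits of a signed int as a single-precision float."""
    return struct.unpack("=f", struct.pack("=i", value))[0]

resource_id = 2131034116                      # 0x7F050004, the id from this report
as_float = int_bits_as_float(resource_id)     # roughly 1.7678740e+38
print(hex(resource_id), as_float)
```

The pack/unpack pair here has the same shape as the one used in opcode_ins.py, which suggests the decompiler is reading the right bits and only picking the wrong type.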

What version of the product are you using? On what operating system?
ubuntu, hg changeset 560:1e64...

Original issue reported on code.google.com by 5hp...@gmail.com on 29 Aug 2013 at 10:46

GoogleCodeExporter commented 9 years ago
Save the patch below to a file called patch.txt and then run:

$ patch -p0 opcode_ins.py patch.txt

Does this solve the problem? I don't know whether this is a good fix for every 
kind of code...

Here is the patch:

--- src/androguard/decompiler/dad/opcode_ins.py 2013-09-16 14:04:20.413244360 +0200
+++ /home/knoppik/development/decompile/androguard/androguard/decompiler/dad/opcode_ins.py  2013-08-20 10:39:14.309923617 +0200
@@ -251,7 +251,7 @@
 # const vAA, #+BBBBBBBB ( 8b, 32b )
 def const(ins, vmap):
     logger.debug('Const : %s', ins.get_output())
-    value = unpack("=f", pack("=i", ins.BBBBBBBB))[0]
+    value = unpack("=i", pack("=i", ins.BBBBBBBB))[0]
     cst = Constant(value, 'F', ins.BBBBBBBB)
     return assign_const(ins.AA, cst, vmap)

@@ -259,7 +259,7 @@
 # const/high16 vAA, #+BBBB0000 ( 8b, 16b )
 def consthigh16(ins, vmap):
     logger.debug('ConstHigh16 : %s', ins.get_output())
-    value = unpack('=f', '\x00\x00' + pack('=h', ins.BBBB))[0]
+    value = unpack('=i', '\x00\x00' + pack('=h', ins.BBBB))[0]
     cst = Constant(value, 'F', ins.BBBB)
     return assign_const(ins.AA, cst, vmap)
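
For reference, the '\x00\x00' + pack('=h', ...) trick in consthigh16 amounts to a 16-bit left shift, but only on a little-endian host, because '=' means native byte order. A small sketch comparing the two forms with explicit little-endian format strings (both function names are made up for illustration):

```python
import struct

def high16_via_bytes(bbbb):
    # Two zero bytes followed by the 16-bit literal, read as a little-endian int32.
    return struct.unpack("<i", b"\x00\x00" + struct.pack("<h", bbbb))[0]

def high16_via_shift(bbbb):
    # The same value computed arithmetically, independent of host byte order.
    return bbbb << 16

for literal in (32515, 32518, 0, -1, -32768):
    assert high16_via_bytes(literal) == high16_via_shift(literal)
```

With a native-order format on a big-endian machine the byte concatenation would put the literal in the low half instead, so the shift form is probably the safer way to write it.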

Original comment by mknoppi...@gmail.com on 16 Sep 2013 at 12:12

GoogleCodeExporter commented 9 years ago
My patch is just a quick-and-dirty fix, because the Dalvik opcodes "const" and 
"const/high16" are intended for loading float constants into registers. But 
evidently integer values are sometimes loaded with const/high16 in Dalvik 
code as well. Here is an example.

The Dalvik code output of androguard looks like this:

709-    1  (00000006) const/high16        v2, 32515

So the value is an integer, but androguard treats it as a float because it 
regards every const/high16 value as a float.

But when I look at the same(!) line of code with dedexer, there is a 
difference:

0003: const/high16 v2, #int 2130903040 // #7f03

So the original .dex code apparently provides meta information that tells 
const/high16 which value type has to be loaded into the register. It seems that 
androguard loses this information somewhere. Please fix that.

Thanks

Original comment by mknoppi...@gmail.com on 17 Sep 2013 at 8:33

GoogleCodeExporter commented 9 years ago
Hm, I see... I checked the Dalvik bytecode format documentation and it says 
some interesting things:
const/high16 vAA, #+BBBB0000
A: destination register (8 bits)
B: signed int (16 bits)

So const/high16 is an integer in any case.

But the general description says: Type-specific opcodes are suffixed with their 
type (or a straightforward abbreviation), one of: -boolean -byte -char -short 
-int -long -float -double -object -string -class -void.

But then where is const-double/high16, or any const-double?
How is this encoded in the bytecode?
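
As far as I can tell from the same documentation, there is no const-double: the const-wide family fills a 64-bit register pair with raw bits, and those bits may later be used as a long or a double. const-wide/high16 vAA, #+BBBB000000000000 carries only the top 16 bits. A hedged sketch of how such a literal could be decoded (the function name is made up for illustration and is not androguard code):

```python
import struct

def decode_const_wide_high16(bbbb):
    """Return the 64-bit pattern of a const-wide/high16 literal as (long, double)."""
    long_value = bbbb << 48                      # literal occupies the top 16 bits
    double_value = struct.unpack("<d", struct.pack("<q", long_value))[0]
    return long_value, double_value

# 0x3FF0 in the top 16 bits is the bit pattern of the double 1.0.
print(decode_const_wide_high16(0x3FF0))
```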

Original comment by 5hp...@gmail.com on 3 Oct 2013 at 10:05

GoogleCodeExporter commented 9 years ago
Hi,
this issue still persists in the current androguard version (using revision 489).
Sometimes it decodes correctly, like:
const v11 2131099649

Sometimes it decodes wrongly:
const/high16 v11 32518 (which should be const v11 2131099648)

sample to test:
apk: md5 da1a9a13503993729cacfa854a90e56f
dex: md5 e15070aec2e15e903e5611dcb70f1e33

Original comment by 5hp...@gmail.com on 11 Dec 2014 at 12:20

GoogleCodeExporter commented 9 years ago
Okay, I checked a little with IDA and found that even IDA says that 
this is a const/high16 instruction.
I think the reason is that the number being loaded is 
2131099648 (0x7F060000), which has all 16 low bits set to 0. When 
2131099649 (0x7F060001) is used later, it is encoded with the const opcode.
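
A quick way to check that observation for any constant is to test its low 16 bits. This is only a sketch of the condition described above (the helper name is made up, and a real compiler may still prefer a shorter encoding such as const/4 for small values like 0):

```python
def fits_const_high16(value):
    """True if a 32-bit constant has all 16 low bits clear."""
    return (value & 0xFFFF) == 0

print(fits_const_high16(0x7F060000))  # 2131099648, seen as const/high16
print(fits_const_high16(0x7F060001))  # 2131099649, seen as const
```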

I think the issue is that androguard just needs to shift the value left by 16 
bits.
The documentation says:
> Move the given literal value (right-zero-extended to 32 bits) into the 
specified register.
I think that means the literal stored in the instruction is only 16 bits wide 
but represents a 32-bit value (of which only the high 16 bits are actually 
provided).
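
To illustrate the shift with the numbers from this thread, here is a small sketch that expands a 16-bit const/high16 literal and shows both readings of the resulting 32 bits, since the instruction itself does not say whether the register is later used as an int or a float (the function name is made up for illustration and is not androguard code):

```python
import struct

def decode_const_high16(bbbb):
    """Expand a signed 16-bit const/high16 literal to (int_value, float_value)."""
    int_value = bbbb << 16                       # right-zero-extend to 32 bits
    float_value = struct.unpack("<f", struct.pack("<i", int_value))[0]
    return int_value, float_value

print(decode_const_high16(32518)[0])   # 2131099648, i.e. 0x7F060000
print(decode_const_high16(32515)[0])   # 2130903040, the value dedexer printed
print(decode_const_high16(0x3F80))     # (1065353216, 1.0): same opcode, float use
```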

I think a patch for this would look like this:
diff --git a/androguard/decompiler/dad/opcode_ins.py b/androguard/decompiler/dad/opcode_ins.py
index 1c5b4bd..9c8665a 100644
--- a/androguard/decompiler/dad/opcode_ins.py
+++ b/androguard/decompiler/dad/opcode_ins.py
@@ -258,8 +258,8 @@ def const(ins, vmap):
 # const/high16 vAA, #+BBBB0000 ( 8b, 16b )
 def consthigh16(ins, vmap):
     logger.debug('ConstHigh16 : %s', ins.get_output())
-    value = unpack('=f', '\x00\x00' + pack('=h', ins.BBBB))[0]
-    cst = Constant(value, 'I', ins.BBBB)
+    value = unpack('=f', pack('=i', ins.BBBB<<16))[0]
+    cst = Constant(value, 'I', ins.BBBB<<16)
     return assign_const(ins.AA, cst, vmap)

Original comment by 5hp...@gmail.com on 11 Dec 2014 at 1:20