plum-umd / redexer

The Redexer binary instrumentation framework for Dalvik bytecode
http://www.cs.umd.edu/projects/PL/redexer/
Other
164 stars 33 forks source link

insert instructions bug #4

Open kuk0 opened 11 years ago

kuk0 commented 11 years ago

the insrt_insns_under_off function in modify.ml is buggy: you move the instruction which was originally at the cursor if the instruction was in a try/catch block, it will end up out of it! note that this bug is quite subtle: if there are multiple instructions in a try/catch block, the inserted instructions are included in the try/catch block (pointers to the start and end of the block don't change); the bug only shows up when there is only a single instruction in a try/catch block

fix: go through the try_item list and move the pointers...

btw. what is the motivation for insrt_insns_under_off in the logging module? why not to use the simple insrt_insns?

jsjeon commented 11 years ago

In order not to alter try/catch block, insrtinsns(under|over)_off in modify module 1) moves the instruction at the current cursor into a new place (lines 510--513 and 534--537) 2) overwrites the instruction at the current cursor with the 1st or last element of insns (lines 514--516 and 538--540) 3) then inserts the remaining insns as usual. In this manner, the absolute address at the current cursor is preserved as-is; if either start or end of a try/catch block refers to that address, these steps maintain that address for the try/catch block, while inserting instructions. Then, dump module will rearrange instructions in the code_item accordingly.

I believe comments above each function describe how bytecode snippets would look like before and after insertions. E.g., insrt_insns_under_off, as depicted, newly added insns will be placed at the offset of the current cursor, thus the instruction at the current cursor should be shifted below. Note that, if a try/catch block points to the current offset, it still points to that offset, and what the function does is actually pushing the given insns into that try/catch block.

In addition to addresses in try/catch blocks, you may notice that we don't need to alter any addresses in bytecode, e.g., goto instructions, if-test instructions, etc. In fact, inserting merely everything (class/method/field def. as well as instructions) doesn't require sophisticated editing. (Refer to new_class, new_field, and new_method; they just insert those definitions into the end of corresponding lists.) This is because dump module will rearrange everything according to the dex format.

The motivation for insrtinsns(under|over)_off is, as stated at comments in logging module (lines 525 and 570), not to alter control-flow of the method edited. Also, to log entrance and exit of certain APIs, we need both under and over schemes.

kuk0 commented 11 years ago

yes, almost... :) if you have multiple instructions in a try/catch block, like this:

try {
    C;
    D;
    E;
} catch ...

in dex it looks sth. like this:

(1) C    <----- start of try
(2) D
(3) E    <----- end of try

if you now insrt_insns_under_off instructions A; B, you end up with sth. like this (the numbers in parentheses give the order of instructions, not the offsets):

(1) A    <----- start of try
(4) D
(5) E    <----- end of try
(3) C
(2) B

which after reordering during the dex dump gives you the correct

try {
    A;
    B;
    C;
    D;
    E;
} catch ...

and indeed, we didn't have to move the start/end of try offsets and instructions A; B; get included in the try/catch block;

however, what happens, when you have only a single instruction in a try/catch block?

try {
    C;
} catch ...

in dex it looks sth. like this:

(1) C    <----- start of try; <----- end of try

after inserting instructions A; B; you get:

(1) A    <----- start of try; <----- end of try
(3) C
(2) B

that is:

try {
    A;
} catch ...
B;
C;

i.e. instruction C is kicked out of the try/catch block because you don't change the end of try offset

kmicinski commented 7 years ago

Hi @kuk0. I have observed and began to fix this issue in an experimental Redexer branch which I plan to merge soon. Thank you so much for taking the time to point it out--and sorry for not getting around to it before!

Let me point out one other complication with try/catch blocks:

it turns out that if you have a basic block in Dalvik that is within a try/catch block, adding instructions to that block can add control flow edges. The reason is because Dalvik--unlike the JVM--considers each individual instruction within a try block to either throw or not. If a basic block within a try cannot possibly throw an exception (defined in a table here: https://android.googlesource.com/platform/dalvik/+/kitkat-release/opcode-gen/bytecode.txt#86), it will not draw an edge between that block and the exception handler. This means that if you instrument code within a try-catch block, and you add an instruction that invokes a method (any method, since any method invocation is potentially considered to throw), you will possibly be changing the semantics of the program!

The fix for this is to--when you're inserting instructions into the middle of an exception handler, split the try block (intraprocedurally) down the middle: the instructions from the start of the try to before the ones you're inserting, and the ones after you're inserting to the end of that block.

kmicinski commented 7 years ago

I have fixed up the function @kuk0 points out in the optimized-logging branch now, and will merge in the next few weeks.