Open kuk0 opened 11 years ago
In order not to alter try/catch block, insrtinsns(under|over)_off in modify module 1) moves the instruction at the current cursor into a new place (lines 510--513 and 534--537) 2) overwrites the instruction at the current cursor with the 1st or last element of insns (lines 514--516 and 538--540) 3) then inserts the remaining insns as usual. In this manner, the absolute address at the current cursor is preserved as-is; if either start or end of a try/catch block refers to that address, these steps maintain that address for the try/catch block, while inserting instructions. Then, dump module will rearrange instructions in the code_item accordingly.
I believe comments above each function describe how bytecode snippets would look like before and after insertions. E.g., insrt_insns_under_off, as depicted, newly added insns will be placed at the offset of the current cursor, thus the instruction at the current cursor should be shifted below. Note that, if a try/catch block points to the current offset, it still points to that offset, and what the function does is actually pushing the given insns into that try/catch block.
In addition to addresses in try/catch blocks, you may notice that we don't need to alter any addresses in bytecode, e.g., goto instructions, if-test instructions, etc. In fact, inserting merely everything (class/method/field def. as well as instructions) doesn't require sophisticated editing. (Refer to new_class, new_field, and new_method; they just insert those definitions into the end of corresponding lists.) This is because dump module will rearrange everything according to the dex format.
The motivation for insrtinsns(under|over)_off is, as stated at comments in logging module (lines 525 and 570), not to alter control-flow of the method edited. Also, to log entrance and exit of certain APIs, we need both under and over schemes.
yes, almost... :) if you have multiple instructions in a try/catch block, like this:
try {
C;
D;
E;
} catch ...
in dex it looks sth. like this:
(1) C <----- start of try
(2) D
(3) E <----- end of try
if you now insrt_insns_under_off instructions A; B, you end up with sth. like this (the numbers in parentheses give the order of instructions, not the offsets):
(1) A <----- start of try
(4) D
(5) E <----- end of try
(3) C
(2) B
which after reordering during the dex dump gives you the correct
try {
A;
B;
C;
D;
E;
} catch ...
and indeed, we didn't have to move the start/end of try offsets and instructions A; B; get included in the try/catch block;
however, what happens, when you have only a single instruction in a try/catch block?
try {
C;
} catch ...
in dex it looks sth. like this:
(1) C <----- start of try; <----- end of try
after inserting instructions A; B; you get:
(1) A <----- start of try; <----- end of try
(3) C
(2) B
that is:
try {
A;
} catch ...
B;
C;
i.e. instruction C is kicked out of the try/catch block because you don't change the end of try offset
Hi @kuk0. I have observed and began to fix this issue in an experimental Redexer branch which I plan to merge soon. Thank you so much for taking the time to point it out--and sorry for not getting around to it before!
Let me point out one other complication with try/catch blocks:
it turns out that if you have a basic block in Dalvik that is within a try/catch block, adding instructions to that block can add control flow edges. The reason is because Dalvik--unlike the JVM--considers each individual instruction within a try block to either throw or not. If a basic block within a try cannot possibly throw an exception (defined in a table here: https://android.googlesource.com/platform/dalvik/+/kitkat-release/opcode-gen/bytecode.txt#86), it will not draw an edge between that block and the exception handler. This means that if you instrument code within a try-catch block, and you add an instruction that invokes a method (any method, since any method invocation is potentially considered to throw), you will possibly be changing the semantics of the program!
The fix for this is to--when you're inserting instructions into the middle of an exception handler, split the try
block (intraprocedurally) down the middle: the instructions from the start of the try to before the ones you're inserting, and the ones after you're inserting to the end of that block.
I have fixed up the function @kuk0 points out in the optimized-logging
branch now, and will merge in the next few weeks.
the insrt_insns_under_off function in modify.ml is buggy: you move the instruction which was originally at the cursor if the instruction was in a try/catch block, it will end up out of it! note that this bug is quite subtle: if there are multiple instructions in a try/catch block, the inserted instructions are included in the try/catch block (pointers to the start and end of the block don't change); the bug only shows up when there is only a single instruction in a try/catch block
fix: go through the try_item list and move the pointers...
btw. what is the motivation for insrt_insns_under_off in the logging module? why not to use the simple insrt_insns?