Closed Lukas-Dresel closed 6 years ago
Hi,
I am afraid I didn't fully understand the question. Some questions though:
Hey, thanks for the quick response!
The original block without the optimization worked without a problem, so it must be something that happened after the merging of the basic blocks. (Or at least, symbolically executing the unmerged blocks independently of each other worked.)
About the minimal example, I'm not quite sure; I can try minimizing things away until it no longer happens. I'm using two simplifications: one that emulates a block and replaces it with the symbol assignments the symbolic execution returns, and one that merges the basic blocks.
The problem occurs when first merging the basic blocks and then replacing the merged basic block with the result of the symbolic execution.
The merger, heavily borrowed from bbl_simplifier:
_parent = MatchGraphJoker(restrict_in=False)
_son = MatchGraphJoker(restrict_out=False)
_expgraph = _parent >> _son

def _merge_ir_blocks(dg, graph):
    """Graph simplification merging an ir_bloc that has one and only one
    son with that son, if the son has one and only one parent"""
    # Labels to ignore, because their blocks have been removed from the graph
    to_ignore = set()
    for match in _expgraph.match(graph):
        # Get matching blocks
        start_label, succ_label = match[_parent], match[_son]
        start_bloc = graph._blocks[start_label]
        succ_bloc = graph._blocks[succ_label]

        # Ignore already deleted blocks
        if (start_label in to_ignore or
                succ_label in to_ignore):
            continue

        # Drop the intermediate IRDst assignments before concatenating
        irdst = ExprId('IRDst')
        for assign_block in start_bloc.irs:
            if irdst in assign_block:
                del assign_block[irdst]
        start_bloc.irs += succ_bloc.irs
        start_bloc.lines += succ_bloc.lines
        start_bloc.invalidate_dst()

        succ_bloc.irs = [AssignBlock([ExprAff(irdst, ExprInt(0, size=32))])]
        succ_bloc.lines = []
        succ_bloc.invalidate_dst()

        # Merge block: rewire the successors and delete the merged node
        for nextb in graph.successors_iter(succ_label):
            graph.add_edge(start_label, nextb)
        graph.del_node(succ_label)
        to_ignore.add(succ_label)

ir_bbl_simplifier = DiGraphSimplifier()
ir_bbl_simplifier.enable_passes([_merge_ir_blocks])
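For reference, the merge criterion used above (a block with exactly one successor whose only parent is that block) can be sketched independently of miasm on a plain adjacency-dict graph. All names below are illustrative, not miasm's:

```python
def merge_linear_blocks(succs, blocks):
    """succs: {label: [successor labels]}, blocks: {label: [statements]}.
    Collapse an edge A -> B when A has exactly one successor B and
    B has exactly one predecessor A."""
    # Build the predecessor map once
    preds = {}
    for src, dsts in succs.items():
        for dst in dsts:
            preds.setdefault(dst, []).append(src)
    changed = True
    while changed:
        changed = False
        for a in list(succs):
            dsts = succs.get(a, [])
            if len(dsts) != 1:
                continue
            b = dsts[0]
            if b == a or len(preds.get(b, [])) != 1:
                continue
            # Concatenate B's statements onto A and rewire A's successors
            blocks[a] = blocks[a] + blocks.pop(b)
            succs[a] = succs.pop(b, [])
            for d in succs[a]:
                preds[d] = [a if p == b else p for p in preds[d]]
            del preds[b]
            changed = True
            break
    return succs, blocks
```

This is only the graph-rewiring part; the IRDst bookkeeping from the pass above has no equivalent here.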
The symbexec simplifier: (also borrowed mostly from somewhere else, don't remember)
symbols_init = {}
for r in ir_arch.arch.regs.all_regs_ids:
    x = ExprId(r.name, r.size)
    x.is_term = True
    symbols_init[r] = x

sb = symbexec(ir_arch, dict(symbols_init))
sb.emulbloc(ir_bloc)

eqs = []
irdst = ExprId('IRDst')
eip = ExprId('EIP')
for n_w in sb.symbols:
    v = sb.symbols[n_w]
    if n_w in symbols_init and symbols_init[n_w] == v:
        continue
    if n_w == irdst or n_w == eip:
        # We want the IRDst to be at the end of the block for easy viewing
        continue
    eqs.append(ExprAff(n_w, v))

# Finally, add the IRDst and EIP assignments
eqs.append(ExprAff(eip, sb.symbols[eip]))
eqs.append(ExprAff(irdst, sb.symbols[irdst]))

ir_bloc.irs = [AssignBlock(eqs)]
ir_bloc.lines = [None]
ir_bloc.invalidate_dst()
The invalidate_dst() call is a modification I made to miasm that simply clears the dst and linenb_dst fields so that they are recomputed, now that the destination has changed.
The merge simplifier is very, very hacky because of the way the IR graph is handled, but that is a separate issue.
I did not check on an example for now, just a few remarks:
- Could you use all_regs_ids_init instead of the manually crafted symbols_init?
- AssignBlocks are dict-like, so not ordered; I'm not sure the IRDst and EIP ordering has any interest here (I'm trying to reduce your code to spot a minimal example).
- If you run _gen_graph on the resulting IR (after simplification), does it result in the expected graph?
As @serpilliere said, if you have a specific example of two blocks which cause the failure, it could help.
That is very interesting: the same happens with all_regs_ids_init when doing it like this (because you need a dictionary), but it produces a different overlapping-memory error.

symbols_init = {}
for i, r in enumerate(ir_arch.arch.regs.all_regs_ids):
    symbols_init[r] = ir_arch.arch.regs.all_regs_ids[i]
The error can be reproduced with this script (edited for minimal size sample)
from miasm2.analysis.machine import Machine
from miasm2.core.asmbloc import asm_symbol_pool
from miasm2.expression.expression import ExprAff, ExprMem, ExprOp, ExprCond, ExprCompose, ExprSlice
from miasm2.expression.expression import ExprId
from miasm2.expression.expression import ExprInt
from miasm2.ir.ir import irbloc, AssignBlock
from miasm2.ir.symbexec import symbexec, symbols

irs = []
irs.append(AssignBlock([
    ExprAff(ExprId('ESP', size=32), ExprOp('+', ExprId('ESP', size=32), ExprInt(0x20, 32))),
]))
irs.append(AssignBlock([
    ExprAff(ExprMem(ExprOp('+', ExprId('ESP', size=32), ExprInt(0xfffffffc, 32)), size=32), ExprId('EBP', size=32)),
    ExprAff(ExprMem(ExprOp('+', ExprId('ESP', size=32), ExprInt(0xffffffe0, 32)), size=32), ExprInt(0x81b8d2, 32)),
]))

symbol_pool = asm_symbol_pool()
irb = irbloc(symbol_pool.getby_name_create('label01'), irs)

machine = Machine("x86_32")

"""
def test_make_symbols_init():
    all_regs_ids_init = [ExprId("%s" % x.name, x.size) for x in machine.mn.regs.all_regs_ids]
    regs_init = {}
    for i, r in enumerate(machine.mn.regs.all_regs_ids):
        all_regs_ids_init[i].is_term = True
        regs_init[r] = all_regs_ids_init[i]
    return regs_init
"""

ir_arch = machine.ira(symbols())

symbols_init = {}
for i, r in enumerate(ir_arch.arch.regs.all_regs_ids):
    symbols_init[r] = ir_arch.arch.regs.all_regs_ids[i]
#symbols_init = ir_arch.arch.regs.regs_init
#symbols_init = test_make_symbols_init()

symb = symbexec(ir_arch, symbols_init)
symb.emulbloc(irb)

# Modified elements
print 'Modified registers:'
symb.dump_id()
print 'Modified memory:'
symb.dump_mem()
The memory symbol dump shows the following incorrect memory access:
Modified memory:
@224[(ESP+0x24)] EBP[0:32]
@32[ESP] 0x81B8D2
When using ir_arch.arch.regs.regs_init we get
Modified memory:
@32[(ESP_init+0x1C)] EBP_init
@32[ESP_init] 0x81B8D2
However, the ESP_init symbols that appear in that case seem invalid, because they are different symbols from the normal ones: e.g. the assignments write to ESP, but the blocks afterward read ESP_init again, which is incorrect. It then effectively ignores all previous AssignBlocks.
You may try to run the symbolic execution using ir_arch.arch.regs.regs_init and, at the end, replace the init elements with their initial names (using .replace_expr with the inverse dict of ir_arch.arch.regs.regs_init).
What happens here is that initializing a symbolic execution engine using the same expression for a key and its value can lead to weird and unexpected behavior. This is basically what you got when you initialized it the first way (imagine the situation EAX = EAX at init: if the code contains EBX = EAX, the engine will say "hey, I know a value for EAX, and it is EAX", and then again and again).
Here are some explanations, including the how and why it's built this way, to finish with an open question which may lead to a modification in Miasm's symbolic execution core engine.
Here is a recap.
Each AssignBlock can be seen as a transfer function (in the mathematical sense): it links output variables to input variables. So an AssignBlock can be represented as:
(a', b', c', ...) = f(a, b, c, ...)
When you emulate an irs (so multiple AssignBlocks), it will do something like:
(a', b', c', ...) = f(g(h(a, b, c, ...)))
with f, g, h three AssignBlocks.
For example, let's say:
f(x) = x + 1
g(x) = x + 2
h(x) = x + 3
In math, f(g(h(x))) = (((x+3)+2)+1) (so x+6)
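In plain Python, the same composition can be checked directly:

```python
f = lambda x: x + 1
g = lambda x: x + 2
h = lambda x: x + 3

# f(g(h(x))) = ((x + 3) + 2) + 1 = x + 6
assert f(g(h(0))) == 6
assert f(g(h(10))) == 16
```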
The symbolic execution in Miasm works this way: a dictionary links "variables" to their corresponding values. If we have, for example, EAX=1, an entry in this dictionary has EAX as its key and 1 as its corresponding value. When we want to evaluate a variable, we look it up in this dictionary. If it's present, it becomes the associated value; if not, it remains the same. This is the main difference from math. In math, if you want to apply f(x)=x+1 to y, you evaluate f(y) = (y) + 1, so y+1. In math, you are done, and you don't ask yourself whether y can be evaluated to something else. More explanations on this in the next comment.
To do some symbolic computation, we will assign to the register EAX the value EAX_init. So, back to the previous f, g, h. Let's say we have the AssignBlocks:
f: EAX = EAX+1
g: EAX = EAX+2
h: EAX = EAX+3
Those AssignBlocks are the definitions of the equations. Our starting dictionary will be {EAX: EAX_init}. Now, let's apply the function h to our state. The equation is EAX = EAX+3. We have to evaluate EAX (and 3, but that is obviously 3). This gives EAX_init. Note that if you try to evaluate EAX_init, it will fail: it's not a key of the dictionary, so it evaluates to EAX_init itself. The evaluation of EAX+3 on our current state {EAX: EAX_init} therefore gives EAX = EAX_init+3.
As the block is over, we update our current state (dictionary), and it becomes {EAX: EAX_init+3}.
Now, if we apply the g function, it does the same thing, except that EAX now resolves to EAX_init+3. So in the equation EAX = EAX+2, it evaluates EAX+2 as (EAX_init+3) + 2, which simplifies to EAX_init+5.
The last function gives the final state {EAX: EAX_init+6}. Note that this is a state, not an equation. So if you want to get back the equivalent equation, which links the output EAX to the input EAX, you have to use the initial information that EAX_init was the initial value of EAX; this state thus corresponds to the equation EAX = EAX + 6. See?
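The walkthrough above can be condensed into a toy model (plain Python, not miasm code; the expression encoding and all names are illustrative). Expressions are str (variables), int (constants), or ('+', lhs, rhs) tuples:

```python
def evaluate(expr, state):
    """Evaluate an expression against the current state; unknown
    variables (like EAX_init) evaluate to themselves."""
    if isinstance(expr, str):
        return state.get(expr, expr)
    if isinstance(expr, int):
        return expr
    op, lhs, rhs = expr
    lhs, rhs = evaluate(lhs, state), evaluate(rhs, state)
    # Constant-fold (sym + a) + b into (sym + (a+b))
    if isinstance(lhs, tuple) and isinstance(rhs, int):
        return ('+', lhs[1], lhs[2] + rhs)
    return (op, lhs, rhs)

def apply_assignblock(state, dst, src):
    """Evaluate src against the *old* state, then update the state."""
    new_state = dict(state)
    new_state[dst] = evaluate(src, state)
    return new_state

state = {'EAX': 'EAX_init'}            # the _init trick
for const in (3, 2, 1):                # apply h, then g, then f
    state = apply_assignblock(state, 'EAX', ('+', 'EAX', const))
# state is now {'EAX': ('+', 'EAX_init', 6)}, i.e. EAX = EAX_init + 6
```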
By the way, what happens if you do not use the _init trick?
First attempt: you start with the dictionary {EAX: EAX}. This is the case @commial described. When applying the equation EAX = EAX+1, the engine tries to resolve EAX, which gives EAX. It then wonders whether EAX can be resolved, and the dictionary answers "yes, it resolves to EAX", and so on... This makes an endless loop.
Nearly the same behavior is observed if we start with an empty dictionary {}. The first lookup of EAX doesn't resolve, so EAX is used. After the addition is done, the state becomes {EAX: EAX+1}. And here the loop starts at the second lookup.
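A toy illustration of this loop (not miasm's actual code): if resolution keeps re-querying the dictionary until the expression is no longer a key, a self-mapping such as {EAX: EAX} never terminates unless a guard is added:

```python
def resolve(expr, state, fuel=32):
    """Resolve expr through the state until it is no longer a key.
    A self-mapping like {'EAX': 'EAX'} would loop forever; the fuel
    counter turns the endless loop into an explicit error."""
    while expr in state:
        if fuel == 0:
            raise RecursionError("endless resolution: EAX -> EAX -> ...")
        expr = state[expr]
        fuel -= 1
    return expr

# With a distinct init symbol, resolution terminates immediately:
assert resolve('EAX', {'EAX': 'EAX_init'}) == 'EAX_init'
```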
This is why, some time ago, we used the attribute is_term. In the original state, when we wrote {EAX: EAX}, the first EAX did not have is_term set, but its corresponding value had is_term=True. Some time (years?) later, we no longer use is_term because of some implementation details. It's not fully removed from Miasm yet, but it's on the way. It can be seen as a regression, but it's not (from our point of view). In fact, when you use the is_term 'trick', the state is in reality a hidden {EAX(!is_term): EAX(is_term)}. So we decided to explicitly force the distinction by using a different symbol.
The fact that the algorithm keeps looking in the dictionary to resolve variables is the main difference from the math point of view (and from the lambda calculus). Why?
When you are doing symbolic execution, you may want to add some information in order to get a nice result. Let's take an example. Here is a function:
push ebp
mov ebp, esp
mov eax, dword ptr [ebp+8]
...
This translates to something like:
AssignBlock 0:
    ESP = ESP-4
    [ESP-4] = EBP
AssignBlock 1:
    EBP = ESP
AssignBlock 2:
    EAX = [EBP+8]
If we do some symbolic execution on it (using the _init trick), the resulting value of EAX will be [ESP_init+4], which is the first argument of the function (x86, 32-bit, stdcall).
It's convenient to have the opportunity to add this information before the execution of the function, so that the final equation is expressed in this form.
So the starting state will be (for example) {[ESP_init+4]: arg0, [ESP_init+8]: arg1}.
This information will be used during the evaluation of the last AssignBlock, in the following order:
1. Evaluation of [EBP+8] => first, evaluate EBP and 8
2. Evaluation of EBP => gives ESP_init-4
3. Back to the evaluation of [EBP+8] which gives [ESP_init+4]
4. Evaluation of [ESP_init+4] which gives arg0.
So we obtain EAX = arg0. Quite convenient, no?
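The same four-step resolution can be sketched in plain Python (not miasm code; the tuple encoding, the names ESP_init/arg0, and the omission of the [ESP-4] = EBP store are all simplifications of the example above):

```python
def fold_add(lhs, rhs):
    """Fold ('+', sym, a) + b into ('+', sym, a + b) when possible."""
    if isinstance(lhs, tuple) and lhs[0] == '+' and isinstance(rhs, int):
        return ('+', lhs[1], lhs[2] + rhs)
    return ('+', lhs, rhs)

def evaluate(expr, state):
    if isinstance(expr, (str, tuple)) and expr in state:
        return state[expr]                 # resolve through the state
    if isinstance(expr, (str, int)):
        return expr                        # unresolved variable / constant
    if expr[0] == '+':
        return fold_add(evaluate(expr[1], state), evaluate(expr[2], state))
    if expr[0] == 'mem':
        addr = evaluate(expr[1], state)
        # look the memory slot up in the state (this is step 4 above)
        return state.get(('mem', addr), ('mem', addr))
    raise ValueError(expr)

# Starting state: ESP -> ESP_init, plus the injected argument information
state = {
    'ESP': 'ESP_init',
    ('mem', ('+', 'ESP_init', 4)): 'arg0',
}
state['ESP'] = evaluate(('+', 'ESP', -4), state)          # AssignBlock 0: ESP = ESP-4
state['EBP'] = evaluate('ESP', state)                     # AssignBlock 1: EBP = ESP
state['EAX'] = evaluate(('mem', ('+', 'EBP', 8)), state)  # AssignBlock 2: EAX = [EBP+8]
# state['EAX'] is now 'arg0'
```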
Now, imagine we are in pure lambda calculus. How do you 'inject' this information?
One of the solutions that comes to mind is to let the calculus operate, which leads to EAX = [ESP+4], and then parse the expression tree to match and replace the targets.
Another one is to inject the information not into the state, but into the simplification engine.
Your questions and remarks have raised this problem and solutions, and we have started a discussion on modifying the core symbolic execution engine to operate in full lambda calculus.
The advantages: this may remove the use of any 'reg_init' or initial state (the resulting "state" will directly be the transfer equation, in the same variable space). The code will be simpler.
The drawbacks: some code modifications will be needed on the "client side" in order to replace some features offered by the current mechanism; for example, the evaluation of aliases would be replaced by some code in the simplification engine.
This feature will be discussed with @commial and some other Miasm's folks. We will keep you in touch.
Anyway, thank you for your remarks and questions: this may lead to an interesting improvement !
Hey, thank you very much for the detailed answer! I'm just getting into miasm, so I'm still trying to wrap my head around a lot of things.
Can you elaborate on how you would inject the argument information directly into the simplification engine? I'm thinking about the program I'm currently reverse engineering, which makes a bunch of bogus calls and then removes the return address and similar stuff; in that case, having argument replacement hardcoded in the simplification engine would make things very confusing.
Replacing the patterns in the syntax tree seems to me to be the easier approach, as well as allowing more control for the user to turn it on and off, or to replace it with something completely different if needed. Couldn't the expression simplifiers already be used for this?
The simplification added may be to replace something like @32[ESP+X] by argY.
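That idea can be sketched as a standalone rewrite pass over an expression tree (plain Python, not miasm's expression simplifier; the tuple encoding and the offset-to-argument table are illustrative):

```python
# Hypothetical mapping: offset from ESP_init -> argument name
ARG_SLOTS = {4: 'arg0', 8: 'arg1'}

def simplify_args(expr):
    """Recursively rewrite ('mem', ('+', 'ESP_init', off)) into 'argN'
    for known stack offsets; leave everything else untouched."""
    if not isinstance(expr, tuple):
        return expr
    expr = tuple(simplify_args(e) for e in expr)
    if (expr[0] == 'mem' and isinstance(expr[1], tuple)
            and expr[1][:2] == ('+', 'ESP_init')
            and expr[1][2] in ARG_SLOTS):
        return ARG_SLOTS[expr[1][2]]
    return expr
```

Registered as a simplification rule, such a pass would fire wherever the pattern appears, which is exactly why hardcoding it can be confusing on obfuscated code.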
If you are working on an obfuscated binary, I must give you a warning about the IRA you have used:
There is a difference between machine.ir() and machine.ira().
For example, with the first version, the translation of a CALL will give the side effects of the instruction:
loc_0000000000000000:0x00000000
    EIP = loc_0000000011223344:0x11223344
    ESP = {(ESP[0:32]+0xFFFFFFFC),0,32}
    @32[(ESP[0:32]+0xFFFFFFFC)] = loc_0000000000000005:0x00000005
    IRDst = loc_0000000011223344:0x11223344
We clearly see the stacking of the return address, and the jump to the target function. The second one is used to "model" the sub function, it will give something like:
lbl_gen_00000000:None
    ESP = call_func_stack(loc_0000000011223344:0x11223344, ESP)
    EAX = call_func_ret(loc_0000000011223344:0x11223344, ESP)
    IRDst = loc_0000000000000005:0x00000005
In this case, the whole sub-function at 0x11223344 has been modeled with only two side effects:
- ESP (for example, the arguments cleanup)
- EAX (the return value)
Note that here the modeling is very poor and generic (using the strange and unknown operators call_func_stack and call_func_ret). The main goal of this feature is to give you a chance to inject a correct model of the function. For example:
lbl_gen_00000000:None
    ESP = ESP + 0x4
    EAX = ExprOp("malloc_0", @32[ESP])
    IRDst = loc_0000000000000005:0x00000005
which may be used to do some code static analysis :smile:
In a sample I am currently working on, I was messing around with different IR optimizations and came across some very weird behavior that revolves around the overlapping-memory calculations in the symbexec.py script.
As an example, below is one of the basic blocks that resulted after I tried something similar to the bbl_simplifier pass, but on IR blocks.
When running this through the symbolic execution, it raises assertions because it detects a 224-bit overlapping memory access somewhere that is assigned a 32-bit value, which obviously results in an exception.
I'm pretty sure it occurs in this AssignBlock:
I think it detects an overlap between @32[(ESP+0xFFFFFFE0)] = 0x81B8D2 and @32[ESP].
Then, in symbexec.py's get_mem_overlapping, it calculates a pointer diff which comes out to -4 and is then, in line 362, cast to a signed integer and compared against the argument size. This, however, is a greater-than comparison with a negative number.
Then, in substract_mems, it calculates negative ptr_diffs and only checks that the upper bound is not larger than the lower bound, but in the end we get a memory size of 224 bits, which cannot be right.
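Based on the description above, the suspicious comparison can be reproduced in isolation (illustrative code, not miasm's actual implementation; sizes in bytes):

```python
def overlaps_buggy(diff, size_bytes):
    """Mimics a check of the form 'no overlap if diff > size': with a
    negative diff this never rejects, so distinct accesses can be
    reported as overlapping."""
    return not (diff > size_bytes)

def overlaps_fixed(diff, size_a_bytes, size_b_bytes):
    """Sign-aware check: b starts inside a iff 0 <= diff < size_a,
    and a starts inside b iff 0 <= -diff < size_b."""
    if diff >= 0:
        return diff < size_a_bytes
    return -diff < size_b_bytes

# Two adjacent 32-bit (4-byte) stores with a pointer diff of -4:
assert overlaps_buggy(-4, 4)         # false positive: reported as overlapping
assert not overlaps_fixed(-4, 4, 4)  # adjacent accesses do not overlap
```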
Any ideas how this could be fixed? Or is my IR invalid in some way? Thanks for the help.
Full original IR Block: