Closed saruman9 closed 5 years ago
We don't require functions to be contiguous in memory, so we don't have the concept of a function 'end'.
Are you seeing a situation where disassembly occurs after a function that does not return (in which case applying __noreturn to the target of the function call will solve your problem) or where jumptable targets have not all been identified correctly (in which case you can provide the correct target information through the API)?
Yes, I am seeing a situation where the lr register is modified and then moved to the pc register, so the next instruction executed is not the instruction after the function; you are right. I added the __noreturn property to the function to resolve the jump problem, but I think Binary Ninja should analyse cases where the pc register is modified.
That sounds a lot like a jumptable that we're failing to identify, which we should actually be handling -- can you share some of the code around the problem?
Also, the ability to undefine functions that were automatically discovered should land in the very near future.
Yes of course. This is function epilogue:
0x41200190: ldr r0, data_412001ac
0x41200194: adr r1, data_41200338
0x41200198: add lr, r0, r1
0x4120019c: add lr, lr, r9
0x412001a0: mov r0, r5
0x412001a4: mov r1, r6
0x412001a8: mov pc, lr
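The arithmetic in this epilogue can be sketched in plain Python: lr is the word loaded from data_412001ac, plus the address of data_41200338, plus r9, each addition wrapping at 32 bits. The helper name and the literal value 0x10 are hypothetical; the real word stored at data_412001ac is not shown in the thread.

```python
MASK32 = 0xFFFFFFFF  # ARM registers are 32 bits wide, so additions wrap


def epilogue_target(literal_word, adr_value, r9):
    """Return the address that `mov pc, lr` jumps to in the epilogue.

    literal_word: word loaded by `ldr r0, data_412001ac`
    adr_value:    address produced by `adr r1, data_41200338`
    r9:           runtime value of r9 (depends on the function arguments)
    """
    lr = (literal_word + adr_value) & MASK32  # add lr, r0, r1
    lr = (lr + r9) & MASK32                   # add lr, lr, r9
    return lr


# 0x10 is a made-up literal used only to show the calculation.
print(hex(epilogue_target(literal_word=0x10, adr_value=0x41200338, r9=0)))
```

Because r9 enters the sum directly, the branch target cannot be reduced to a constant unless the analysis can bound r9.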
I'm facing another problem when working with this binary file: register values cannot be determined. Maybe this is the reason for the analysis error? For example, for the code above:
>>> current_function.get_reg_value_at(0x41200190, 'r0')
<undetermined>
>>> current_function.get_reg_value_at(0x41200194, 'r1')
<undetermined>
>>> current_function.get_reg_value_at(0x412001a8, 'lr')
<undetermined>
Also, the ability to undefine functions that were automatically discovered should land in the very near future.
Nice.
Use get_reg_value_after instead
You would want to be using get_reg_value_after (_at is for the value before the instruction executes).
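A small helper makes the difference between the two calls easy to see side by side. This is a sketch that only assumes the function object exposes get_reg_value_at and get_reg_value_after as used in this thread; the helper name is hypothetical.

```python
def reg_before_and_after(func, addr, reg):
    """Return (value before, value after) the instruction at addr for reg."""
    before = func.get_reg_value_at(addr, reg)    # state before the instruction runs
    after = func.get_reg_value_after(addr, reg)  # state once it has executed
    return before, after


# In the script console this could be called as, for example:
#   reg_before_and_after(current_function, 0x41200190, 'r0')
```

For a load such as the one at 0x41200190, only the "after" value can reflect the loaded word.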
There are several situations that could cause this: is the memory at data_412001ac read-only? If not, we don't allow the value to be consumed when calculating possible values, and we'd be unable to solve for this.
If it's read-only, then chances are we weren't able to extract the possible values of r9.
function.set_user_indirect_branches would allow you to manually set the correct set of targets yourself and then you wouldn't need to use __noreturn.
data_412001ac read-only?
we weren't able to extract the possible values of r9.
You were absolutely right! I added a section with ReadOnlyCodeSectionSemantics and the value of the r0 register was set. The value of the r9 register is computed and depends on the function arguments. Can I somehow set indirect_branches to undetermined if I don't know the correct set of targets?
So I computed one possible target, which is independent of the function arguments, and applied set_user_indirect_branches. I should mention that I am researching firmware that has to be loaded at an offset. I looked for information about loading at an offset, but found nothing except #38. I created a big file containing the target firmware at the required offset, and when I set the branch, reanalysis started. Binary Ninja ate all my memory because the loaded file is too big to analyse. Can I somehow load firmware at an offset in Binary Ninja?
We should still be able to solve r9 in most cases, so I'm curious as to why that jump table is not being solved for automatically. What are the reported values in get_possible_reg_values_after for r9? As an example, a solved jump table will usually look something like this:
Calling set_user_indirect_branches with an empty list should clear the outgoing branches, which I assume is what you mean by undetermined?
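A minimal sketch of how set_user_indirect_branches might be driven from a script. It assumes, as I understand the Python API, that branches are passed as (architecture, address) tuples and that the function object exposes its architecture as .arch; check both against your Binary Ninja version. The helper name is hypothetical.

```python
def set_branch_targets(func, source_addr, targets):
    """Attach user-defined indirect branch targets to the branch at source_addr.

    Passing an empty list of targets is expected to clear the outgoing
    branches, as described above.
    """
    # Assumption: every target uses the same architecture as the function.
    branches = [(func.arch, target) for target in targets]
    func.set_user_indirect_branches(source_addr, branches)
    return branches


# In the script console this could be called as, for example:
#   set_branch_targets(current_function, 0x412001a8, [target_address])
#   set_branch_targets(current_function, 0x412001a8, [])   # clear the branches
```

Here target_address stands for whatever target you have computed yourself; the thread does not give a concrete value.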
For loading the file at a given offset the answer is to implement a BinaryView for now. That would allow you to set up the correct sections and mapping when the file is initially loaded and before any analysis occurs. The '#api-help' channel on our slack is a good place to get help with that, and there are some good examples available in public repositories:
https://github.com/Vector35/binaryninja-api/blob/dev/python/examples/nes.py#L520
https://github.com/joshwatson/binaryninja-microcorruption/blob/master/__init__.py
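Independently of the BinaryView API itself, the mapping such a view has to describe is simple arithmetic between file offsets and load addresses. The sketch below is plain Python, not Binary Ninja API; the class name and the load address 0x41200000 are hypothetical (the thread never states the real base).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    """One contiguous region of the file mapped at a fixed load address."""
    load_address: int  # where the region lives in memory
    file_offset: int   # where the region starts in the raw file
    length: int        # size of the region in bytes

    def to_address(self, offset):
        """Translate a raw file offset into a load address."""
        if not self.file_offset <= offset < self.file_offset + self.length:
            raise ValueError("offset outside mapped region")
        return self.load_address + (offset - self.file_offset)

    def to_file_offset(self, address):
        """Translate a load address back into a raw file offset."""
        if not self.load_address <= address < self.load_address + self.length:
            raise ValueError("address outside mapped region")
        return self.file_offset + (address - self.load_address)


# Hypothetical base address, chosen only so the numbers resemble the thread.
firmware = Mapping(load_address=0x41200000, file_offset=0, length=0x1000)
print(hex(firmware.to_address(0x190)))
```

Describing the mapping this way avoids padding the file out to the load address, which is what made the analysis run out of memory.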
You can remove any function now as of build 1.1.1175-dev.
>>> current_function.get_low_level_il_at(0x412001a8).get_possible_reg_values_after('r9')
<undetermined>
But somewhere in the middle of function:
>>> current_function.get_low_level_il_at(0x41200120).get_possible_reg_values_after('r9')
<not in set([0x82400150])>
One of the possible paths for computing the r9 register (r9 = 0):
<0x41200090: sub_41200090:>
<0x41200090: mov r4, r0>
<0x41200094: mov r5, r1>
<0x41200098: mov r6, r2>
<0x4120009c: mov sp, r4>
<0x412000a0: adr r0, data_41200150>
<0x412000a4: cmp r0, r6>
<0x412000a8: moveq r9, #0>
<0x412000ac: beq 0x4120014c>
<0x4120014c: ldr r0, data_41200044>
<0x41200150: ldr r1, data_4120004c>
<0x41200154: mov r4, r6>
<0x41200158: add r0, r0, r4>
<0x4120015c: add r1, r1, r4>
<0x41200160: mov r2, #0>
<0x41200164: cmp r0, r1>
<0x41200168: bhs 0x41200178>
<0x4120016c: str r2, [r0] {0x0}>
<0x41200170: add r0, r0, #0x4>
<0x41200174: b 0x41200164>
<0x41200178: mcr p15, #0, r0, c7, c5>
<0x4120017c: mcr p15, #0, r0, c7, c10, #0x4>
<0x41200180: mcr p15, #0, r0, c7, c5, #0x4>
<0x41200184: ldr r0, data_41200488>
<0x41200188: add r0, r0, r9>
<0x4120018c: mcr p15, #0, r0, c12, c0>
<0x41200190: ldr r0, data_412001ac>
<0x41200194: adr r1, data_41200338>
<0x41200198: add lr, r0, r1>
<0x4120019c: add lr, lr, r9>
<0x412001a0: mov r0, r5>
<0x412001a4: mov r1, r6>
<0x412001a8: mov pc, lr>
Calling set_user_indirect_branches ... is what you mean by undetermined?
No, I mean a situation where I know that an indirect branch exists, but the jump address is unknown because the value of the pc register depends on the function arguments, for example.
Thank you for the detailed info about loading the file at an offset.
By the way, IDA does not recognize the possible values of pc (and accordingly, most likely, r9) either, and it does not recognize the function to which control is transferred.
The issue title of setting the "end" or "start" of a function will not be changed. It seems as though the underlying issue here has been fixed though.
Got here by the issue title. I've got into a situation where BN treats a main() as a continuation of startup code that jumps there, but I want to declare it as a separate function. What are my options there? Pressing P doesn't create anything, just gets me to a graph view of concatenated startup and main. Undefining a current function makes both main and startup continuations of another piece of startup with a jump. In IDA I would just set the end of startup at the jump, making the continuation a "code not belonging to any function" (or unexplored bytes), then define a new function there.
Try patching the jump to main into a call instruction or nop or anything else. If a call, then main should be a function; if something else, then you can hit p on main. I think this is an edge case because of how the functions are created directly from the entry point, but I could be wrong.
Uff, is it a limitation of BN's architecture or just of the UI? I'm totally OK with the idea of writing a script for a corner case; the question is what is possible and what is not.
Right now it's an architectural limitation that you don't tell a function where to start or stop. That's handled by an architecture plug-in's analysis and lifting.
That said, the exact situation you're describing seems off and related to just an entry point as I don't believe that is expected behavior.
I think there might be a simple solution to this.
1. At start, right-click "Undefine Current Function".
2. At main, hit p.
3. At start, hit p.
The issue is that if it's a direct jump then we don't see this as a separate function. If however main is created first then we determine that it's a tail call and everything will probably look right.
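For anyone who prefers a script to the UI, the same order of operations could be sketched as below. It assumes the view exposes get_function_at, remove_user_function and create_user_function, as I understand the Python API; the helper name is hypothetical and the addresses are whatever start and main are in your binary.

```python
def split_tail_called_function(bv, caller_addr, callee_addr):
    """Recreate two functions so that the callee exists before the caller.

    Mirrors the manual steps: undefine the caller, define the callee,
    then define the caller again so the jump can be seen as a tail call.
    """
    caller = bv.get_function_at(caller_addr)
    if caller is not None:
        bv.remove_user_function(caller)   # "Undefine Current Function"
    bv.create_user_function(callee_addr)  # hit p at the callee (e.g. main)
    bv.create_user_function(caller_addr)  # hit p at the caller (e.g. start)


# In the script console this could be called as, for example:
#   split_tail_called_function(bv, start_addr, main_addr)
```

Analysis runs asynchronously, so the result may only be visible once reanalysis has finished.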
How can I change the end of a function? Binary Ninja doesn't correctly determine the end of an ARM function. I tried to undefine the function so that I could create it manually later, but this function cannot be undefined.