angr / tracer

Utilities for generating dynamic traces
BSD 2-Clause "Simplified" License

QEMU BB Tracing Addresses #64

Closed MostafaSoliman closed 2 years ago

MostafaSoliman commented 5 years ago

Hello, I was trying to add another module, similar to qemu_runner, that supports Windows using DynamoRIO. While testing the tracer component in driller, I noticed something strange: the shellphish-qemu trace output shows the following.

start_brk   0x0000000000000000
end_code    0x000000400000124d
start_code  0x0000004000001000
start_data  0x0000004000003de8
end_data    0x0000004000004038
start_stack 0x0000004000805060
brk         0x0000004000004040
entry       0x0000004000807210
Trace 0x56348d3065d0 [0000004000807210] 
Trace 0x56348d306630 [0000004000807e50] 
Trace 0x56348d306800 [0000004000807ea5] 

Note that the entry BB address is 0000004000807210, which is offset 0x806210 from start_code. This does not match what angr reports when I load the binary: there the entry BB offset is 0x1060, as shown.

In [6]: hex(p.loader.main_object.entry)
Out[6]: '0x401060'
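
The two offsets being compared can be reproduced with plain arithmetic. This is only a sketch using the numbers quoted in this post. `ANGR_BASE = 0x400000` is an assumption implied by the 0x1060 offset above, not something read from angr here; the value angr actually used can be checked with `p.loader.main_object.mapped_base`.

```python
# Values copied from the QEMU log and the angr session quoted in this post.
QEMU_START_CODE = 0x4000001000  # start_code line in the QEMU header
QEMU_FIRST_BB = 0x4000807210    # first traced block (the "entry" line)
ANGR_ENTRY = 0x401060           # hex(p.loader.main_object.entry)
ANGR_BASE = 0x400000            # assumption: base at which angr mapped the binary

# Offset of the first traced block relative to start_code.
qemu_offset = QEMU_FIRST_BB - QEMU_START_CODE
# Offset of the entry point relative to the assumed angr base.
angr_offset = ANGR_ENTRY - ANGR_BASE

print(hex(qemu_offset), hex(angr_offset))
```

The two values differ (0x806210 versus 0x1060), which is the mismatch described above.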

The only explanation I see is that QEMU traces every BB that executes, starting with the Linux libraries that run before the binary's entry point is called. What confirms this is that I can see the entry offset further down in the trace addresses:

Trace 0x56348d340b20 [00000040008151db] 
Trace 0x56348d340be0 [000000400080724a] 
Trace 0x56348d340c30 [0000004000001060] _start
Trace 0x56348d340d00 [0000004000877a30] 
Trace 0x56348d340e20 [0000004000877a65] 
Trace 0x56348d340ec0 [0000004000877a70] 
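
A rough way to locate the first traced block that belongs to the main binary is to parse the `Trace` lines and keep the first address that falls between start_code and end_code. The helper names below (`parse_trace`, `first_in_range`) are hypothetical and are not part of tracer or shellphish-qemu; the line format is inferred only from the excerpts in this post.

```python
import re

# Matches lines such as: Trace 0x56348d340c30 [0000004000001060]
# The first hex value is ignored; the bracketed value is the guest address.
TRACE_RE = re.compile(r"^Trace\s+0x[0-9a-fA-F]+\s+\[([0-9a-fA-F]+)\]")

def parse_trace(lines):
    """Return the guest basic-block addresses found in QEMU trace lines."""
    addrs = []
    for line in lines:
        m = TRACE_RE.match(line.strip())
        if m:
            addrs.append(int(m.group(1), 16))
    return addrs

def first_in_range(addrs, lo, hi):
    """Return (index, address) of the first address in [lo, hi), else None."""
    for i, addr in enumerate(addrs):
        if lo <= addr < hi:
            return i, addr
    return None

# Excerpt of the trace shown in this post.
SAMPLE = [
    "Trace 0x56348d340b20 [00000040008151db]",
    "Trace 0x56348d340be0 [000000400080724a]",
    "Trace 0x56348d340c30 [0000004000001060] _start",
    "Trace 0x56348d340d00 [0000004000877a30]",
]

START_CODE = 0x4000001000  # start_code from the QEMU header
END_CODE = 0x400000124D    # end_code from the QEMU header

addrs = parse_trace(SAMPLE)
hit = first_in_range(addrs, START_CODE, END_CODE)
print(hit and hex(hit[1]))
```

On this excerpt the only address inside the code range is 0x4000001060, the block annotated `_start`.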

If this is the case, then I don't understand why we treat the start_code address 0x0000004000001000 in the QEMU output as the new binary base address, when it should be 0x0000004000000000, in

https://github.com/angr/tracer/blob/master/tracer/qemu_runner.py#L415

rhelmot commented 5 years ago

To be honest, I'm not sure what the start_code address is actually used for. It is a very recent change that our qemu fork will dump the basic block addresses from libraries as well as the main binary. Here's the diff for that: https://github.com/shellphish/shellphish-qemu/commit/4aee5c2b54cdc33d7f7def8d899aaa9359b94cb2

The tracer_code_start may be the code start in the dump. I'm not quite sure how this is calculated. Maybe from section headers?

MostafaSoliman commented 5 years ago

Yes, tracer_code_start is the start_code value from the QEMU trace file. In QEMU 2.10.0 this value is calculated from the header section in the load_elf_image function in elfload.c, where it is stored in the info->start_code variable. I think someone needs to debug QEMU to see why it reports the binary start address as 0x0000004000001000 and not 0x0000004000000000. I will give it a try when my workload is lighter. I am attaching the app I am tracing in case someone would like to try it: test.zip
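
Until that is debugged, the base QEMU actually used can be estimated from one block that is known in both views. The sketch below uses `_start`, which appears in the trace at 0x4000001060 and in angr at 0x401060. `implied_base` is a hypothetical helper, and the angr base of 0x400000 is an assumption inferred from the 0x1060 offset quoted earlier in the thread.

```python
def implied_base(runtime_addr, static_addr, static_base):
    """Base address the runtime must have used for one matched address."""
    return runtime_addr - (static_addr - static_base)

QEMU_START_ADDR = 0x4000001060  # _start as it appears in the QEMU trace
ANGR_ENTRY = 0x401060           # entry point reported by angr
ANGR_BASE = 0x400000            # assumption: base at which angr mapped the binary
QEMU_START_CODE = 0x4000001000  # start_code reported by QEMU

base = implied_base(QEMU_START_ADDR, ANGR_ENTRY, ANGR_BASE)
gap = QEMU_START_CODE - base    # distance from the implied base to start_code

print(hex(base), hex(gap))
```

Under these assumptions the implied base is 0x4000000000, and start_code sits 0x1000 above it, which is consistent with the discrepancy raised in the original post but does not explain its cause.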

github-actions[bot] commented 2 years ago

This issue has been marked as stale because it has no recent activity. Please comment or add the pinned tag to prevent this issue from being closed.

github-actions[bot] commented 2 years ago

This issue has been closed due to inactivity.