Granary / granary

Dynamic binary translation framework for instrumenting the Linux kernel and its modules

[RFC]About granary API Document. #20

Closed renzhengeek closed 10 years ago

renzhengeek commented 10 years ago

Hi Peter, so far Granary has no API documentation and no clear description of the interface between DynamoRIO and Granary (how the two are combined). The lack of API documents makes it difficult to use the Granary framework.

So, do you plan to do this work in the future? I'd like to contribute to it, but I'm just a newcomer here.

Cheers :)

pgoodman commented 10 years ago

There is no such document, unfortunately. For the time being I have no plans to make a document as I'm already working on the next version of Granary (very early stages, major differences in design).

Much of DynamoRIO's instruction manipulation API is maintained, as Granary internally uses DynamoRIO's instruction representation. Just about everything else is different though.

A major thing lacking is a high-level "how everything fits together" picture. This is exacerbated by how instrumentation clients are linked in with Granary.

For example, in the following from the null client:

    /// Instrument a basic block.
    granary::instrumentation_policy null_policy::visit_app_instructions(
        granary::cpu_state_handle cpu,
        granary::basic_block_state &bb,
        granary::instruction_list &ls
    ) throw() {

        return granary::policy_for<null_policy>();
    }

The visit_app_instructions and visit_host_instructions methods on a "policy" class are functionally similar to DynamoRIO's BB instrumentation events. visit_app_instructions is invoked on module basic blocks, and visit_host_instructions is invoked on kernel basic blocks.

cpu allows your tool to access Granary-specific CPU-private state. Your client can extend Granary's CPU-private state by defining a struct cpu_state in the client namespace, doing a #define of CLIENT_cpu_state, and adding a client-specific #include of the file containing your struct cpu_state in granary/clients/state.h.

bb is like cpu, but it represents basic-block specific state. For example, if you wanted to count the number of executions of each basic block, and wanted Granary to be responsible for allocating the counter itself, then you'd put it into your own struct basic_block_state within the client namespace. Then #define CLIENT_basic_block_state, and do the same #include thing.
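
As a minimal sketch of the counter example above: the field name num_executions is hypothetical (it is not taken from Granary's sources), and the matching #include in granary/clients/state.h is not shown.

```cpp
#include <cstdint>

// Tells Granary that this client supplies its own basic-block state,
// as described above. How the header consumes this macro is assumed.
#define CLIENT_basic_block_state

namespace client {

    /// Per-basic-block state; Granary allocates one per basic block.
    struct basic_block_state {
        // Hypothetical field: number of times this block has executed.
        std::uint64_t num_executions;
    };
}
```

The instrumentation that actually increments the counter would still have to be injected from visit_app_instructions; this only shows where the storage lives.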

ls is a list of instructions. Some of the clients have examples of how to iterate over instructions, and inject new instructions. This is functionally similar to DynamoRIO's instrlist_t data structure. You can see examples of inserting instructions and call-outs here: https://github.com/Granary/granary/blob/master/clients/cfg/instrument.cc#L120

Now, you might be wondering "wtf is this policy stuff?". For the time being, my suggestion is to completely ignore it, but when you want to start something, just duplicate the clients/null folder, and search & replace every null with your own thing, e.g. shadow.

So, if you just copy & paste a lot of what is there in the null client, then it's a great starting point. There are other things in the null client's instrument.cc, namely: null_policy::handle_interrupt and handle_kernel_interrupt.

The first is a method that Granary invokes when code instrumented by the null policy is interrupted. The second is invoked by Granary when native kernel code is interrupted. These should only be defined / used if CONFIG_FEATURE_CLIENT_HANDLE_INTERRUPT is defined as 1 in granary/globals.h.

The way "everything" fits together is rather unusual, but a good starting point is to search the Granary source code for uses of the macro GRANARY_ENTRYPOINT, which documents the entrypoint functions / methods into Granary from the code cache. A major difference between Granary and DynamoRIO is that there is no dispatcher per se. The method code_cache::find (granary/code_cache.cc) generally handles this role, but you will notice that every GRANARY_ENTRYPOINT-marked method always returns, and doesn't do the sort of coroutine call_switch_stack that DynamoRIO does to return execution back to the code cache.

In Granary, execution transfers from the code cache onto a CPU-private stack, and then when on the CPU-private stack, some internal code calls into Granary's code_cache::find method. This can sort of be seen in granary/dbl.cc which is used to "mangle" direct branches into hot-code patchable direct branches, that start by branching into "edge code" that eventually invokes code_cache::find, then patches the direct branch to jump to its intended target elsewhere in the code cache.

Here's a walk through of how execution leaves the code cache and makes it to one of the instrumentation functions:

  1. This code https://github.com/Granary/granary/blob/master/granary/dbl.cc#L173 is targeted by some edge code, switches stacks, then calls the patch_instruction entrypoint function:
  2. (https://github.com/Granary/granary/blob/master/granary/dbl.cc#L63) which does a lookup in the code cache index by calling code_cache::find:
  3. (https://github.com/Granary/granary/blob/master/granary/dbl.cc#L129), which does a lookup in the code cache hash table:
  4. https://github.com/Granary/granary/blob/master/granary/code_cache.cc#L109, and if this (and a few other lookups) fail, goes and builds the basic block:
  5. https://github.com/Granary/granary/blob/master/granary/code_cache.cc#L204, which invokes the basic block decoder/translator:
  6. https://github.com/Granary/granary/blob/master/granary/basic_block.cc#L1089, which decodes and instruments one or more basic blocks according to an "instrumentation policy" (e.g. null_policy):
  7. https://github.com/Granary/granary/blob/master/granary/basic_block.cc#L765, which invokes the policy's instrument method, that chooses one of visit_app_instructions or visit_host_instructions (based on module or kernel code):
  8. https://github.com/Granary/granary/blob/master/granary/policy.h#L208, and that leads to something like null_policy::visit_app_instructions being invoked to manipulate the instructions.
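
The lookup-then-build shape of this walk-through can be sketched in a few lines. This is only an illustration under stated assumptions: toy_code_cache, app_pc, cache_pc and build_basic_block are hypothetical names, the index is a plain std::unordered_map, and none of it is Granary's actual implementation.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-ins; none of these names are Granary's real types.
using app_pc = std::uintptr_t;    // address of original (native) code
using cache_pc = std::uintptr_t;  // address of translated code

struct toy_code_cache {
    std::unordered_map<app_pc, cache_pc> index;
    cache_pc next_free = 0x1000;
    unsigned num_builds = 0;

    // Stand-in for decoding and instrumenting a basic block; a real
    // translator would invoke the policy's visit_* method here.
    cache_pc build_basic_block(app_pc) {
        ++num_builds;
        const cache_pc translated = next_free;
        next_free += 0x40;
        return translated;
    }

    // Look up a translation; build and record one on a miss.
    cache_pc find(app_pc target) {
        auto it = index.find(target);
        if (it != index.end()) {
            return it->second;
        }
        const cache_pc translated = build_basic_block(target);
        index.emplace(target, translated);
        return translated;
    }
};
```

Note that find simply returns the translated address to its caller, which mirrors the point above that entrypoints always return rather than switching back into the code cache themselves.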

Hopefully this will give you some more insight into the way things connect together. It is by no means well structured or engineered, and I apologize for that. Let me know if this clears up any questions or serves to raise more.

Best Regards,

Peter Goodman, http://www.petergoodman.me 65 High Park Ave., Toronto, Ontario M6P 2R7

renzhengeek commented 10 years ago

Hi Peter, thanks very much for your detailed guide. It gives me a better insight into how Granary works, though it will take me some time to digest. Indeed, during this time I have been reading the source and some related papers in order to understand Granary, so the guide is valuable :) I want to implement shadow memory just for kernel modules, to enhance the ability to detect memory errors.

I would like to prepare myself to make possible contributions to this project. In the future, if there are some easy or trivial tasks, I think they may be a starting point for me to get involved, hopefully.

All the best to you. Cheers for granary+.

renzhengeek commented 10 years ago

Hi Peter, in the file granary/clients/watchpoints/clients/bounds_checker/instrument.cc, the function visit_overflow contains:

    ...
    IF_USER( printf("Access of size %u to %p in basic block %p overflowed\n",
        size, unwatched_addr, *return_address_in_bb); )
    ....

It seems that visit_overflow cannot work in kernel space, can it? All the best!

pgoodman commented 10 years ago

Registers are indeed CPU-private. In the Linux kernel, you can also have CPU-private memory. It works via indirection through the %gs segment register. Using CPU-private memory is (generally) only safe when interrupts are disabled. It can be a convenience to store some things in CPU private memory (such as a free list, as in the bounds checker tool) to avoid contention on a lock.

A CPU-private stack is just a stack that has been allocated in CPU-private memory. Granary's internals all execute on such a stack.

Watchpoints are a bit different, in that the visit_app_instructions and visit_host_instructions methods are implemented by the base watchpoints template class ( https://github.com/Granary/granary/blob/master/clients/watchpoints/instrument.h#L750) and they dispatch to finer granularity instrumentation methods visit_read and visit_write ( https://github.com/Granary/granary/blob/master/clients/watchpoints/instrument.h#L730), which you will find in the instrument.cc of the bounds_checker tool. These two methods are used to instrument individual memory reads/writes.

The bounds checker tool uses a bit of custom assembly (not included in instrument.cc) that adds some extra indirection between the instrumentation added by visit_read/visit_write. The step-by-step is:

  1. If the address is a watched address, then the visit_read/visit_write methods inject a function call into the code. The tracker.labels[i] is an instruction label, where if control reaches this label, then we know that:
    1. The memory read/write is accessing a watched address.
    2. The watched address is stored in tracker.regs[i].
    3. The amount of memory being read/written by the memory operation is tracker.sizes[i].
    4. tracker.ops[i] is a direct reference into the memory operation in the instruction. It is either a memory read, a memory write, or a read/write operation (e.g. compare and swap)
  2. The call is introduced here: https://github.com/Granary/granary/blob/master/clients/watchpoints/clients/bounds_checker/instrument.cc#L205
  3. It introduces a call to a function, where the function used is specific to a register and a memory size. The array of function pointers is defined here: https://github.com/Granary/granary/blob/master/clients/watchpoints/clients/bounds_checker/instrument.cc#L54
  4. The individual functions stored in the array are defined here: https://github.com/Granary/granary/blob/master/clients/watchpoints/clients/bounds_checker/bounds_checkers.asm#L69
  5. The call to client::wp::visit_overflow is here: https://github.com/Granary/granary/blob/master/clients/watchpoints/clients/bounds_checker/bounds_checkers.asm#L39 (see the #define here and check the symbol with c++filt https://github.com/Granary/granary/blob/master/clients/watchpoints/clients/bounds_checker/bounds_checkers.asm#L18 )
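
To illustrate step 3, here is a small sketch of a function table indexed by register and operand size. Everything in it is hypothetical: the real table holds assembly stubs, and the names, the number of registers, and the size-to-index mapping are assumptions made for illustration only.

```cpp
#include <cstddef>

// Hypothetical sketch: the real table holds assembly stubs, one per
// (register, operand size) pair. Here each "stub" just reports its pair.
struct checker_id { unsigned reg; unsigned size; };
using checker_fn = checker_id (*)();

template <unsigned Reg, unsigned Size>
checker_id checker() { return checker_id{Reg, Size}; }

constexpr std::size_t NUM_REGS = 2;   // assumed; kept tiny for illustration
constexpr std::size_t NUM_SIZES = 4;  // operand sizes 1, 2, 4, 8 bytes

const checker_fn CHECKERS[NUM_REGS][NUM_SIZES] = {
    {checker<0, 1>, checker<0, 2>, checker<0, 4>, checker<0, 8>},
    {checker<1, 1>, checker<1, 2>, checker<1, 4>, checker<1, 8>},
};

// Map an operand size in bytes (1, 2, 4, 8) to a column index (0..3).
inline std::size_t size_to_index(unsigned size) {
    std::size_t index = 0;
    while (size > 1) { size >>= 1; ++index; }
    return index;
}

// Pick the checker for a given register index and operand size.
inline checker_fn select_checker(unsigned reg_index, unsigned size) {
    return CHECKERS[reg_index][size_to_index(size)];
}
```

Specialising per register and size means each stub can assume where the watched address lives and how many bytes are accessed, so the injected call needs no extra arguments.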

So, visit_overflow should work in kernel space, but it doesn't really do anything for the time being :-P

Best Regards,

Peter Goodman, http://www.petergoodman.me 65 High Park Ave., Toronto, Ontario M6P 2R7

renzhengeek commented 10 years ago

Thanks. I now have a better understanding of it. Regarding step 5:

1) Where does the symbol come from (by running readelf on granary.ko?)? Do we need to modify this symbol?

    #define VISIT_OVERFLOW SYMBOL(_ZN6client2wp14visit_overflowEmjPPh)

2) Why does visit_overflow not do anything for the time being? Do you mean that it does nothing in kernel space because it only uses IF_USER, or is there another reason? If I want it to work in kernel space, do I just add something like IF_KERNEL(printk...) to it?

Best Regards.

pgoodman commented 10 years ago

1) The symbol is defined in instrument.cc as the visit_overflow function.

2) At the time of making this, I didn't need it to actually do anything :-P I was testing that it was called by putting a breakpoint on it. In practice, it could log the error to somewhere.

renzhengeek commented 10 years ago

What is the meaning of '@N@'? It seems strange to me.

    #define DECLARE_FUNC(x) \
        .align 16 @N@\
        .globl SYMBOL(x) @N@\
        .type SYMBOL(x), @function@N@\

thanks ;)

pgoodman commented 10 years ago

Oh it's sort of a hack to help generate readable assembly in the bin/ folder. Assembly files are first pre-processed, then compiled. Because macros are used a lot, to get more readable output for debugging, there's the @N@ stuff. It's something that DynamoRIO did and at the time I liked the idea, so I ported it over.
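
A rough sketch of the idea: the C preprocessor emits a macro's expansion on a single line, so a marker such as @N@ can be swapped for a real newline afterwards. The function below is a hypothetical stand-in for that post-processing step, not Granary's or DynamoRIO's actual build script.

```cpp
#include <string>

// Replace every "@N@" marker with a newline. This mimics the
// post-processing step described above; the function name is made up.
std::string expand_newline_markers(const std::string &preprocessed) {
    static const std::string marker = "@N@";
    std::string out;
    std::string::size_type pos = 0;
    for (;;) {
        const std::string::size_type hit = preprocessed.find(marker, pos);
        if (hit == std::string::npos) {
            // No more markers: copy the remainder and finish.
            out.append(preprocessed, pos, std::string::npos);
            return out;
        }
        out.append(preprocessed, pos, hit - pos);
        out.push_back('\n');
        pos = hit + marker.size();
    }
}
```

Applied to the expansion of a macro like DECLARE_FUNC, this would put each assembler directive back on its own line.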

renzhengeek commented 10 years ago

Hi Peter, I want to use printk to see some info when using bounds_checker, so I modified the function visit_overflow like this:

    ...
    IF_KERNEL( printk("Access of size %u to %p in basic block %p underflowed\n",
    ...

but the compiler says:

    clients/watchpoints/clients/bounds_checker/instrument.cc:240:60: error: ‘printk’ was not declared in this scope
        size, unwatched_addr, *return_address_in_bb); )

How should I handle this?

Best regards!

pgoodman commented 10 years ago

Don't use the kernel's printk!!!! First, it won't be resolved to a symbol. If you really need access to the symbol, you can usually use types::printk or KERNEL_ADDR_printk (or some variant thereof, #define'd in granary/gen/kernel_detach.inc).

The reason not to use printk is one of re-entrancy. That is, suppose printk is being instrumented, and then your instrumentation code invokes printk. What should happen?
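
One common way to think about the re-entrancy problem is a guard flag that refuses nested entry. The sketch below is hypothetical and is not something Granary provides; in the kernel the flag would live in CPU-private memory rather than in a thread_local variable, and the counters exist only so the behaviour can be observed.

```cpp
// Hypothetical sketch of a re-entrancy guard; not part of Granary.
thread_local bool in_logger = false;
thread_local unsigned messages_logged = 0;
thread_local unsigned messages_dropped = 0;

void log_message(const char *msg, bool simulate_reentry = false) {
    if (in_logger) {          // already logging: refuse to recurse
        ++messages_dropped;
        return;
    }
    in_logger = true;
    ++messages_logged;        // stand-in for the real output call
    if (simulate_reentry) {   // pretend the output path is instrumented
        log_message(msg);     // and calls back into the logger
    }
    in_logger = false;
}
```

Without the guard, the nested call would recurse into the logger again, which is the "what should happen?" situation described above.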

This is one sore point in the kernel: I have not set up any useful on-demand logging infrastructure. There is the reporting mechanism, client::report, but that is batch based and depends on kernel-provided RelayFS.

renzhengeek commented 10 years ago

Oh, I see! Could you show me how to use RelayFS for reporting from the bounds checker?

renzhengeek commented 10 years ago

Now GDB can show the correct location of visit_overflow:

    ...
    Breakpoint 6 at 0xffffffffa024b760: file clients/watchpoints/clients/bounds_checker/instrument.cc, line 229.
    ...

I also set breakpoints on visit_read and visit_write, and both are hit when I insmod my test module.

(1) The visit_overflow breakpoint is never hit, even though my test module does contain an overflow error.

(2) When a breakpoint is hit and I want to print a variable's value, GDB often says "No symbol "***" in current context.", like this:

    ...
    Breakpoint 9, client::wp::bound_policy::visit_write(granary::basic_block_state&, granary::instruction_list&, client::wp::watchpoint_tracker&, unsigned int) () at clients/watchpoints/clients/bounds_checker/instrument.cc:218
    218 ) throw() {
    (gdb) n
    219 if(!(SOURCE_OPERAND & tracker.ops[i].kind)) {
    (gdb)
    218 ) throw() {
    (gdb)
    219 if(!(SOURCE_OPERAND & tracker.ops[i].kind)) {
    (gdb)
    199 const unsigned reg_index = register_to_index(tracker.regs[i].value.reg);
    (gdb) p reg_index
    No symbol "reg_index" in current context.
    .....

Another example: tracker is a struct, so how do I inspect its value? Here is what I got:

    ...
    (gdb) p tracker
    No symbol "tracker" in current context.
    ....

Why does this happen, and how should I handle it?

pgoodman commented 10 years ago

You can use relayfs via granary::log, but only within your own client::report function.

pgoodman commented 10 years ago

Okay, we're going to start in on a debugging adventure.

First things first, make sure that CONFIG_DEBUG_TRACE_EXECUTION, CONFIG_DEBUG_TRACE_RECORD_REGS, and CONFIG_DEBUG_ASSERTIONS are all #defined as 1 in granary/globals.h.
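
In other words, the relevant lines would presumably end up looking something like the following. This is an assumed excerpt: the real granary/globals.h contains many other settings and may guard or order these differently.

```cpp
// Assumed excerpt of granary/globals.h; only the three settings named
// above are shown, each enabled by defining it as 1.
#define CONFIG_DEBUG_TRACE_EXECUTION 1
#define CONFIG_DEBUG_TRACE_RECORD_REGS 1
#define CONFIG_DEBUG_ASSERTIONS 1
```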

Make a very small module that does just a single kmalloc, then accesses the memory in a way that overflows. Only do this once, nothing else (no kfree!!!), and do this in the module's init function. We just want to keep things very simple.

Compile Granary against your kernel with the bounds_checker module. Load Granary into the kernel. Attach GDB, continue without adding breakpoints, load your module. Okay, your module has run its init function. Now we want to go and look at the code that Granary made.

Do Ctrl-C in GDB to interrupt the kernel.

First, do p-wrapper granary::DETACH_ID_kmalloc. That should then give you two addresses: the app and host wrappers. Disassemble the app wrapper's address. Copy and paste all this output into your reply.

Then do p-trace 20. This will give you a list of the <= 20 most recently executed basic blocks. If your module is small then it should ideally be only a few basic blocks. Let's say the output is:

    [1] 0xffffA ....
    [2] 0xffffB ....
    [3] 0xffffC ...

Then do:

p-trace-entry-regs 3
p-trace-entry-bb 3

p-trace-entry-regs 2
p-trace-entry-bb 2

p-trace-entry-regs 1
p-trace-entry-bb 1

Copy and paste all of this output here.

Finally, do:

p client::wp::NEXT_COUNTER_INDEX

This will (hopefully) print out the structure for an std::atomic variable. Somewhere in there you should see an integer. If the integer is 1, then do:

p client::wp::DESCRIPTORS[0]

If the integer is 2, then do:

p client::wp::DESCRIPTORS[0]
p client::wp::DESCRIPTORS[1]

Etc.

Copy and paste all of this output into a reply :-D

Hopefully this will let me really deeply see what's going on :D

pgoodman commented 10 years ago

Note: Make sure to load and initialize Granary before loading your module, then load your module, then go through the various GDB steps.

renzhengeek commented 10 years ago

(1) When starting GDB, there are warnings:

    ...
    warning: section .strtab not found in /home/renzhen/granary/bin/granary.ko
    warning: section .symtab not found in /home/renzhen/granary/bin/granary.ko
    Function "__cxa_throw" not defined.
    ..

(2) I continued GDB without breakpoints, initialised Granary, then loaded my module. dmesg says:

    [ 223.066060] [granary] Loading Granary...
    [ 223.066064] [granary] Stack size is 32768
    [ 223.066065] [granary] Running initialisers...
    [ 223.066072] [granary] Done running initialisers.
    [ 223.066073] [granary] Registering module notifier...
    [ 223.066074] [granary] Registering 'granary' device...
    [ 223.072651] [granary] Registered 'granary' device.
    [ 223.074959] [granary] Relay channel initialised.
    [ 223.074962] [granary] Done; waiting for command to initialise Granary.
    [ 223.074965] [granary] Notified of module 0xffffffffa07558a0 [.text = ffffffffa012f000]
    [ 223.074966] [granary] Module's name is: granary.
    [ 223.074968] [granary] Ignoring module state change.
    [ 275.434534] Clocksource tsc unstable (delta = 17715242185 ns)
    [ 275.444745] Switching to clocksource hpet
    [ 295.899321] [granary] Initialising Granary...
    [ 295.940713] [granary] Initialised.
    [ 323.441249] [granary] Notified of module 0xffffffffa0109000 [.text = ffffffffa0107000]
    [ 323.441254] [granary] Module's name is: testmod.
    [ 323.441297] [granary] Got internal representation for module.
    [ 323.441301] [granary] Notifying Granary of the module...
    [ 323.441303] [granary] Notified about module (testmod) state change: COMING.
    [ 323.441559] [granary] Notified Granary of the module.
    [ 323.442832] buf: This is a test module to test bounds checher functionality.
    [ 323.442832]
    [ 323.442838] [granary] Notified of module 0xffffffffa0109000 [.text = ffffffffa0107000]
    [ 323.442840] [granary] Module's name is: testmod.
    [ 323.442842] [granary] Got internal representation for module.
    [ 323.442843] [granary] Notifying Granary of the module...
    [ 323.442845] [granary] Notified about module (testmod) state change: LIVE.
    [ 323.442846] [granary] Notified Granary of the module.

(3) For p-wrapper, GDB says:

    (gdb) p-wrapper granary::DETACH_ID_kmalloc
    Identifier "DETACH_ID_kmalloc" does not exists in namepace "granary".

pgoodman commented 10 years ago

What about:

p/d granary::DETACH_ID___kmalloc

(search for kmalloc in granary/gen/kernel_detach.inc for variations of the function name, and try appending those variations to granary::DETACH_ID_).

If you eventually hit on one of these, then when you do the p-wrapper of it, also add a breakpoint on the app wrapper address to see if it gets hit when your module is initialized (only load your module after initializing Granary!).

If any of this works, please try to proceed with the other steps!

pgoodman commented 10 years ago

The way to add a breakpoint to an address is b *0xfffff....

In reply to Ren Zhen (Saturday, March 22, 2014 9:13 AM):

It is a spelling mistake. Now p-wrapper works, but I fail to add a breakpoint on the app wrapper address. gdb says:

    ...
    (gdb) p-wrapper granary::DETACH_ID___kmalloc
    Function wrapper __kmalloc (5226):
    Original address: 0xffffffff811129b0
    App Wrapper address: 0xffffffffa01f27b0
    Host Wrapper address: (nil)
    (gdb) p/d granary::DETACH_ID___kmalloc
    $2 = 5226
    (gdb) b 0xffffffffa01f27b0
    Function "0xffffffffa01f27b0" not defined.
    Breakpoint 2 (0xffffffffa01f27b0) pending.
    ...

All best for you!

renzhengeek commented 10 years ago

Now, (1) the GDB part shows:

    (gdb) p/d granary::DETACH_ID___kmalloc
    $1 = 5226
    (gdb) p-wrapper granary::DETACH_ID___kmalloc
    Function wrapper __kmalloc (5226):
    Original address: 0xffffffff811129b0
    App Wrapper address: 0xffffffffa01787b0
    Host Wrapper address: (nil)
    (gdb) b *0xffffffffa01787b0
    Breakpoint 2 at 0xffffffffa01787b0: file /home/renzhen/granary/clients/watchpoints/clients/bounds_checker/kernel/linux/wrappers.h, line 28.
    (gdb) c
    Continuing.

(2) What dmesg says:

    [ 714.145807] [granary] Loading Granary...
    [ 714.145811] [granary] Stack size is 32768
    [ 714.145812] [granary] Running initialisers...
    [ 714.145819] [granary] Done running initialisers.
    [ 714.145820] [granary] Registering module notifier...
    [ 714.145821] [granary] Registering 'granary' device...
    [ 714.146069] [granary] Registered 'granary' device.
    [ 714.148612] [granary] Relay channel initialised.
    [ 714.148614] [granary] Done; waiting for command to initialise Granary.
    [ 714.148617] [granary] Notified of module 0xffffffffa07548a0 [.text = ffffffffa012e000]
    [ 714.148618] [granary] Module's name is: granary.
    [ 714.148620] [granary] Ignoring module state change.
    [ 735.624552] Clocksource tsc unstable (delta = 7612561604 ns)
    [ 735.625780] Switching to clocksource hpet
    [ 797.009757] [granary] Initialising Granary...
    [ 797.016040] [granary] Initialised.
    [ 815.887276] No module found in object
    [ 819.327976] [granary] Notified of module 0xffffffffa0128000 [.text = ffffffffa0126000]
    [ 819.327980] [granary] Module's name is: testmod.
    [ 819.329747] [granary] Got internal representation for module.
    [ 819.329756] [granary] Notifying Granary of the module...
    [ 819.329758] [granary] Notified about module (testmod) state change: COMING.
    [ 819.329886] [granary] Notified Granary of the module.
    [ 819.335406] buf: This is a test module to test bounds checher functionality.
    [ 819.335406]
    [ 819.335410] [granary] Notified of module 0xffffffffa0128000 [.text = ffffffffa0126000]
    [ 819.335411] [granary] Module's name is: testmod.
    [ 819.335412] [granary] Got internal representation for module.
    [ 819.335413] [granary] Notifying Granary of the module...
    [ 819.335414] [granary] Notified about module (testmod) state change: LIVE.
    [ 819.335415] [granary] Notified Granary of the module.

(3) My test module's source code:

    #include <linux/init.h>
    #include <linux/module.h>
    #include <linux/kernel.h>
    #include <linux/slab.h>
    #include <linux/string.h>

    MODULE_LICENSE("Dual BSD/GPL");

    static char *buf;

    static int testmod_init(void) {
        buf = (char *)kmalloc(sizeof(char) * 10, GFP_KERNEL);
        strcpy(buf, "This is a test module to test bounds checher functionality.\n"); // buf overflow
        printk("buf: %s\n", buf);
        return 0;
    }

    static void testmod_exit(void) {
        printk("goodbye,kernel\n");
    }

    module_init(testmod_init);
    module_exit(testmod_exit);

(4) The __kmalloc wrapper is not hit.

(5) Disassembly of the app wrapper address:

    (gdb) disassemble 0xffffffffa01787b0
    Dump of assembler code for function _ZN7granary21wrapped_function_implILNS_19function_wrapper_idE5226ELNS_15runtime_contextE0ELb0ENS_23custom_wrapped_functionEJEE5applyEmj:
    0xffffffffa01787b0 <+0>: push %rbp
    0xffffffffa01787b1 <+1>: push %rbx
    0xffffffffa01787b2 <+2>: mov %rdi,%rbx
    0xffffffffa01787b5 <+5>: sub $0x28,%rsp
    0xffffffffa01787b9 <+9>: cmpb $0x0,0x2413b58(%rip) # 0xffffffffa258c318
    0xffffffffa01787c0 <+16>: mov 0x38(%rsp),%rbp
    0xffffffffa01787c5 <+21>: je 0xffffffffa0178810 <_ZN7granary21wrapped_function_implILNS_19function_wrapper_idE5226ELNS_15runtime_contextE0ELb0ENS_23custom_wrapped_functionEJEE5applyEmj+96>
    0xffffffffa01787c7 <+23>: mov $0xffffffff811129b0,%rax
    0xffffffffa01787ce <+30>: mov %rbx,%rdi
    0xffffffffa01787d1 <+33>: callq *%rax
    0xffffffffa01787d3 <+35>: movabs $0x800000000000,%rdx
    0xffffffffa01787dd <+45>: test %rdx,%rax
    0xffffffffa01787e0 <+48>: mov %rax,0x18(%rsp)
    0xffffffffa01787e5 <+53>: je 0xffffffffa01787ff <_ZN7granary21wrapped_function_implILNS_19function_wrapper_idE5226ELNS_15runtime_contextE0ELb0ENS_23custom_wrapped_functionEJEE5applyEmj+79>
    0xffffffffa01787e7 <+55>: mov %rbp,%rcx
    0xffffffffa01787ea <+58>: mov %rbx,%rdx
    0xffffffffa01787ed <+61>: mov %rax,%rsi
    0xffffffffa01787f0 <+64>: lea 0x18(%rsp),%rdi
    0xffffffffa01787f5 <+69>: callq 0xffffffffa0178690 <_ZN6client2wp14add_watchpointIPvJS2_mS2_EEENS0_21add_watchpoint_statusERT_DpT0_>
    0xffffffffa01787fa <+74>: mov 0x18(%rsp),%rax
    0xffffffffa01787ff <+79>: add $0x28,%rsp
    0xffffffffa0178803 <+83>: pop %rbx
    0xffffffffa0178804 <+84>: pop %rbp
    0xffffffffa0178805 <+85>: retq
    0xffffffffa0178806 <+86>: nopw %cs:0x0(%rax,%rax,1)
    0xffffffffa0178810 <+96>: mov $0xffffffff811129b0,%rdi
    0xffffffffa0178817 <+103>: mov %esi,0xc(%rsp)
    0xffffffffa017881b <+107>: movb $0x1,0x2413af6(%rip) # 0xffffffffa258c318
    0xffffffffa0178822 <+114>: callq 0xffffffffa0178630 <_ZN7granary18dynamic_wrapper_ofIPvJmjEEEPFT_DpT0_ES6_>
    0xffffffffa0178827 <+119>: mov 0xc(%rsp),%esi
    0xffffffffa017882b <+123>: cmp $0xffffffff811129b0,%rax
    0xffffffffa0178831 <+129>: je 0xffffffffa01787c7 <_ZN7granary21wrapped_function_implILNS_19function_wrapper_idE5226ELNS_15runtime_contextE0ELb0ENS_23custom_wrapped_functionEJEE5applyEmj+23>
    0xffffffffa0178833 <+131>: mov $0xffffffff811129b0,%rdi
    0xffffffffa017883a <+138>: callq 0xffffffffa0178630 <_ZN7granary18dynamic_wrapper_ofIPvJmjEEEPFT_DpT0_ES6_>
    0xffffffffa017883f <+143>: mov 0xc(%rsp),%esi
    0xffffffffa0178843 <+147>: jmp 0xffffffffa01787ce <_ZN7granary21wrapped_function_implILNS_19function_wrapper_idE5226ELNS_15runtime_contextE0ELb0ENS_23custom_wrapped_functionEJEE5applyEmj+30>
    End of assembler dump.

(6) The p-trace output. It seems that the additional strcpy leads to many more basic blocks. I will simplify testmod by deleting the strcpy and do things again.

    (gdb) p-trace 20
    Global code cache lookup trace:
    [1] 0xffffffffa022b3d0
    [2] 0xffffffffa022b3cb
    [3] 0xffffffffa022b3b5
    [4] 0xffffffffa022b4c5
    [5] 0xffffffffa022b413
    [6] 0xffffffffa022b405
    [7] 0xffffffffa022b4c5
    [8] 0xffffffffa022b413
    [9] 0xffffffffa022b405
    [10] 0xffffffffa022b4c5
    [11] 0xffffffffa022b413
    [12] 0xffffffffa022b405
    [13] 0xffffffffa022b4c5
    [14] 0xffffffffa022b413
    [15] 0xffffffffa022b405
    [16] 0xffffffffa022b4c5
    [17] 0xffffffffa022b413
    [18] 0xffffffffa022b405
    [19] 0xffffffffa022b4c5
    [20] 0xffffffffa022b413

So far, the fact that the wrapper function is not hit seems to be a problem.

renzhengeek commented 10 years ago

After simplifying the test module:

(1) Test module source code:

    #include <linux/init.h>
    #include <linux/module.h>
    #include <linux/kernel.h>
    #include <linux/slab.h>
    //#include <linux/string.h>

    MODULE_LICENSE("Dual BSD/GPL");

    static char *buf;

    static int testmod_init(void) {
        buf = (char *)kmalloc(sizeof(char) * 100, GFP_KERNEL);
        //strcpy(buf, "This is a test module to test bounds checher functionality.\n"); // buf overflow
        buf[200] = 'c';
        //printk("buf: %s\n", buf);
        return 0;
    }

    static void testmod_exit(void) {
        //printk("goodbye,kernel\n");
    }

    module_init(testmod_init);
    module_exit(testmod_exit);

(2) Now the GDB part shows:

    (gdb) p/d granary::DETACH_ID___kmalloc
    $1 = 5226
    (gdb) p-wrapper granary::DETACH_ID___kmalloc
    Function wrapper __kmalloc (5226):
    Original address: 0xffffffff811129b0
    App Wrapper address: 0xffffffffa01a07b0
    Host Wrapper address: (nil)
    (gdb) b *0xffffffffa01a07b0
    Breakpoint 2 at 0xffffffffa01a07b0: file /home/renzhen/granary/clients/watchpoints/clients/bounds_checker/kernel/linux/wrappers.h, line 28.
    (gdb) p-trace 20
    Global code cache lookup trace:
    [1] 0xffffffffa0253085
    [2] 0xffffffffa0253063
    [3] 0xffffffffa0253045
    (gdb) p-trace-entry-regs 3
    Regs:
    r15: 0xffff880046d1fef0
    r14: 0x1
    r13: 0x0
    r12: 0xffffffffa0113000
    r11: 0x1
    r10: 0xffff88006321e000
    r9: 0x1
    r8: 0x0
    rdi: 0xffffffffa0113000
    rsi: 0xfb
    rbp: 0xffff880046d1fe38
    rbx: 0xffffffffa0115018
    rdx: 0x2
    rcx: 0x670
    rax: 0xffff880046d1ffd8
    rsp: 0xffff880046d1fe10
    (gdb) p-trace-entry-bb 3
    Translated instructions:
    0xffffffffa0253040 granary::detail::EXECUTABLE_AREA+64: callq 0xffffffffa064edb0 granary::detail::EXECUTABLE_AREA+4177328
    0xffffffffa0253045 granary::detail::EXECUTABLE_AREA+69: mov -0x1e6598b4(%rip),%rdi # 0xffffffff81bf9798
    0xffffffffa025304c granary::detail::EXECUTABLE_AREA+76: push %rbp
    0xffffffffa025304d granary::detail::EXECUTABLE_AREA+77: mov $0x10,%eax
    0xffffffffa0253052 granary::detail::EXECUTABLE_AREA+82: mov %rsp,%rbp
    0xffffffffa0253055 granary::detail::EXECUTABLE_AREA+85: test %rdi,%rdi
    0xffffffffa0253058 granary::detail::EXECUTABLE_AREA+88: je 0xffffffffa064d000 granary::detail::EXECUTABLE_AREA+4169728

    Original instructions:
    0xffffffffa0113000: mov -0x1e51986f(%rip),%rdi # 0xffffffff81bf9798
    0xffffffffa0113007: push %rbp
    0xffffffffa0113008: mov $0x10,%eax
    0xffffffffa011300d: mov %rsp,%rbp
    0xffffffffa0113010: test %rdi,%rdi
    0xffffffffa0113013: je 0xffffffffa011301f

    $2 = 281473367224320

    Basic block info:
    State: (nil)
    App:
    Code: 0xffffffffa0113000
    Num instructions: 6
    Code cache:
    Code: 0xffffffffa0253040
    Num instructions: 7
    Policy properties:
    Is in XMM context: 1
    Is in host context: 1
    Accesses user data: 1
    Return address in code cache: 1
    Num blocks in trace: 2
    Exception table entry: (nil)
    Policy ID: 255
    Instrumentation Function: irq_stack_union in section .data..percpu of /home/renzhen/linux-3.8/vmlinux
    __UNIQUE_ID_license0 in section .modinfo of /home/renzhen/granary/bin/granary.ko
    (gdb)
    (gdb) p-trace-entry-regs 2
    Regs:
    r15: 0xffff880046d1fef0
    r14: 0x1
    r13: 0x0
    r12: 0xffffffffa0113000
    r11: 0x1
    r10: 0xffff88006321e000
    r9: 0x1
    r8: 0x0
    rdi: 0xffff88007d001800
    rsi: 0xfb
    rbp: 0xffff880046d1fe08
    rbx: 0xffffffffa0115018
    rdx: 0x2
    rcx: 0x670
    rax: 0x10
    rsp: 0xffff880046d1fe08
    (gdb) p-trace-entry-bb 2
    Translated instructions:
    0xffffffffa025305e granary::detail::EXECUTABLE_AREA+94: callq 0xffffffffa064edb0 granary::detail::EXECUTABLE_AREA+4177328
    0xffffffffa0253063 granary::detail::EXECUTABLE_AREA+99: mov $0xd0,%esi
    0xffffffffa0253068 granary::detail::EXECUTABLE_AREA+104: callq 0xffffffffa01c0280 <_ZN7granary21wrapped_function_implILNS_19function_wrapper_idE4942ELNS_15runtime_contextE0ELb0ENS_23custom_wrapped_functionEJEE5applyEP10kmem_cachej>
    0xffffffffa025306d granary::detail::EXECUTABLE_AREA+109: jmpq 0xffffffffa0253080 granary::detail::EXECUTABLE_AREA+128

    Original instructions:
    0xffffffffa0113015: mov $0xd0,%esi
    0xffffffffa011301a: callq 0xffffffff811125f0

    $3 = 72339067405152277

    Basic block info:
    State: (nil)
    App:
    Code: 0xffffffffa0113015
    Num instructions: 2
    Code cache:
    Code: 0xffffffffa025305e
    Num instructions: 4
    Policy properties:
    Is in XMM context: 1
    Is in host context: 1
    Accesses user data: 1
    Return address in code cache: 1
    Num blocks in trace: 2
    Exception table entry: (nil)
    Policy ID: 255
    Instrumentation Function: irq_stack_union in section .data..percpu of /home/renzhen/linux-3.8/vmlinux
    __UNIQUE_ID_license0 in section .modinfo of /home/renzhen/granary/bin/granary.ko
    (gdb)
    (gdb) p-trace-entry-regs 1
    Regs:
    r15: 0xffff880046d1fef0
    r14: 0x1
    r13: 0x0
    r12: 0xffffffffa0113000
    r11: 0x0
    r10: 0x1800000000000
    r9: 0xffffffffa025306d
    r8: 0x80
    rdi: 0xffffffffa025306d
    rsi: 0x46ce0c80
    rbp: 0xffff880046d1fe08
    rbx: 0xffffffffa0115018
    rdx: 0x800000000000
    rcx: 0x0
    rax: 0x880046ce0c80
    rsp: 0xffff880046d1fe08
    (gdb) p-trace-entry-bb 1
    Translated instructions:
    0xffffffffa0253080 granary::detail::EXECUTABLE_AREA+128: callq 0xffffffffa064edb0 granary::detail::EXECUTABLE_AREA+4177328
    0xffffffffa0253085 granary::detail::EXECUTABLE_AREA+133: lea 0xc8(%rax),%rbp
    0xffffffffa025308c granary::detail::EXECUTABLE_AREA+140: bt $0x30,%rbp
    0xffffffffa0253091 granary::detail::EXECUTABLE_AREA+145: jb 0xffffffffa02530a6 granary::detail::EXECUTABLE_AREA+166
    0xffffffffa0253097 granary::detail::EXECUTABLE_AREA+151: callq 0xffffffffa01564a0
    0xffffffffa025309c granary::detail::EXECUTABLE_AREA+156: bswap %rbp
    0xffffffffa025309f granary::detail::EXECUTABLE_AREA+159: mov $0xffff,%bp
    0xffffffffa02530a3 granary::detail::EXECUTABLE_AREA+163: bswap %rbp
    0xffffffffa02530a6 granary::detail::EXECUTABLE_AREA+166: movb $0x63,0x0(%rbp)
    0xffffffffa02530aa granary::detail::EXECUTABLE_AREA+170: mov %rax,-0x13de79(%rip) # 0xffffffffa0115238
    0xffffffffa02530b1 granary::detail::EXECUTABLE_AREA+177: xor %eax,%eax
    0xffffffffa02530b3 granary::detail::EXECUTABLE_AREA+179: pop %rbp
    0xffffffffa02530b4 granary::detail::EXECUTABLE_AREA+180: retq

Original instructions:
  0xffffffffa011301f: movb $0x63,0xc8(%rax)
  0xffffffffa0113026: mov %rax,0x220b(%rip) # 0xffffffffa0115238
  0xffffffffa011302d: xor %eax,%eax
  0xffffffffa011302f: pop %rbp
  0xffffffffa0113030: retq

$4 = 72339067405152287

Basic block info:
  State: (nil)
  App:
    Code: 0xffffffffa011301f
    Num instructions: 5
  Code cache:
    Code: 0xffffffffa0253080
    Num instructions: 13
  Policy properties:
    Is in XMM context: 1
    Is in host context: 1
    Accesses user data: 1
    Return address in code cache: 1
  Num blocks in trace: 1
  Exception table entry: (nil)
  Policy ID: 255
  Instrumentation Function:
    irq_stack_union in section .data..percpu of /home/renzhen/linux-3.8/vmlinux
    __UNIQUE_ID_license0 in section .modinfo of /home/renzhen/granary/bin/granary.ko
}

(3) The remaining steps:
(gdb) p client::wp::NEXT_COUNTER_INDEX
$5 = {<std::__atomic_base> = {_M_i = 1}, }
(gdb) p client::wp::DESCRIPTORS[0]
$6 = {{lower_bound = 1187908736, upper_bound = 1187908864}, {return_address = 0xffffffffa025306d, next_free_descriptor = 0xffffffffa025306d}}

All the best to you :)

renzhengeek commented 10 years ago

Now the kmalloc breakpoint can be hit. In fact, kmalloc behaves like a macro function: it expands to a call to either __kmalloc or kmem_cache_alloc, depending on whether its first parameter is a constant value or a variable value.

gdb shows:
{
(gdb) c
Continuing.
[Switching to Thread 1]

Breakpoint 5, _ZN7granary21wrapped_function_implILNS_19function_wrapper_idE4942ELNS_15runtime_contextE0ELb0ENS_23custom_wrapped_functionEJEE5applyEP10kmem_cachej () at /home/renzhen/granary/clients/watchpoints/kernel/linux/wrappers.h:63
63 POINTER_WRAPPER({
(gdb) n
157 FUNCTION_WRAPPER(WRAP_CONTEXT, kmem_cache_alloc, (void *), (struct kmem_cache *cache, gfp_t gfp), {
(gdb)
63 POINTER_WRAPPER({
(gdb)
157 FUNCTION_WRAPPER(WRAP_CONTEXT, kmem_cache_alloc, (void *), (struct kmem_cache *cache, gfp_t gfp), {
(gdb)
63 POINTER_WRAPPER({
(gdb)
277 const uintptr_t masked_ptr(ptr & MASK_47_48);
(gdb)
308 if(is_watched_address(ptr)) {
(gdb)
63 POINTER_WRAPPER({
(gdb)
157 FUNCTION_WRAPPER(WRAP_CONTEXT, kmem_cache_alloc, (void *), (struct kmem_cache *cache, gfp_t gfp), {
(gdb)
293 return ptr | (~CLEAR_INDEX_MASK);
(gdb) p ptr
No symbol "ptr" in current context.
(gdb) n
157 FUNCTION_WRAPPER(WRAP_CONTEXT, kmem_cache_alloc, (void *), (struct kmem_cache *cache, gfp_t gfp), {
(gdb)
293 return ptr | (~CLEAR_INDEX_MASK);
(gdb)
157 FUNCTION_WRAPPER(WRAP_CONTEXT, kmem_cache_alloc, (void *), (struct kmem_cache *cache, gfp_t gfp), {
(gdb)
0xffffffffa022a06d in granary::detail::EXECUTABLE_AREA ()
(gdb)
Single stepping until exit from function _ZN7granary6detailL15EXECUTABLE_AREAE, which has no line number information.
granary::trace_log::addentry(unsigned char, granary::simple_machine_state) () at /home/renzhen/granary/granary/trace_log.cc:56
56 ) throw() {
(gdb)
614 { return atomic_fetch_add(&_M_i, i, m); }
(gdb)
56 ) throw() {
(gdb)
62 trace_log_item *prev(nullptr);
(gdb)
614 { return atomic_fetch_add(&_M_i, i, m); }
(gdb)
72 memcpy(&(item->state), state, sizeof *state);
(gdb)
64 NUM_TRACE_ENTRIES.fetch_add(1) % CONFIG_DEBUG_NUM_TRACE_LOG_ENTRIES]));
(gdb)
65 item->code_cache_addr = code_cache_addr;
(gdb) p item
No symbol "item" in current context.
}

All best for you ;)

pgoodman commented 10 years ago

Sorry for the late reply; I took the weekend off ;-)

Let's really simplify this. Get rid of the printk and the strcpy. Please use the following code for debugging:

static int testmod_init(void) {
  buf = (char *)kmalloc(sizeof(char) * 1, GFP_KERNEL);
  buf[1] = (char) 0xFF;  // overflow
  return 0;
}

Okay, I see that with the p-trace you've got way more basic blocks than I anticipated. Let's re-focus on the earliest executed blocks (big numbers), instead of the most recently executed blocks (small numbers).

The above code should ideally result in only one basic block. The basic block should look something like this (I've just put in semi-arbitrary register names for the sake of the example):

  mov $1, %rdi;  // arg1 = kmalloc size
  mov $..., %rsi; // arg2 = GFP_KERNEL

  // Call the wrapped version of kmalloc
  callq 0xfff.. <_ZN7granary21wrapped_function_...E4942ELNS...>

  // Copy the returned address into the buf variable.
  mov %rax, 0xfff....

  // Compute the effective address of &(buf[1]). This assumes
  // that rax contains &(buf[0]), and that rbx (or some other register)
  // is dead. If no dead registers can be found then some slightly
  // more complicated code is used to save/restore registers.
  lea 0x1(%rax), %rbx;

  // Test bit 48 of %rbx to see if it's a watched address
  bt $0x30,%rbx
  jb not_watched

  // Fall-through: address is watched. Call out to a function that
  // will check for an overflow
  callq 0xffffffffa01564a0

  // Now emulate the original address by "unwatching" it. This sets the
  // 16 high-order bits to 1.
  bswap %rbx
  mov $0xffff,%bx
  bswap %rbx
  // Fall through to execute the instruction.

not_watched:
  movb $0xFF, (%rbx)  // Emulated instruction.

Okay, so, hopefully that little bit of code explains the general structure of the instrumentation. What we care about is adding a breakpoint on that callq 0xffffffffa01564a0 (the address will vary each time the module is loaded). If the breakpoint is hit, then the address is watched, and the called code should go on to check whether there's an overflow.
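The watched-address test and the "unwatching" step in that block can be sketched in C++ as follows. This is only an illustration of the bit manipulation: the names `is_watched` and `unwatch` are chosen here for the example, and the address values are illustrative.

```cpp
#include <cstdint>

// Bit 48 tells the two kinds of address apart: it is set in ordinary kernel
// addresses and clear in watched ones (read off the `bt $0x30` /
// `jb not_watched` pair in the sketch above).
constexpr std::uint64_t kDistinguishingBit = 1ULL << 48;

// Mirrors `bt $0x30, %reg; jb not_watched`: the jump is taken when the bit
// is set, so a watched address is one where the bit is clear.
constexpr bool is_watched(std::uint64_t addr) {
    return (addr & kDistinguishingBit) == 0;
}

// Mirrors `bswap; mov $0xffff, %bx; bswap`: force the 16 high-order bits
// to 1, which recovers a canonical kernel address.
constexpr std::uint64_t unwatch(std::uint64_t addr) {
    return addr | (0xFFFFULL << 48);
}

static_assert(!is_watched(0xffffffffabcdef00ULL), "plain kernel address");
static_assert(is_watched(0x0022ffffabcdef00ULL), "tagged address");
static_assert(unwatch(0x0022ffffabcdef00ULL) == 0xffffffffabcdef00ULL,
              "unwatching restores the high bits");
```

The byte-swap trick in the assembly exists only because x86 can write the low 16 bits of a register directly but not the high 16; in C++ a single OR does the same job.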

There are a few ways we can go about adding a breakpoint.

The first way is to indirectly add the breakpoint at the actual routine that does the checking. We'll do this by loading granary, adding the breakpoint with b granary_bounds_check_1, then initializing granary, and then loading testmod. This function is a generic bounds checker for a single-byte (hence the 1) read/write. There are equivalent versions for 2-, 4-, and 8-byte reads/writes.

An alternative way to add breakpoints into the actual code cache code is to force Granary to tell you about a newly translated basic block. In this case, we know that there's going to be a memory write, so we'll modify the source code at https://github.com/Granary/granary/blob/master/clients/watchpoints/clients/bounds_checker/instrument.cc#L219 by adding the following line of code:

granary_do_break_on_translate = true;

Then compile, load, and initialize Granary, then load testmod. What you should observe is that when testmod is loaded, a breakpoint on the function granary_break_on_translate is automatically hit. The first (and only) argument to the function (addr) is the address of the basic block in the code cache. In GDB, you should be able to do p-bb addr. If that doesn't work, then do:

(gdb) n
(gdb) p-bb addr

This extra n (step to next line) is because, depending on how things have been compiled, the breakpoint might be at a location where the first argument is uninitialized (from GDB's perspective). The mechanism that makes this work is found in: https://github.com/Granary/granary/blob/master/granary/code_cache.cc#L207. Modifying granary_do_break_on_translate in the code can be a good way to do "narrow down" debugging, where you insert conditions in the code that apply/don't apply the instrumentation to certain places until you've narrowed things down to the cases where you hit a bug. Then for those cases you set the variable to true, to inspect them manually.

That should give you the print-out of the instructions in the instrumented and original basic block. Then you can add breakpoints as you see fit, e.g. adding a breakpoint onto the callq 0xfff... that was inside the watchpoint check.

When you've added all of your breakpoints, run c (continue) from GDB, and hopefully you will see your breakpoints hit.

The next question is really: why isn't it invoking client::wp::visit_overflow (i.e. why isn't it detecting an overflow)? There could be an error in the logic of the generic bounds checker functions. To determine that, you'd need to step through the instructions.

Here's how the checker works:
1) The bounds checker code first extracts the "index" of the descriptor (https://github.com/Granary/granary/blob/master/clients/watchpoints/clients/bounds_checker/bounds_checkers.asm#L77). The index is stored in the high-order 15 bits of the watched address.
2) Then, we get the address of the descriptor, &(client::wp::DESCRIPTORS[index]) (https://github.com/Granary/granary/blob/master/clients/watchpoints/clients/bounds_checker/bounds_checkers.asm#L82).
3) Then, we compare the low-order 32 bits of the watched address with the low-order 32 bits of the "base" address of the object, which are stored in the descriptor (https://github.com/Granary/granary/blob/master/clients/watchpoints/clients/bounds_checker/bounds_checkers.asm#L87). This is to detect a buffer underflow. Because only 32 bits are compared, the bounds checker won't actually work across a 4 GB boundary. This tradeoff was made for space concerns. In practice though, storing a 64-bit base and limit address is fine.
4) If we didn't detect an underflow, then compare the low-order 32 bits of the watched address against the stored 32 bits of the "limit" address of the object (https://github.com/Granary/granary/blob/master/clients/watchpoints/clients/bounds_checker/bounds_checkers.asm#L91).
5) If an overflow or underflow is detected, then jmp to https://github.com/Granary/granary/blob/master/clients/watchpoints/clients/bounds_checker/bounds_checkers.asm#L25.
6) That function should do a function call to client::wp::visit_overflow (https://github.com/Granary/granary/blob/master/clients/watchpoints/clients/bounds_checker/bounds_checkers.asm#L39).
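The steps above can be sketched in C++ roughly as follows. This is not the code in bounds_checkers.asm: `BoundDescriptor`, `CheckResult` and `check_bounds` are hypothetical names, the comparison is unsigned for simplicity, and subtracting the access size from the limit is an assumption based on there being separate 1-, 2-, 4- and 8-byte checkers.

```cpp
#include <cstdint>

// Hypothetical stand-in for the real descriptor; only the two bound fields
// mentioned in the steps are modelled.
struct BoundDescriptor {
    std::uint32_t lower_bound;  // low 32 bits of the object's base address
    std::uint32_t upper_bound;  // low 32 bits of the object's limit address
};

enum class CheckResult { kInBounds, kUnderflow, kOverflow };

// Sketch of steps 1-5 for an access of `size` bytes at `watched_addr`.
// `table` plays the role of client::wp::DESCRIPTORS.
inline CheckResult check_bounds(const BoundDescriptor *table,
                                std::uint64_t watched_addr,
                                std::uint32_t size) {
    // 1) The descriptor index lives in the 15 high-order bits.
    const std::uint64_t index = watched_addr >> 49;
    // 2) Locate the descriptor.
    const BoundDescriptor &desc = table[index];
    // 3) Compare the low 32 bits against the stored base.
    const std::uint32_t low = static_cast<std::uint32_t>(watched_addr);
    if (low < desc.lower_bound) return CheckResult::kUnderflow;
    // 4) Compare against the limit, leaving room for the whole access.
    if (low > desc.upper_bound - size) return CheckResult::kOverflow;
    // 5/6) The real code jumps to a handler that calls visit_overflow on
    // either failure; here the caller inspects the result instead.
    return CheckResult::kInBounds;
}
```

With a one-byte object, a one-byte access at the base is in bounds and the same access one byte further on is an overflow, which is the case the simplified testmod is meant to trigger.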

renzhengeek commented 10 years ago

Thanks very much ;)

(1) I have already tried your simplified testmod.

(2) GDB does hit the wrapper function, but stepping through it seems useless, because a lot of data cannot be printed; I only get "No symbol "***" in current context.". For example:
{
(gdb) p-wrapper granary::DETACH_ID_kmalloc
Function wrapper kmalloc (5226):
  Original address: 0xffffffff811129b0
  App Wrapper address: 0xffffffffa01e77b0
  Host Wrapper address: (nil)
(gdb) p-wrapper granary::DETACH_ID_kmem_cache_alloc
Function wrapper kmem_cache_alloc (4942):
  Original address: 0xffffffff811125f0
  App Wrapper address: 0xffffffffa0207280
  Host Wrapper address: (nil)
(gdb) b *0xffffffffa0207280
Breakpoint 3 at 0xffffffffa0207280: file /home/renzhen/granary/clients/watchpoints/kernel/linux/wrappers.h, line 63.
(gdb) c
Continuing.

Breakpoint 3, _ZN7granary21wrapped_function_implILNS_19function_wrapper_idE4942ELNS_15runtime_contextE0ELb0ENS_23custom_wrapped_functionEJEE5applyEP10kmem_cachej () at /home/renzhen/granary/clients/watchpoints/kernel/linux/wrappers.h:63
63 POINTER_WRAPPER({
(gdb) n
157 FUNCTION_WRAPPER(WRAP_CONTEXT, kmem_cache_alloc, (void *), (struct kmem_cache *cache, gfp_t gfp), {
(gdb)
63 POINTER_WRAPPER({
(gdb)
157 FUNCTION_WRAPPER(WRAP_CONTEXT, kmem_cache_alloc, (void *), (struct kmem_cache *cache, gfp_t gfp), {
(gdb)
63 POINTER_WRAPPER({
(gdb)
277 const uintptr_t masked_ptr(ptr & MASK_47_48);
(gdb)
308 if(is_watched_address(ptr)) {
(gdb)
63 POINTER_WRAPPER({
(gdb)
157 FUNCTION_WRAPPER(WRAP_CONTEXT, kmem_cache_alloc, (void *), (struct kmem_cache *cache, gfp_t gfp), {
(gdb)
293 return ptr | (~CLEAR_INDEX_MASK);
(gdb) p ptr
No symbol "ptr" in current context.
....
}

(3) After 'b granary_bounds_check_1', gdb does hit the breakpoint, but I cannot step to where the bounds are being checked.
{
Breakpoint 2, granary_bounds_check_1 () at clients/watchpoints/clients/bounds_checker/bounds_checkers.asm:118
118
(gdb) n
granary_bounds_check_1 () at clients/watchpoints/clients/bounds_checker/bounds_checkers.asm:119
119 /// Define a bounds checker and splat the rest of the checkers.
(gdb)
granary_bounds_check_1 () at clients/watchpoints/clients/bounds_checker/bounds_checkers.asm:120
120 #define DEFINE_CHECKERS(reg, rest) \
(gdb)
granary_bounds_check_1 () at clients/watchpoints/clients/bounds_checker/bounds_checkers.asm:121
121 DEFINE_CHECKER(reg) \
(gdb) n
126 #define DEFINE_CHECKER(reg) \
(gdb) n
128 BOUNDS_CHECKER(reg, 2) \
(gdb) n
130 BOUNDS_CHECKER(reg, 8) \
(gdb)
137 GLOBAL_LABEL(granary_last_bounds_checker:)
(gdb)
138 END_FILE
}

So, I am wondering whether these two problems are common for everyone, or whether they happen only to me because I am doing something wrong.

pgoodman commented 10 years ago

I suggest disassembling granary_bounds_check_1 and then manually adding breakpoints at the addresses of the instructions doing the comparisons. Then you'll be able to see what's being compared, and compare that to what is in DESCRIPTORS[0].

renzhengeek commented 10 years ago

Hi, after adding granary_do_break_on_translate = true;, I compiled, loaded and initialised granary, and then ran insmod testmod. gdb hit granary_break_on_translate, but when I do n and then p-bb addr, I get the dreaded message "No symbol "addr" in current context." Copied and pasted from gdb:
{
(gdb) c
Continuing.
[New Thread 2]
[Switching to Thread 2]

Breakpoint 4, granary_break_on_translate () at /home/renzhen/granary/granary/code_cache.cc:32
32 void granary_break_on_translate(void *addr) {
(gdb) n
33 ASM("nop;");
(gdb) p-bb addr
No symbol "addr" in current context.
}

Thank you :)

renzhengeek commented 10 years ago

Hi pgoodman, I apologize for my poor assembly language. About how the checker works: 1) I am not clear about this asm code. The way the index is computed does not seem to be the same as the scheme shown in the article "debugging with behavioral watchpoints", is it? Also, I don't understand how '&(DESCRIPTORS[0])' is computed by the instruction 'lea DESCRIPTORS(%rip), %rsi;'. I always thought that &(DESCRIPTORS[0]) has the same address as DESCRIPTORS.

Could you give some explanation?

The code block:
{
COMMENT(Get the index into RDX.) @N@\
mov %rdi, %rdx; COMMENT(Copy the watched address into RDX.)@N@\
shr $49, %rdx; COMMENT(Shift the index into the low 15 bits.)@N@\
shl $4, %rdx; COMMENT(Scale the index by sizeof(bound_descriptor).)@N@\
@N@\
COMMENT(Get a pointer to the descriptor. The descriptor table is an array) @N@\
COMMENT(of bound_descriptor structures, each of which is 16 bytes.) @N@\
lea DESCRIPTORS(%rip), %rsi; @N@\
add %rdx, %rsi; COMMENT(Add the scaled index to &(DESCRIPTORS[0]).)@N@\
}

Thank you:)

pgoodman commented 10 years ago

That's okay. You should be able to do p-bb $rdi instead of n; p-bb addr. The reason is that addr is the first argument, and on x86-64 Linux the first integer or pointer argument is passed in the register %rdi. I suggest going to Agner Fog's website and looking for a document that describes calling conventions or register usage conventions or ABIs. The document can be found here: http://www.agner.org/optimize/calling_conventions.pdf.

pgoodman commented 10 years ago

So, DESCRIPTORS is an array. Array variables "decay" into pointers, and so &(DESCRIPTORS[0]) and DESCRIPTORS can be mostly interchanged. I use the former to be explicit about the address of the first element.

The lea instruction, which stands for "load effective address", computes a memory address, but does not access that memory. DESCRIPTORS(%rip) is telling lea to find the address of the DESCRIPTORS symbol (which will be the address of the first element of the array), relative to the instruction pointer (%rip).

The above code does the following: &(DESCRIPTORS[((uintptr_t) addr) >> 49]). In the paper, we describe a watchpoint descriptor's index as having two parts: the counter index (high-order 15 bits), and the inherited index (some sequence of low order bits). For simplicity, the bounds checker doesn't make use of the inherited index.
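To make the correspondence concrete, here is a small C++ sketch that performs the same shifts and add as the assembly and compares the result with ordinary array indexing. `Descriptor16` is a hypothetical placeholder type whose only relevant property is its 16-byte size; the function names are made up for the example.

```cpp
#include <cstdint>

// Hypothetical 16-byte descriptor, matching the size named in the assembly
// comments; the field names are placeholders.
struct Descriptor16 {
    std::uint64_t a;
    std::uint64_t b;
};
static_assert(sizeof(Descriptor16) == 16, "shl $4 assumes 16-byte entries");

// What the assembly does, one instruction per line.
inline const Descriptor16 *descriptor_by_shifts(const Descriptor16 *table,
                                                std::uint64_t addr) {
    std::uint64_t rdx = addr;  // mov %rdi, %rdx
    rdx >>= 49;                // shr $49, %rdx
    rdx <<= 4;                 // shl $4, %rdx
    // lea DESCRIPTORS(%rip), %rsi
    const unsigned char *rsi = reinterpret_cast<const unsigned char *>(table);
    rsi += rdx;                // add %rdx, %rsi
    return reinterpret_cast<const Descriptor16 *>(rsi);
}

// The same computation written as C++ indexing.
inline const Descriptor16 *descriptor_by_index(const Descriptor16 *table,
                                               std::uint64_t addr) {
    return &table[addr >> 49];
}
```

Both functions return the same pointer for any address whose index is within the table; the shift by 4 is just the multiplication by sizeof(element) that the compiler would otherwise insert for `table[i]`.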

pgoodman commented 10 years ago

By the way, one suggestion for when you're trying to step through an assembly function (like one of the bounds checkers) is to add a breakpoint at the address of an instruction, and then in gdb, issue the following command repeatedly until your terminal switches into two panes (one for commands, one showing you the assembly instructions): layout next. Then, you can single-step instructions using ni (next instruction), inspect individual registers (e.g. %rsi) by doing p/x $rsi, and if you want to see all registers and the flags you can do info reg.

renzhengeek commented 10 years ago

That's clear. Thank you very much.

Another question is about the difference between the two bounds checks, one in bounds_checkers.asm, the other in visit_overflow.

1) In bounds_checkers.asm: (%rsi) is the same as DESCRIPTORS[index], and %edi is the lower 32 bits of the watched address, isn't it? lower_bound is a member of struct bound_descriptor, so how can we compare them and determine whether it is an underflow or an overflow?

{
COMMENT(Check the lower bounds against the low 32 bits of the watched address.) @N@\
cmp (%rsi), %edi; @N@\
jl .CAT(Lgranaryunderflow, size); @N@\
}

2) In visit_overflow: the actual comparison is done here.

Could you give some explanation of how it works?

Thank you:)

pgoodman commented 10 years ago

When we take a look at https://github.com/Granary/granary/blob/master/clients/watchpoints/clients/bounds_checker/instrument.h we see that lower_bound is the first thing in the struct (and so the address of lower bound and the address of the struct are the same) and upper_bound is 4 bytes offset from the address of the beginning of the struct. You can check this from GDB by doing:

p/x &(client::wp::DESCRIPTORS[0])
p/x &(client::wp::DESCRIPTORS[0].lower_bound)
p/x &(client::wp::DESCRIPTORS[0].upper_bound)
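The same layout facts can be written down as compile-time checks. The struct below is a hypothetical mirror of the descriptor for illustration only (the real definition is in instrument.h and may differ in detail), and the 16-byte size assumes an x86-64 target with 8-byte pointers.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of the descriptor layout, not the real definition.
struct BoundDescriptorLayout {
    std::uint32_t lower_bound;   // read by the checker as (%rsi)
    std::uint32_t upper_bound;   // read by the checker as 0x4(%rsi)
    const void *return_address;  // kept for error reporting
};

static_assert(offsetof(BoundDescriptorLayout, lower_bound) == 0,
              "the struct and its first member share an address");
static_assert(offsetof(BoundDescriptorLayout, upper_bound) == 4,
              "upper_bound sits 4 bytes into the struct");
static_assert(sizeof(BoundDescriptorLayout) == 16,
              "matches the 16-byte stride used when indexing the table");
```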

The watched address will be located in the %rdi register, which is a 64-bit general purpose register. When we take a look at this chart (http://sandpile.org/x86/gpr.htm), we can see that the low 32 bits of the %rdi register can be accessed with the %edi register.

So, we compare the low 32 bits of the watched address (%edi) with the low 32 bits of the address as returned by kmalloc (or another allocator, see the initialization here: https://github.com/Granary/granary/blob/master/clients/watchpoints/clients/bounds_checker/instrument.cc#L154), which is stored in (%rsi) (i.e. client::wp::DESCRIPTORS[addr >> 49].lower_bound).

Let's make this a bit more concrete. Suppose that the actual call to kmalloc(0x10, GFP_KERNEL) returns the address 0xffffffffabcdef00. In the wrapper for __kmalloc (https://github.com/Granary/granary/blob/master/clients/watchpoints/clients/bounds_checker/kernel/linux/wrappers.h#L34), we then have ret = 0xffffffffabcdef00, size = 0x10, and ret_address = ... (the return address after the call of __kmalloc; this is here for future error reporting).

Then we do add_watchpoint(ret, ret, size, ret_address). This says "add a watchpoint to ret (first argument is an in/out parameter), and initialize the watchpoint descriptor by passing along ret, size, and ret_address". This connects to https://github.com/Granary/granary/blob/master/clients/watchpoints/clients/bounds_checker/kernel/linux/wrappers.h#L34 through a series of overcomplicated template instantiations, where base_address = ret.

Let's say we allocate a descriptor at index 0x11 (this assignment happens via some reference arguments; again, I am not pleased with how I designed this). We store the low 32 bits (0xabcdef00) of base_address into the lower_bound field, and then store the low 32 bits of base_address + size (0xabcdef10) into upper_bound.

Our index 0x11 gets mangled into the base_address (i.e. ret) by add_watchpoint by doing:

watched_address = (unwatched_address & (~0xFFFF << 48)) | (index << 49);
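A C++ sketch of this tagging step follows, under one stated assumption about the intended grouping: the 16 high-order bits are cleared first and the index is then placed starting at bit 49. Read literally with 64-bit operands, `~0xFFFF << 48` evaluates to 0 and would discard the whole address, so the mask is written here as `~(0xFFFFULL << 48)`. The names `add_index` and `index_of` are made up for the example and are not Granary functions.

```cpp
#include <cstdint>

// Assumed intent: clear bits 48-63, then store the 15-bit index in bits
// 49-63. Bit 48 stays 0, which is what marks the address as watched.
constexpr std::uint64_t add_index(std::uint64_t unwatched, std::uint64_t index) {
    return (unwatched & ~(0xFFFFULL << 48)) | (index << 49);
}

// Recover the index from a tagged address.
constexpr std::uint64_t index_of(std::uint64_t watched) {
    return watched >> 49;
}

// Index 0x11 shifted to bit 49 shows up as 0x0022 in the top 16 bits.
static_assert(add_index(0xffffffffabcdef00ULL, 0x11) == 0x0022ffffabcdef00ULL,
              "tagging the example address");
static_assert(index_of(0x0022ffffabcdef00ULL) == 0x11, "round trip");
```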

Okay, so, how does the actual bounds checking work? Lower and upper bound partially define (by being only 32 bits each) the min and max addresses that we can access. In practice, we don't need the high-order 32 bits.

Let's say we're doing an 8-byte read of the address 0x0022ffffabcdef0A. We've got 0x22 up there because our index 0x11 has been shifted into the 49th bit position. We first compare against the lower bound. Here we want to make sure that 0xabcdef0A >= 0xabcdef00. This is true, so there's no underflow and we proceed to compare against the upper bound. We need to be careful to ensure that all 8 bytes of our memory access fall within the bounds. So we subtract the access size, 8, from the stored upper bound (0x4(%rsi), now in %rsi), which ends up being 0xabcdef08. This is the highest address that can be read from/written to with an 8-byte memory operation. We then check if 0xabcdef0A > 0xabcdef08, and if so then we have an overflow. We happen to have an overflow in this case.
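The arithmetic in this example can be checked at compile time. The sketch below only models the upper-bound comparison, with `overflows` as a hypothetical helper name; it also shows that smaller accesses at the same address would still fit.

```cpp
#include <cstdint>

// True when an access of `size` bytes starting at `addr_low` (low 32 bits)
// runs past `upper_bound` (low 32 bits of base + allocation size).
constexpr bool overflows(std::uint32_t addr_low, std::uint32_t upper_bound,
                         std::uint32_t size) {
    return addr_low > upper_bound - size;
}

constexpr std::uint32_t kUpper = 0xabcdef10u;  // 0xabcdef00 + 0x10
constexpr std::uint32_t kAddr = 0xabcdef0Au;   // low 32 bits of 0x0022ffffabcdef0A

static_assert(!overflows(kAddr, kUpper, 1), "touches byte 0x0A only");
static_assert(!overflows(kAddr, kUpper, 2), "touches bytes 0x0A-0x0B");
static_assert(!overflows(kAddr, kUpper, 4), "touches bytes 0x0A-0x0D");
static_assert(overflows(kAddr, kUpper, 8), "bytes 0x0A-0x11 cross the limit 0x10");
```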

renzhengeek commented 10 years ago

Thank you very much.

There are still a few things I cannot understand. What is the meaning of "mangle"? The word 'mangle' often occurs in the source code. I think it means combining information together, like "encoding" things, doesn't it? I can hardly understand why we want to mangle like this. I always thought that we distinguish an unwatched address from a watched address by setting bit 63 to 0, which violates the canonical address form.

Could you show me the purpose of doing the following: 'watched_address = (unwatched_address & (~0xFFFF << 48)) | (index << 49);'?

Suppose that unwatched_address is 0x0022ffffabcdef00; then the watched_address is 0x0. I will debug further there to verify this.

Thank you :)

renzhengeek commented 10 years ago

This problem is very strange.

In testmod, we kmalloc 1 byte of memory, but in DESCRIPTORS[0], upper_bound - lower_bound = 8 bytes.

{
(gdb) p client::wp::DESCRIPTORS[0]
$4 = {{lower_bound = 2078902352, upper_bound = 2078902360}, {return_address = 0xffffffffa029a06d, next_free_descriptor = 0xffffffffa029a06d}}
}

No wonder the out-of-bounds access is not detected when we access buf[1]. But I also tried buf[10]=0xff, and it is still not detected!

What do you think about this? I will investigate further to narrow down the bug.

Thank you ;)

pgoodman commented 10 years ago

Great catch on me using an unconditional branch instead of a conditional one. I would be interested in seeing if you start catching real live kernel bugs. I have merged your pull request.

I'm not sure why the lower bound and upper bound are off by 8. It is worth adding a breakpoint into the function that initializes lower_bound and upper_bound. To make sure that GDB can access a variable, e.g. var, make sure the following appears in the code after where you will place your breakpoint: USED(var);. I also suggest modifying the __kmalloc wrapper, adding a USED(size) after the call to add_watchpoint, so that you can bt (backtrace) to see what led to your breakpoint, then do f N (e.g. f 5) to go down to the fifth stack frame and print the local size variable using p/d size (print size as a decimal number).

The USED is sort of a hack (https://github.com/Granary/granary/blob/master/granary/pp.h#L105) to really force a variable to remain live, and potentially have its value stored in memory on the stack, for the sake of GDB-based debugging. Sometimes, because of compiler optimizations, you won't be able to inspect the value of a variable after its last use. USED is a way of creating another use, and serves to document that it's for debugging.
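For illustration, here is one way such a macro could be written. `KEEP_LIVE` is a hypothetical stand-in and is not Granary's actual USED definition (see the pp.h link for that); it relies on the GCC/Clang inline-asm extension, and `scaled_size` is just a made-up function to show where the macro would go.

```cpp
// Hypothetical macro, not Granary's USED. The empty asm statement takes
// `var` as a memory operand, so the compiler has to keep the value alive
// (and stored somewhere addressable) up to this point.
#define KEEP_LIVE(var) __asm__ __volatile__("" : : "m"(var) : "memory")

inline unsigned long scaled_size(unsigned long count, unsigned long elem_size) {
    unsigned long total = count * elem_size;  // value we want to inspect in GDB
    // A breakpoint placed above these lines should usually still be able to
    // print `count` and `total`, because both are used again here.
    KEEP_LIVE(count);
    KEEP_LIVE(total);
    return total;
}
```

The macro does not change what the function computes; it only adds an artificial use so that optimizations are less likely to discard the variable before the point of interest.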

To address your earlier question: we actually distinguish watched and unwatched addresses by checking bit 48, not bit 63. So, step-by-step:

Mangle is a term that I inherited from DynamoRIO, and then re-purposed for many uses.

renzhengeek commented 10 years ago

Thank you very much ;)