kubo / funchook

Hook function calls by inserting jump instructions at runtime

Replace a function without reference to it #27

Open reedjosh1 opened 3 years ago

reedjosh1 commented 3 years ago

I need to replace a function in a shared object library (ruby). I'm not sure how to get a function handle for the particular function so that I can replace it.

It's buried in the SO.

Can you please point me in the right direction? I can find the function in the SO via readelf, but the address doesn't match at runtime.

kubo commented 3 years ago

It's buried in the SO.

Does this mean that dlsym(RTLD_DEFAULT, "function_name") returns NULL?

I can find the function in the SO via readelf, but the address doesn't match at runtime.

The address is relative to the base address. You can get it if the platform is Linux.

  1. open /proc/self/maps and search for the shared library name.
  2. search for the base address of the ruby library by iterating the linked list of loaded modules.

    #include <stdio.h>
    #include <string.h>
    #include <link.h>
    
    int main(void)
    {
        struct link_map *lm;
        /* Symbol value reported by readelf for the target function. */
        size_t relative_function_address = ???;
    
        /* Walk the dynamic linker's list of loaded modules. */
        for (lm = _r_debug.r_map; lm != NULL; lm = lm->l_next) {
            if (strstr(lm->l_name, "libruby.so") != NULL) {
                printf("base address: %lx\n", (unsigned long)lm->l_addr);
                printf("function address: %lx\n",
                       (unsigned long)(lm->l_addr + relative_function_address));
            }
        }
        return 0;
    }
  3. use dlinfo to get the link_map structure.
  4. use dladdr1 to get the link_map structure.
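
Option 1 above can be sketched roughly as follows. This is a minimal sketch, not funchook code: base_from_proc_maps is a hypothetical helper name, and it assumes the lowest mapping of the library corresponds to its load base, which is the usual layout for a shared object on Linux.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of funchook): return the start address of
 * the first /proc/self/maps entry whose line contains `name`, or 0 if no
 * entry matches. Entries are listed in ascending address order, so the
 * first match is the lowest mapping of that file.
 */
static uintptr_t base_from_proc_maps(const char *name)
{
    char line[4096];
    uintptr_t base = 0;
    FILE *fp = fopen("/proc/self/maps", "r");

    if (fp == NULL)
        return 0;

    while (fgets(line, sizeof(line), fp) != NULL) {
        unsigned long start;

        if (strstr(line, name) == NULL)
            continue;
        /* Each line starts with "start-end perms offset dev inode path". */
        if (sscanf(line, "%lx-", &start) == 1) {
            base = (uintptr_t)start;
            break;
        }
    }
    fclose(fp);
    return base;
}
```

Calling base_from_proc_maps("libruby.so") inside the ruby process should then give a value comparable to lm->l_addr in the loop above, under the stated assumption.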

reedjosh1 commented 3 years ago

Thank you so much! I'm trying this right now and will get back to you as soon as I fully wrap my head around this.

Does this mean that dlsym(RTLD_DEFAULT, "function_name") returns NULL?

Yes, just verified that.

I can use your search method above to find the address of the function in question, and I can even call it, but putting the address into patch_via_funchook() (our internal helper for funchook) doesn't seem to replace it.

void patch_via_funchook(void *original_function, void *hook_function) {
    VALUE funchook_module_wrapper = rb_define_module("Funchook");
    funchook_path = rb_iv_get(funchook_module_wrapper, "@path");

    void *funchook_lib_handle;
    void *funchook_reference, *(*funchook_create)(void);
    int prepareResult, (*funchook_prepare)(void *, void **, void *);
    int installResult, (*funchook_install)(void *, int);

    funchook_lib_handle = dlopen(StringValueCStr(funchook_path), RTLD_NOW | RTLD_GLOBAL);

    /* Load the funchook methods we need */
    funchook_create = (void *(*)(void))dlsym(funchook_lib_handle, "funchook_create");
    funchook_prepare = (int (*)(void *, void **, void *))dlsym(funchook_lib_handle, "funchook_prepare");
    funchook_install = (int (*)(void *, int))dlsym(funchook_lib_handle, "funchook_install");

    funchook_reference = (void *)(*funchook_create)();

    prepareResult = (*funchook_prepare)(funchook_reference, (void **)original_function, hook_function);
    installResult = (*funchook_install)(funchook_reference, 0);
}

That is probably because, as far as I can tell, our internal helper also relies on dlsym.

I'm researching further and will update again shortly. Once again, thanks for your help.

reedjosh1 commented 3 years ago

Hey, I reviewed what we're doing, and while it is a bit convoluted, dlsym is only used to get handles for funchook-specific functions.

Our patch_via_funchook does work for replacing other functions, and matches the example usage in funchook's readme.

Still, the address of rb_hash_key_str does not seem to be working as a handle for method replacement.

The method in question can be found here: https://github.com/ruby/ruby/blob/5dde13e5ce7236d1de428f6a74f1043c6893bacf/hash.c#L2858

I'm not sure if I'm missing something in these statements, but I think these are just other ways to get an address?

use dlinfo to get the link_map structure. use dladdr1 to get the link_map structure.

The address is what funchook needs to replace a method, right?

Please advise further, and thanks again!

kubo commented 3 years ago

Still, the address of rb_hash_key_str does not seem to be working as a handle for method replacement.

Could you explain this? Does funchook_prepare fail with an error, or does it succeed but rb_hash_key_str isn't hooked?

I'm not sure if I'm missing something in these statements, but I think these are just other ways to get an address?

Yes, they are other ways to find the base address of the ruby library.
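
For reference, the dlinfo and dladdr1 variants might look like the sketch below. Both calls are glibc extensions; base_from_dlinfo and base_from_dladdr1 are hypothetical helper names chosen for illustration, not funchook functions.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* dlinfo and dladdr1 are glibc extensions */
#endif
#include <dlfcn.h>
#include <link.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: base address of the library behind a dlopen handle. */
static uintptr_t base_from_dlinfo(void *handle)
{
    struct link_map *lm = NULL;

    /* RTLD_DI_LINKMAP stores a pointer to the handle's link_map entry. */
    if (dlinfo(handle, RTLD_DI_LINKMAP, &lm) != 0 || lm == NULL)
        return 0;
    return (uintptr_t)lm->l_addr;
}

/* Hypothetical helper: base address of the library that contains `addr`. */
static uintptr_t base_from_dladdr1(const void *addr)
{
    Dl_info info;
    struct link_map *lm = NULL;

    /* dladdr1 returns 0 when `addr` is not inside any loaded object. */
    if (dladdr1(addr, &info, (void **)&lm, RTLD_DL_LINKMAP) == 0 || lm == NULL)
        return 0;
    return (uintptr_t)lm->l_addr;
}
```

base_from_dlinfo needs a handle from dlopen, while base_from_dladdr1 only needs some address known to lie inside the library, for example the address of an exported function such as rb_p. On older glibc versions the program may need to be linked with -ldl.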

reedjosh1 commented 3 years ago

Could you explain this? Does funchook_prepare fail with an error, or does it succeed but rb_hash_key_str isn't hooked?

prepare and install both succeed, but the function in question isn't hooked.

And I realize I should give you more info to be able to reproduce/investigate, so here's a better rundown.

Here is how I get the original (unrelocated) addresses of the functions in question.

➜  lib readelf -Ws libruby.so | grep -i rb_hash_key_str
  6433: 00000000001085c0    37 FUNC    LOCAL  DEFAULT   13 rb_hash_key_str
➜  lib readelf -Ws libruby.so | grep -i rb_p$
   362: 000000000011c4b0   247 FUNC    GLOBAL DEFAULT   13 rb_p
➜  lib readelf -Ws libruby.so | grep -i rb_hash_aset
  1050: 0000000000101ad0   341 FUNC    GLOBAL DEFAULT   13 rb_hash_aset
  7620: 0000000000101ad0   341 FUNC    GLOBAL DEFAULT   13 rb_hash_aset
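
The Bind column offers a likely explanation for dlsym returning NULL: rb_hash_key_str is LOCAL, and dlsym only resolves symbols exported through the dynamic symbol table. A minimal sketch of the same effect follows, where local_helper and is_visible_to_dlsym are hypothetical names used for illustration and a dynamically linked glibc program is assumed.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* RTLD_DEFAULT is a GNU extension */
#endif
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical stand-in for a LOCAL (static) symbol such as rb_hash_key_str. */
static int local_helper(int x)
{
    return x + 1;
}

/* Returns 1 if dlsym can resolve `name` in the global scope, 0 otherwise. */
static int is_visible_to_dlsym(const char *name)
{
    return dlsym(RTLD_DEFAULT, name) != NULL;
}
```

Under these assumptions, is_visible_to_dlsym("local_helper") is expected to return 0 even though local_helper can be called normally, while an exported libc function such as puts resolves.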

And I just cleaned up and migrated the body of our funchook helper function into one function:

static int install_hooks() {

    // Original function addresses.
    size_t relative_function_address = 0x00000000001085c0; // rb_hash_key_str
    // size_t relative_function_address = 0x0000000000101ad0; // rb_hash_aset
    // size_t relative_function_address = 0x000000000011c4b0;  // rb_p

    // Compute the runtime address of the function from the library's base address.
    struct link_map *lm;
    size_t new_addr = 0;

    for (lm = _r_debug.r_map; lm != NULL; lm = lm->l_next) {
        if (strstr(lm->l_name, "libruby.so") != NULL) {
            new_addr = lm->l_addr + relative_function_address;
            printf("base address: %lx\n", lm->l_addr);
            printf("function address: %lx\n", new_addr);
        }
    }
    if (new_addr == 0) {
        return -1; // libruby.so was not found among the loaded modules.
    }

    // Create reference to original function.
    typedef VALUE func(VALUE);
    func* rb_hash_key_str_orig = (func*)new_addr;

    funchook_t *funchook_reference = funchook_create();

    int prepareResult = funchook_prepare(funchook_reference, (void **)&rb_hash_key_str_orig, rb_hash_key_str_hook);
    int installResult = funchook_install(funchook_reference, 0);

    printf("Hooking results...\n");
    printf("Prepare: %d\n", prepareResult);
    printf("Install: %d\n", installResult);
    fflush(stdout);

    return 0;
}

Where the hook looks like:

VALUE rb_hash_key_str_hook(VALUE key) {
    printf("asdf2\n"); fflush(stdout);
    return key;
}

The output from this is:

base address: 7f5f5239d000
function address: 7f5f524a55c0
Hooking results...
Prepare: 0
Install: 0

So, that's how I now know that it succeeds, but unfortunately the behavior of rb_hash_key_str remains unchanged, and I do not get flooded with "asdf2\n".
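
As a sanity check on the output above, the printed function address is the library's base address plus the symbol value reported by readelf. A small sketch of that arithmetic, with runtime_address as a hypothetical helper name:

```c
#include <stdint.h>

/* Runtime address = load base of the library + symbol value from readelf. */
static uint64_t runtime_address(uint64_t base, uint64_t symbol_value)
{
    return base + symbol_value;
}
```

With the values from this run, runtime_address(0x7f5f5239d000, 0x1085c0) gives 0x7f5f524a55c0, which matches the printed function address, so the lookup itself appears to be correct.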

Now to test that the overall method is sound, I switched the function to look like:

static int install_hooks() {

    // Original function addresses.
    // size_t relative_function_address = 0x00000000001085c0; // rb_hash_key_str
    // size_t relative_function_address = 0x0000000000101ad0; // rb_hash_aset
    size_t relative_function_address = 0x000000000011c4b0;  // rb_p

    // Compute the runtime address of the function from the library's base address.
    struct link_map *lm;
    size_t new_addr = 0;

    for (lm = _r_debug.r_map; lm != NULL; lm = lm->l_next) {
        if (strstr(lm->l_name, "libruby.so") != NULL) {
            new_addr = lm->l_addr + relative_function_address;
            printf("base address: %lx\n", lm->l_addr);
            printf("function address: %lx\n", new_addr);
        }
    }
    if (new_addr == 0) {
        return -1; // libruby.so was not found among the loaded modules.
    }

    // Ruby test string.
    VALUE mystr = rb_sprintf("blah");
    rb_p(mystr);
    rb_p(mystr);
    rb_p(mystr);

    // Create reference to original function.
    typedef VALUE func(VALUE);
    func* rb_hash_key_str_orig = (func*)new_addr;

    funchook_t *funchook_reference = funchook_create();

    int prepareResult = funchook_prepare(funchook_reference, (void **)&rb_hash_key_str_orig, rb_hash_key_str_hook);
    int installResult = funchook_install(funchook_reference, 0);

    printf("Hooking results...\n");
    printf("Prepare: %d\n", prepareResult);
    printf("Install: %d\n", installResult);

    rb_p(mystr);
    rb_p(mystr);
    rb_p(mystr);

    fflush(stdout);

    return 0;
}

This uses the same technique to find ruby's p method and replace it with the same hook. I then use p to print "blah" 3 times, patch ruby's p, and then do it again.

This time the results are as expected.

base address: 7feb39071000
function address: 7feb3918d4b0
"blah"
"blah"
"blah"
Hooking results...
Prepare: 0
Install: 0
asdf2
asdf2
asdf2

After the patch, ruby's p prints "asdf2\n" instead of "blah".