facebook / fishhook

A library that enables dynamically rebinding symbols in Mach-O binaries running on iOS.
BSD 3-Clause "New" or "Revised" License

fishhook conflict with address sanitizer #47

Open arronzhujf opened 6 years ago

arronzhujf commented 6 years ago

I hook socket-related C functions in my project, such as getaddrinfo, connect, and socket. When I enable Address Sanitizer for debugging, I get an error like this: [image] It seems that both Address Sanitizer and fishhook hook getaddrinfo(), which results in a conflict.

The following WWDC video says that the sanitizers hook the standard C library. Reference: https://developer.apple.com/videos/play/wwdc2015/413/

dlow-yahoo-inc commented 6 years ago

I ran into the same problem. After some poking around, I think I understand what's causing the infinite recursion:

When building an app with "Thread Sanitizer" or "Address Sanitizer" enabled, the toolchain links in a special DLL named libclang_rt.tsan_iossim_dynamic.dylib (tsan -> Thread Sanitizer) or libclang_rt.asan_iossim_dynamic.dylib (asan -> Address Sanitizer). These libraries contain wrapper functions corresponding to system APIs (e.g. wrap_getaddrinfo() for getaddrinfo()).

These special DLLs are "interposed" by the linker on top of the system libraries before the app's symbols are resolved. The pseudocode for wrap_getaddrinfo() looks like:

int wrap_getaddrinfo(const char *node, const char *service,
                       const struct addrinfo *hints,
                       struct addrinfo **res)
{
  // Some instrumentation

  // Forward to original function, the linker is smart enough to bind this symbol to the next DLL
  return getaddrinfo(node, service, hints, res);
}

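When fishhook does its rebinding, it effectively performs a massive search and replace of every reference to getaddrinfo() with the hook (MAM_getaddrinfo() in my case), including the reference inside the sanitizer DLL. The hook then forwards to what it saved as the original, which is the sanitizer wrapper, and the wrapper's own call to getaddrinfo() now lands back in the hook. Hence the infinite recursion.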

Whatever the solution, it will involve breaking this cycle.
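The cycle described above can be reproduced in a small, portable model. This is only a sketch under stated assumptions: the slot variables, function names, and the depth guard are hypothetical stand-ins for the per-image symbol pointers that fishhook rewrites, and the real order in which images are rebound may differ.

```c
#include <stdio.h>

/* Hypothetical model: every "image" reaches getaddrinfo through its own
 * pointer slot, and rebinding means overwriting that slot. */
typedef int (*lookup_fn)(int depth);

enum { MAX_DEPTH = 8, RESULT_OK = 0, RESULT_CYCLE = -1 };

static lookup_fn app_slot;       /* the app's reference to getaddrinfo */
static lookup_fn sanitizer_slot; /* the sanitizer DLL's reference      */
static lookup_fn hook_orig;      /* "original" saved for the hook      */

/* Stand-in for the real system function. */
static int real_lookup(int depth) { (void)depth; return RESULT_OK; }

/* Stand-in for wrap_getaddrinfo(): instrument, then forward via its slot. */
static int wrap_lookup(int depth) {
    if (depth > MAX_DEPTH) return RESULT_CYCLE; /* guard instead of crashing */
    return sanitizer_slot(depth + 1);
}

/* Stand-in for the app's hook: forwards to the saved original. */
static int hook_lookup(int depth) {
    if (depth > MAX_DEPTH) return RESULT_CYCLE;
    return hook_orig(depth + 1);
}

/* Returns RESULT_CYCLE when the call chain loops, RESULT_OK otherwise. */
int simulate(int rebind_sanitizer_image) {
    app_slot = wrap_lookup;       /* interposing: app calls the wrapper   */
    sanitizer_slot = real_lookup; /* wrapper forwards to the real function */

    hook_orig = app_slot;         /* rebinding the app image */
    app_slot = hook_lookup;

    if (rebind_sanitizer_image) {
        sanitizer_slot = hook_lookup; /* rebinding the sanitizer image too */
    }
    return app_slot(0);
}
```

In this model, simulate(1) loops between the hook and the wrapper until the guard trips, while simulate(0) reaches the real function. That matches the idea that the fix has to leave the sanitizer image's references alone.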

dlow-yahoo-inc commented 6 years ago

A dirty hack that seems to work is:

static void _rebind_symbols_for_image(const struct mach_header *header,
                                      intptr_t slide) {
    uint32_t c = _dyld_image_count();
    for (uint32_t i = 0; i < c; i++) {
        // HACK: Get file name of the mach header
        if (_dyld_get_image_header(i) == header) {
            const char *path = _dyld_get_image_name(i);
            const char *base = basename((char *)path);

            // Only rebind libraries that are not the special generated sanitizer ones
            if (strcmp(base, "libclang_rt.tsan_iossim_dynamic.dylib") &&
                strcmp(base, "libclang_rt.asan_iossim_dynamic.dylib"))
            {
                _rebind_symbols_for_image(_dyld_get_image_header(i), _dyld_get_image_vmaddr_slide(i));
            }
        }
    }
}
int rebind_symbols(struct rebinding rebindings[], size_t rebindings_nel) {
  int retval = prepend_rebindings(&_rebindings_head, rebindings, rebindings_nel);
  if (retval < 0) {
    return retval;
  }
  // If this was the first call, register callback for image additions (which is also invoked for
  // existing images); otherwise, just run on existing images
  if (!_rebindings_head->next) {
    _dyld_register_func_for_add_image(_rebind_symbols_for_image);
  } else {
    uint32_t c = _dyld_image_count();
    for (uint32_t i = 0; i < c; i++) {
      const char *path = _dyld_get_image_name(i);
      const char *base = basename((char *)path);

      // Only rebind libraries that are not the special generated sanitizer ones
      if (strcmp(base, "libclang_rt.tsan_iossim_dynamic.dylib") &&
          strcmp(base, "libclang_rt.asan_iossim_dynamic.dylib"))
      {
          _rebind_symbols_for_image(_dyld_get_image_header(i), _dyld_get_image_vmaddr_slide(i));
      }
    }
  }
  return retval;
}

A better fix would be to better identify these special compiler generated DLLs instead of hardcoding the file names.
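One way to avoid listing each file name is to match on the naming pattern instead. The helper below is a minimal sketch, not part of fishhook: is_sanitizer_runtime() is a hypothetical name, and it assumes the sanitizer runtimes keep the libclang_rt.asan_... / libclang_rt.tsan_... naming seen above. It is still name-based, so it only loosens the hardcoding rather than truly identifying the images.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: returns true when `path` looks like an Address
 * Sanitizer or Thread Sanitizer runtime, judged only by its file name.
 * Assumes the "libclang_rt.<sanitizer>_<platform>..." naming convention. */
bool is_sanitizer_runtime(const char *path) {
    if (path == NULL) {
        return false;
    }

    /* Take the last path component without modifying the input
     * (unlike basename(), which may write to its argument). */
    const char *slash = strrchr(path, '/');
    const char *base = slash ? slash + 1 : path;

    static const char prefix[] = "libclang_rt.";
    const size_t prefix_len = sizeof(prefix) - 1;
    if (strncmp(base, prefix, prefix_len) != 0) {
        return false;
    }

    /* Accept any platform suffix (iossim, ios, osx, ...). */
    const char *name = base + prefix_len;
    return strncmp(name, "asan_", 5) == 0 ||
           strncmp(name, "tsan_", 5) == 0;
}
```

In the hack above, the two strcmp() checks could then be replaced by a single !is_sanitizer_runtime(path) test on the string returned by _dyld_get_image_name(i), which would also cover device and macOS variants of the runtime.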

tirodkar commented 3 years ago

@grp Is there any plan for adding any of the suggested fixes? It would be incredibly advantageous for users trying to debug with sanitizers and fishhook.

tirodkar commented 3 years ago

@dlow-yahoo-inc's hack above causes a compilation issue with basename. Do you have a branch?

LeoNatan commented 3 years ago

@tirodkar For me, the following change worked:

Add

#include <libgen.h>

at the top of the file, then make the following change:

static void _rebind_symbols_for_image(const struct mach_header *header, intptr_t slide) {
    uint32_t c = _dyld_image_count();
    for (uint32_t i = 0; i < c; i++) {
        // HACK: Get file name of the mach header
        if (_dyld_get_image_header(i) == header) {
            const char *path = _dyld_get_image_name(i);
            const char *base = basename((char *)path);

            // Only rebind libraries that are not the special generated sanitizer ones
            if (strcmp(base, "libclang_rt.tsan_iossim_dynamic.dylib") &&
                strcmp(base, "libclang_rt.asan_iossim_dynamic.dylib"))
            {
                rebind_symbols_for_image(_rebindings_head, _dyld_get_image_header(i), _dyld_get_image_vmaddr_slide(i));
            }

            return;
        }
    }
}

(There was a bug in @dlow-yahoo-inc's code above: _rebind_symbols_for_image() called itself instead of rebind_symbols_for_image(). It is easy to fix.)