I recently discovered how easy it is to interpose (intercept) dynamic library calls on Linux and OS X. It may be a breeze on Windows too; I haven't looked.
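The process goes like this:
1. Find the library call you want to intercept.
2. Look up that function's signature.
3. Write a replacement function that calls the original.
4. Compile the replacement into a shared library.
5. Preload the shared library when running the program.
I will discuss each step in detail.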
In this tutorial, we'll be messing with the output of uname. Here's the normal output on my MacBook:
$ uname -v
Darwin Kernel Version 10.8.0: Tue Jun 7 16:33:36 PDT 2011; root:xnu-1504.15.3~1/RELEASE_I386
What library calls does uname make? Let's find out by listing all of the undefined symbols contained in the binary:
$ nm -uj "$(which uname)"
___stack_chk_guard
___stderrp
___stdoutp
_compat_mode
_err
_exit
_fputs
_fwrite
_getenv
_getopt
_optind
_putchar
_setlocale
_strncpy
_uname
dyld_stub_binder
Most of these are probably function calls, but _uname looks like what we're interested in. OS X prefixes C symbol names with a leading underscore, so the function we really want is uname.
Let's check the man page for the uname function:
$ man 3 uname
NAME
uname -- get system identification
LIBRARY
Standard C Library (libc, -lc)
SYNOPSIS
#include <sys/utsname.h>
int
uname(struct utsname *name);
:
:
From this, we now know that the function signature is:
int uname(struct utsname *name);
We'll create a custom uname function which calls the original, then modifies the results (interpose_uname.c):
#define _GNU_SOURCE       // expose RTLD_NEXT in glibc's <dlfcn.h> [Linux]
#include <sys/utsname.h>  // struct utsname
#include <dlfcn.h>        // dlsym(), dlopen() [OS X]
#include <stdio.h>        // fprintf()
#include <stdlib.h>       // exit()
#include <string.h>       // strncpy()

int uname(struct utsname *name)
{
    typedef int (*func_t)(struct utsname *);
    static func_t original = NULL;

    if(original == NULL)
    {
#ifdef __APPLE__
        /**
        ** On OS X, the original library is loaded explicitly and the function is
        ** queried from within that library. This technique does not work on Linux; it
        ** results in infinite recursion.
        **/
        // grab handle to the original library
        void *handle = dlopen("libc.dylib", RTLD_NOW);
        // find the original function within that library
        original = (func_t)dlsym(handle, __func__);
#else
        /**
        ** Retrieving a pointer to the original function is even easier in Linux. It
        ** doesn't even require the original library name. Calling dlsym() with the
        ** flag "RTLD_NEXT" returns the *next* occurrence of the specified name, which
        ** is the original library call. This does not work on OS X; it fails to find
        ** the function.
        **/
        // find the original function
        original = (func_t)dlsym(RTLD_NEXT, __func__);
#endif
        if(original == NULL)
        {
            fprintf(stderr, "ERROR: Failed to locate original %s() function; exiting\n", __func__);
            exit(1);
        }
    }

    // finally call the original uname()
    int result = original(name);

    // if successful, change some of the results
    if(result == 0)
    {
        // <sys/utsname.h> defines 'struct utsname' as follows:
        //
        // #define _SYS_NAMELEN 256
        //
        // struct utsname {
        //     char sysname[_SYS_NAMELEN];  /* [XSI] Name of OS */
        //     char nodename[_SYS_NAMELEN]; /* [XSI] Name of this network node */
        //     char release[_SYS_NAMELEN];  /* [XSI] Release level */
        //     char version[_SYS_NAMELEN];  /* [XSI] Version level */
        //     char machine[_SYS_NAMELEN];  /* [XSI] Hardware type */
        // };
        //
        // Let's change the version to "Johnny 5".
        //
        strncpy(name->version, "Johnny 5", sizeof(name->version));
    }

    // return the original result, unchanged
    return result;
}
Without the comments, the code is fairly short, and most of it is boilerplate.
OS X:
$ gcc -shared -fPIC -Wall -Werror -std=c99 -o libinterpose_uname.dylib interpose_uname.c
Linux:
$ gcc -shared -fPIC -Wall -Werror -std=c99 -o libinterpose_uname.so interpose_uname.c -ldl
Finally, the fruits of our labor. Let's run uname -v again, this time with our custom library preloaded.
OS X:
$ DYLD_FORCE_FLAT_NAMESPACE=1 DYLD_INSERT_LIBRARIES=libinterpose_uname.dylib uname -v
Johnny 5
Linux:
$ LD_PRELOAD=./libinterpose_uname.so uname -v
Johnny 5
In Step 3 I mentioned that the interposing code is mostly boilerplate. With that in mind, I wrote a small utility to automatically generate the interposing code from a header. The generated code is C++11 instead of C, so we can have some more fun with it. Let's duplicate the above example with this new utility:
$ git clone git://github.com/themattrix/interpose.git
$ cd interpose/src
$ export DEST=. # Output generated content to this directory
$ export HEADER=/usr/include/sys/utsname.h # Generate code from this header
$ export API_LIB=/usr/lib/libc.dylib # OS X only: library containing original uname()
$ make interpose-src # Generate code
$ make interpose-lib # Compile code
$ make do-interpose APP='uname -v' # Interpose
=================================================[ Running interposing code ]===
[1338738075.588706][done][0.000056] uname()
Darwin Kernel Version 10.8.0: Tue Jun 7 16:33:36 PDT 2011; root:xnu-1504.15.3~1/RELEASE_I386
By default, the generated code (interpose_usr_utsname.cpp) timestamps the function calls:
template<typename Function>
auto uname(Function original, struct utsname *arg1) -> int
{
    return timestamp(original(arg1));
}
Instead, let's match the functionality of our C code:
template<typename Function>
auto uname(Function original, struct utsname *name) -> int
{
    int result(original(name));

    if(result == 0)
    {
        // just so I don't have to say name->version several times
        auto &v(name->version);

        // copy "Johnny 5" into name->version, leaving room for the terminator
        v[std::string("Johnny 5").copy(v, sizeof(v) - 1)] = '\0';
    }

    return result;
}
Let's see how we did:
$ make interpose-lib
$ make do-interpose APP='uname -v'
=================================================[ Running interposing code ]===
Johnny 5
Perfect!