geofft / redhook

Dynamic function call interposition / hooking (LD_PRELOAD) for Rust
BSD 2-Clause "Simplified" License

Support hooking functions with varargs #4

Open caspark opened 8 years ago

caspark commented 8 years ago

I'm trying to hook! printf but the hook! macro doesn't allow it:

src/lib.rs:25:45: 25:48 error: expected ident, found ...
src/lib.rs:25     unsafe fn printf(format: *const c_char, ...) -> c_int => custom_printf {

However, the compiler is perfectly happy with:

extern {
    fn printf(format: *const c_char, ...) -> c_int;
}

so perhaps it's just that hook!'s macro definition needs to be updated to support varargs? But I suppose that real! might need to be updated to support calling varargs functions too.
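For reference, here is a minimal sketch of the part that does work on stable Rust: binding a variadic C function and calling it. It assumes a Unix-like target where the C library's snprintf is linked in by default; `format_pair` is a hypothetical helper name used only for illustration.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::{c_char, c_int};

// Binding to the C library's variadic snprintf. Declaring `...` in an extern
// block is allowed on stable Rust; defining such a function is the hard part.
// (On the 2024 edition this block is spelled `unsafe extern "C"`.)
extern "C" {
    fn snprintf(buf: *mut c_char, size: usize, format: *const c_char, ...) -> c_int;
}

/// Hypothetical helper: formats "<name>=<value>" through the C snprintf.
fn format_pair(name: &str, value: i32) -> String {
    let name = CString::new(name).expect("name must not contain NUL");
    let format = CString::new("%s=%d").unwrap();
    let mut buf = [0 as c_char; 64];
    unsafe {
        // Variadic arguments must already have C types: a pointer and an int.
        snprintf(
            buf.as_mut_ptr(),
            buf.len(),
            format.as_ptr(),
            name.as_ptr(),
            value as c_int,
        );
        // snprintf always NUL-terminates when size > 0, so this read is bounded.
        CStr::from_ptr(buf.as_ptr()).to_string_lossy().into_owned()
    }
}
```

This only covers the calling direction. It says nothing about whether hook! can accept `...` in a definition, which is the open question here.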

caspark commented 8 years ago

(Oops, first cut of my first comment had incorrect extern fn - corrected now.)

geofft commented 8 years ago

Interesting. I poked a bit at this yesterday and the main issue is that there's no way in Rust to define an extern "C" fn that uses varargs. (You can bind a varargs C-ABI function that someone else wrote, but you can't write a varargs C-ABI function in Rust.) See e.g. this Reddit thread.

So even if the macro were fixed to support varargs, you can't actually write something like

unsafe fn printf(format: *const c_char, ...) -> c_int => my_printf {
    let cstring = CString::new(... something with format ...);
    real!(printf)(format, ...)
}

because there's no way to work with the variadic arguments or to pass them on to the actual printf function.

If you don't care about the arguments, on most architectures (I believe this is true of all architectures Rust supports, and it might even be required by the C standard) the calling convention for varargs and fixed-args functions is the same for all pre-varargs arguments. So you could do something like

unsafe fn printf(format: *const c_char) -> c_int => my_printf {
    println!("Would call printf with format string {}", CStr::from_ptr(format).to_string_lossy());
    0 // placeholder return value: the real printf was never called
}

but that doesn't seem terribly useful without the other arguments....

Unfortunately there doesn't seem to be a better answer. I'll leave this open in case Rust ever gets the ability to emit varargs C-ABI functions, but I suspect that will never be part of the core language. Perhaps there will be a third-party crate to do hacky things with pointers (no worse than what <stdarg.h> does, I guess), in which case I could use that.

caspark commented 8 years ago

Yeah, I also looked into this a bit more last night; I played around with introducing hook_var! to redhook, which is identical save for that it has , ... as in the extern definition; as you pointed out, not being able to receive the va_list is a bit of a blocker.

The one part that might make this easier is that I don't actually need access to the varargs; I just want to pass them on to the real function using real!(). It seems that they are sometimes passed through when using real!(printf)(format), but unfortunately only sometimes, and I'm guessing "exactly when" is largely determined by some implementation detail of stdarg that I'm unexpectedly violating.

I did find https://github.com/thepowersgang/va_list-rs/ (haven't tried it yet) but it has an open issue about its safety which I don't quite understand (thepowersgang/va_list-rs#3) and I'm not quite sure whether it'd work for printf (as opposed to vprintf).

geofft commented 8 years ago

> It seems that they are sometimes passed through when using real!(printf)(format), but unfortunately only sometimes, and I'm guessing "exactly when" is largely determined by some implementation detail of stdarg that I'm unexpectedly violating.

Yes, what's going on is that the first few arguments are in registers, so as long as your hook function happens not to do anything complicated, the registers won't get overwritten. Unfortunately the next few arguments are on the stack, so those just won't get preserved at all.

(I believe that for all Rust architectures, the calling convention for varargs is the same as that of regular functions, with the notable exception of Darwin on AArch64, i.e. 64-bit iPhones and iPads, where varargs are always passed on the stack, even if there are registers available. On basically all UNIXes on amd64, the first six arguments go in registers, and the rest on the stack.)

Actually, now that I think about it, you can't implement C varargs (in either C or Rust) without support from the compiler. The macros in <stdarg.h> require compiler support to figure out if there were arguments passed in registers, so they can be grabbed and copied into the structure before anything else uses that register. thepowersgang/va_list-rs appears to just support working with the va_list structure, which is how you turn varargs in C into a usable data structure, but it requires a C helper to generate that va_list in the first place.

It's worth noting that in C, the way you'd actually hook printf is something like

// cc -fPIC -shared -o preload.so preload.c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int printf(const char *format, ...) {
        va_list args;
        /* 4 bytes for "test", plus the format, plus the terminating NUL */
        char *myformat = malloc(4 + strlen(format) + 1);
        if (myformat == NULL)
                return -1;
        strcpy(myformat, "test");
        strcat(myformat, format);
        va_start(args, format);
        int ret = vprintf(myformat, args);
        va_end(args);
        free(myformat);
        return ret;
}

that is, you don't actually chain to printf (there isn't even C syntax for that, to my knowledge), you turn it into a va_list structure and pass that to a different function, vprintf, that accepts a va_list. This is the same sort of thing that the C wrapper in va_list-rs does.

(Most reasonable C APIs that take varargs also have a v-version of the function, precisely so that you can write wrappers and similar things. Unfortunately that doesn't directly help us write a preload/intercept library, because the dynamic call is to printf, not vprintf.)

Anyway, I assume that this is out of curiosity; if you have a real-world application where you need to hook printf, write it in C, possibly calling into a static Rust library for the bulk of its work. :(
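A rough sketch of what the Rust half of such a split might look like, assuming the C shim keeps the va_start/vprintf/va_end part and only asks Rust to rewrite the format string. The names `prefix_format` and `free_format` are hypothetical, and the build setup (staticlib crate type, linking) is not shown.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Hypothetical helper for a C shim: returns a newly allocated copy of
/// `format` with "test" prepended, or NULL if `format` is NULL.
/// To export it from a staticlib, add #[no_mangle]
/// (written #[unsafe(no_mangle)] on the 2024 edition).
pub unsafe extern "C" fn prefix_format(format: *const c_char) -> *mut c_char {
    if format.is_null() {
        return std::ptr::null_mut();
    }
    let original = CStr::from_ptr(format).to_bytes();
    let mut bytes = Vec::with_capacity(4 + original.len());
    bytes.extend_from_slice(b"test");
    bytes.extend_from_slice(original);
    // `original` came from a C string, so it has no interior NUL bytes.
    match CString::new(bytes) {
        Ok(s) => s.into_raw(),
        Err(_) => std::ptr::null_mut(),
    }
}

/// Releases a string returned by `prefix_format`.
/// The C side must call this instead of free().
pub unsafe extern "C" fn free_format(ptr: *mut c_char) {
    if !ptr.is_null() {
        drop(CString::from_raw(ptr));
    }
}
```

On the C side, the hooked printf would call `prefix_format(format)`, pass the result to vprintf between va_start and va_end, and then hand the pointer back to `free_format`. The variadic handling itself stays in C.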

caspark commented 8 years ago

Thanks for the detailed response; writing in C and calling into Rust is unfortunate but still a lot better than trying to do string manipulation in C! I better go play with va_list to see how the "compile this C code and smush it together with this Rust code" stuff works.

gaul commented 4 years ago

Revisiting this in 2020, it seems that Rust has more support for varargs. Specifically I want to hook functions like open whose third argument is variadic. With the following program:

#![feature(c_variadic)]
hook! {
    unsafe extern "C" fn f(n: usize, mut args: ...) {
    }
}

I get the following error:

error: no rules expected the token `extern`
  --> src/lib.rs:3:12
   |
3  |     unsafe extern "C" fn f(n: usize, mut args: ...) {
   |            ^^^^^^ no rules expected this token in macro call

Any suggestions on how redhook can support extern functions?

gaul commented 4 years ago

Huh, in my case I want to override open, and it appears that overriding the three-argument version on x86-64 correctly passes the variadic argument:

unsafe fn open(pathname: *const c_char, flags: c_int, mode: mode_t) -> c_int => my_open

Obviously this is unsafe and relies on quirks of the calling convention.
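One caveat with the three-argument form: open only consults mode when O_CREAT (or, on Linux, O_TMPFILE) is set, and callers usually pass just two arguments otherwise. A hook written this way should probably treat mode as garbage unless one of those flags is present. A small sketch of that check follows; the constant values are assumptions for Linux on x86-64, and `mode_is_meaningful` is a hypothetical name.

```rust
use std::os::raw::c_int;

// Assumed values for Linux on x86-64. Other targets differ, and real code
// would take these from the `libc` crate instead of hard-coding them.
const O_CREAT: c_int = 0o100;
const O_TMPFILE: c_int = 0o20200000; // includes the O_DIRECTORY bit

/// Hypothetical helper: true when open() actually reads its third argument.
/// When this returns false, `mode` in a three-argument hook holds whatever
/// happened to be in the third argument register.
fn mode_is_meaningful(flags: c_int) -> bool {
    (flags & O_CREAT) != 0 || (flags & O_TMPFILE) == O_TMPFILE
}
```

Inside the hook, one option is to forward `mode` unchanged to the real open either way (it is ignored when not needed) but to log or inspect it only when `mode_is_meaningful(flags)` is true.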

anholt commented 3 years ago

Even if we can't generally do varargs, it could be useful to get some documentation of how to handle some common libc functions like this -- I found I had to do this for open and fcntl for a shim I was writing.

thedracle commented 5 months ago

Has anyone figured out any workaround or ways to deal with varargs?