reverie-rs / reverie

trace and intercept linux syscalls.

[RFC] Ptrace Based RPC #46

Open gatoWololo opened 5 years ago

gatoWololo commented 5 years ago

These are my comments on the RFC: ptrace-based RPC.

While I agree this may be useful, after reading the entire RFC it seems to carry a high complexity and implementation cost. With that in mind, do we really need it? How useful would RPC functionality actually be?

Some more specific comments:

we also need to pass arguments to the very function. The most straight forward way is to push the arguments passed from the tracer onto tracee's stack

Doesn't the function call ABI usually want the arguments passed through registers? See the AMD64 System V ABI.

If we restrict the number of arguments we might get away with avoiding passing through the stack. We could try avoiding pointers? It really depends what our use-case is...
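For reference, the AMD64 System V calling convention passes the first six integer or pointer arguments in rdi, rsi, rdx, rcx, r8 and r9. Below is a minimal sketch of how a tracer could place up to six arguments in registers without touching the stack. `Regs` and `set_call_args` are hypothetical names, not part of the RFC; `Regs` stands in for whatever register struct the tracer reads and writes through ptrace.

```rust
/// Hypothetical stand-in for the tracee's general-purpose registers.
#[derive(Debug, Default, Clone, PartialEq)]
struct Regs {
    rdi: u64,
    rsi: u64,
    rdx: u64,
    rcx: u64,
    r8: u64,
    r9: u64,
    rip: u64,
}

/// Load up to six integer arguments into the System V argument registers
/// and point `rip` at the target function. Returns `None` when there are
/// more than six arguments, since the rest would have to go on the stack.
fn set_call_args(regs: &Regs, func: u64, args: &[u64]) -> Option<Regs> {
    if args.len() > 6 {
        return None;
    }
    // Start from the current register values so unused slots stay unchanged.
    let mut slots = [regs.rdi, regs.rsi, regs.rdx, regs.rcx, regs.r8, regs.r9];
    slots[..args.len()].copy_from_slice(args);
    let [rdi, rsi, rdx, rcx, r8, r9] = slots;
    Some(Regs { rdi, rsi, rdx, rcx, r8, r9, rip: func })
}
```

A tracer would then write the returned registers back to the tracee before resuming it. Saving and restoring the original registers, and choosing a return address, are left out of this sketch.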

because the tracer can run multiple threads even in the same process group

Threads definitely make the page idea difficult to use...

wangbj commented 5 years ago

Doesn't the function call ABI usually want the arguments passed through the registers?

True, but we pass the arguments from the tracer while executing the function in the tracee, so the only way is to push all the register arguments onto the tracee's stack. If there are more than six arguments, the extra ones need to be passed on the stack as well.

    // Look up `hello` in the tracee, then call it with six integer arguments.
    if let Some(hello) = get_symbol_address(task.getpid(), "hello") {
        unsafe {
            let args: [u64; 6] = [1, 2, 3, 4, 5, 6];
            rpc_call(&task, hello.as_ptr() as u64, &args);
        }
    };

I've pushed a half-baked solution to a branch:

https://github.com/iu-parfunc/systrace/blob/global-process-states/src/rpc_ptrace.rs
https://github.com/iu-parfunc/systrace/blob/101931d69c245c62a1dee9ce4584a7f5788555db/trampoline/remote_call.S#L62

I've managed to inject a function call (taking 6 arguments) without segfaulting the tracee.
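To illustrate the stack-based approach, here is a rough sketch of how a tracer might plan the stack writes before redirecting the tracee. It is not the code in the branch above. `plan_stack_args` is a hypothetical helper, and the layout (first argument at the lowest address, to be picked up by a trampoline) is an assumption. It skips the 128-byte System V red zone, since the tracee may be stopped at an arbitrary point, and keeps the new stack pointer 16-byte aligned.

```rust
/// Size of the System V AMD64 red zone below `rsp`.
const RED_ZONE: u64 = 128;

/// Plan the stack writes for a remote call. Returns the new stack pointer
/// and the words to store starting at that address (lowest address first).
/// Hypothetical helper: the tracer would still have to write these words
/// into the tracee (for example with PTRACE_POKEDATA) and update `rsp`.
fn plan_stack_args(rsp: u64, args: &[u64]) -> (u64, Vec<u64>) {
    let bytes = 8 * args.len() as u64;
    // Skip the red zone, reserve room for the arguments, then round down
    // so the new stack pointer is 16-byte aligned.
    let new_rsp = (rsp - RED_ZONE - bytes) & !0xf;
    (new_rsp, args.to_vec())
}
```

The planned words never overlap the red zone of the interrupted code, so the original frame can be restored by resetting `rsp` once the call returns.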

rrnewton commented 5 years ago

@wangbj, eventually Global-only Reverie will work (#53), and you will start shifting things back down to in-guest execution. When that happens, do you think there will be a place for this ptrace-based calling from tracer->guest? Or will it be subsumed by some other RPC framework?

It seems like there will be particular moments in time where control goes from tracer->guest. I'm thinking for instance of the point just after the "early" syscalls, when the tracer will probably migrate process/thread states to the guest...

wangbj commented 5 years ago

Yes, even now we can call tracee functions from the tracer, so we can do in_guest_mmap when mmap is trapped by seccomp (in the tracer). Maintaining sound global/process/thread states, and serializing/deserializing them efficiently, still remains a challenge at this point.

rrnewton commented 5 years ago

What are the challenges exactly? Doing RPC from inside the guest definitely seems like a big challenge (the same one Chai is facing).

For serialization can't we just require Serde instances?

Maintaining a coherent view of global/process/thread states should be easy in the global-only version (i.e. a big hashmap of process/thread states), right? But, yes, I can see that it gets challenging when the state needs to migrate. You need to identify an atomic commit point at which a process state (and its thread states) migrates from tracer->guest. I guess the loader-shim code that you run via LD_PRELOAD could itself create a "Done" event that the tracer recognizes as the trigger for migration.

The state is unavailable to new event callbacks while this migration is occurring. If the tool sitting on Reverie is executing the process sequentially, it shouldn't be too much of an imposition because blocking on one event handler is effectively blocking the whole process. However, Reverie ultimately can't assume sequential execution ... nevertheless, if the migration happens right after the tool is dynamically loaded, then that should be before main() anyway...
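A minimal sketch of that idea, assuming a tracer-side table keyed by pid: each entry is either owned by the tracer, mid-migration, or already handed to the guest, and event handlers only get the state in the first case. All names here (`GlobalState`, `Slot`, `begin_migration`, and so on) are hypothetical and not Reverie's API.

```rust
use std::collections::HashMap;

type Pid = i32;
type Tid = i32;

#[derive(Debug, Default, Clone, PartialEq)]
struct ThreadState {
    syscalls_seen: u64,
}

#[derive(Debug, Default, Clone, PartialEq)]
struct ProcessState {
    threads: HashMap<Tid, ThreadState>,
}

/// Where a process's state currently lives.
#[derive(Debug, PartialEq)]
enum Slot {
    InTracer(ProcessState),
    Migrating,
    InGuest,
}

/// The "big hashmap" of the global-only design.
#[derive(Default)]
struct GlobalState {
    procs: HashMap<Pid, Slot>,
}

impl GlobalState {
    /// Event handlers see the state only while the tracer owns it.
    fn state_mut(&mut self, pid: Pid) -> Option<&mut ProcessState> {
        match self.procs.get_mut(&pid) {
            Some(Slot::InTracer(state)) => Some(state),
            _ => None,
        }
    }

    /// Commit point, e.g. on the shim's "Done" event: take the state out
    /// of the table so it can be serialized and sent to the guest.
    fn begin_migration(&mut self, pid: Pid) -> Option<ProcessState> {
        let slot = self.procs.get_mut(&pid)?;
        match std::mem::replace(slot, Slot::Migrating) {
            Slot::InTracer(state) => Some(state),
            other => {
                *slot = other; // not owned by the tracer; leave it as it was
                None
            }
        }
    }

    /// Mark the hand-off complete once the guest has the state.
    fn finish_migration(&mut self, pid: Pid) {
        if let Some(slot) = self.procs.get_mut(&pid) {
            if matches!(*slot, Slot::Migrating) {
                *slot = Slot::InGuest;
            }
        }
    }
}
```

Because `begin_migration` moves the state out in a single step, a handler that runs during the migration gets `None` from `state_mut` rather than a stale copy. Whether such a handler should block, queue, or fail is a design choice this sketch leaves open.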