reverie-rs / reverie

trace and intercept linux syscalls.

Disconnect ptrace tracer after initial setup #7

Open wangbj opened 5 years ago

wangbj commented 5 years ago

ptrace is slow and very stateful. Monitoring tracee data from the tracer is both slow and inconvenient, because the tracer and tracee live in different address spaces. With seccomp-bpf, it is possible to trigger SIGSYS instead of a ptrace event for the syscalls we are interested in. The ideal use case would be:

rrnewton commented 5 years ago

Since this touches on the signals topic, we should have a discussion somewhere of how/whether ptrace can be avoided for signals.

This in-process systrace mechanism is sufficient to capture SYS_rt_sigaction attempts to register signal handlers, right? But we would need an instrumentation framework to also register real signal handlers for all the signals coming from the OS / outside world, and convert those into events enqueued somewhere inside the virtualized (deterministic) scheduler. Do you foresee any blockers to that?

rrnewton commented 5 years ago

Let me clarify that there's probably no performance reason to not just use ptrace for signal conversion. The only reason I dream of a completely ptrace-free version is so that it could cleanly run with other things that use ptrace (like inside gdb, or with CRIU...).

gatoWololo commented 5 years ago

The general idea sounds good. One concern I have is that signals and signal handlers are really tricky, especially with ptrace in the mix. Running our own code inside a signal handler might lead to some issues... For example, what happens if another signal interrupts our handler in the middle of instrumenting? We basically have to deal with all the gross details of signal handlers.

@rrnewton I think we definitely need to spend some time understanding signals. @devietti and I looked at signals when trying to implement deterministic signals for DetTrace, and even then I don't think I understand their super confusing semantics. The ptrace man page has sections on "Signal-delivery-stop" and "Signal injection and suppression" if you want to peek into the rabbit hole.

If the only reason we're using ptrace is to inject our own setup code, would we have more luck with the following approach? 1) Force the loader to load our shared object when the program starts, something like this. 2) Run code when a shared object is loaded, using something like this. (Although I remember we had issues with the always-running initialization code for libdet.so as part of detflow...)
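The links in the comment above did not survive, so the following is only a guess at the kind of mechanism meant: a shared object forced into the process with `LD_PRELOAD`, whose setup code runs from a constructor function. It is a minimal sketch; `preload_setup` and `preload_setup_done` are hypothetical names, and `__attribute__((constructor))` is a GCC/Clang extension.

```c
/*
 * Hypothetical flag recording that setup ran. A real tool would install
 * its syscall-interception machinery here instead.
 */
int preload_setup_done = 0;

/*
 * Runs when the object containing it is loaded, before main().
 * GCC/Clang extension, not standard C.
 */
__attribute__((constructor))
void preload_setup(void)
{
    preload_setup_done = 1;
}
```

Assuming the file is saved as `setup.c`, it could be built with `gcc -shared -fPIC -o libsetup.so setup.c` and injected with `LD_PRELOAD=./libsetup.so ./program`. The constructor only runs once the dynamic loader has mapped the object, so any syscalls the loader makes before that point are not seen by it.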

rrnewton commented 5 years ago

@gatoWololo - the LD_BIND_NOW trick doesn't address catching the very first syscalls executed during dynamic loading, does it? (Is that the "80" syscalls referred to by the RR ATC17 paper, or fewer than that?) I don't have any good idea yet of what it would mean to not catch these leading syscalls. Maybe we could make a list of them and go through them.

On signals... I increasingly think that it could be handled above the systrace layer. Systrace provides the building blocks, using captured_syscall to intercept handler-registration, and untraced_syscall to register your own handlers for real signal events.