wangbj commented 5 years ago

Steps to reproduce:

Make sure SYS_rt_sigaction is filtered in src/bpf.c.

Apply the patch below:


--- a/examples/echotool/src/det/dispatch.rs
+++ b/examples/echotool/src/det/dispatch.rs
@@ -2,16 +2,31 @@ use crate::det::ffi::*;
 use crate::io::*;
 use crate::syscall::*;

+use nix::sys::signal::*;
+use libc;
+
 #[no_mangle]
 pub extern "C" fn captured_syscall(
-    _no: i32,
-    _a0: i64,
-    _a1: i64,
-    _a2: i64,
-    _a3: i64,
-    _a4: i64,
-    _a5: i64,
+    mut _no: i32,
+    mut _a0: i64,
+    mut _a1: i64,
+    mut _a2: i64,
+    mut _a3: i64,
+    mut _a4: i64,
+    mut _a5: i64,
 ) -> i64 {
+    if _no == SYS_rt_sigaction as i32 {
+        let signo = Signal::from_c_int(_a0 as libc::c_int);
+        if _a1 != 0 {
+            let sigaction = unsafe { std::slice::from_raw_parts(
+                _a1 as *mut u64, 3)
+            };
+            let mask = unsafe { std::slice::from_raw_parts(
+                (_a1 + 24) as *mut u64, 4)
+            };
+            raw_println!("[echotool] signo: {:?}, {:x?} {:x?}", signo, sigaction, mask);
+        }
+    }
     let ret = untraced_syscall(_no, _a0, _a1, _a2, _a3, _a4, _a5);
     ret
 }

Test in Docker with ubuntu-16.04 (or run make run-docker in the top level of systrace), and compare the result with ubuntu-18.04.

Expected result:

Both systems should show similar results.

Actual result:

The result from ubuntu-16.04 seems wrong: the arguments for SYS_rt_sigaction don't look sane.

wangbj commented 5 years ago

On 18.04:

[echotool] 10495 calling SYS_rt_sigaction
[echotool] signo: Ok(SIGCHLD), [0, 14000000, 7ffff71bbf20, 0] []
[echotool] 10495 calling SYS_rt_sigaction
[echotool] signo: Ok(SIGCHLD), [0, 14000000, 7ffff71bbf20, 0] []

On 16.04:

[echotool] 463 calling SYS_rt_sigaction
[echotool] signo: Ok(SIGCHLD), [0, 7ffff799e4b5, 7ffff71dc4b0, d] [root@876142ebc7fb:/systrace

Segfault due to decoding si.si_mask@0xd (the 4th field of kernel_sigaction).

The 1st field is the signal handler, the 2nd is the flags (0x14000000 = SA_RESTORER | SA_RESTART), 0x7ffff71bbf20 is the restorer, and the last one is the mask, of type sigset_t.

The values from ubuntu 16.04 (glibc-2.23) are obviously trashed, for some unknown reason.
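For reference, the field layout described above can be sketched in Rust as follows. This is only an illustration: `KernelSigaction`, `decode_flags` and `flags_plausible` are hypothetical names, not part of systrace, and the struct assumes the x86_64 kernel layout with an 8-byte sigset_t. The plausibility check is a heuristic (SA_* flags fit in 32 bits), not a real validator.

```rust
use std::mem::size_of;

// Flag values from the Linux x86_64 ABI.
pub const SA_RESTORER: u64 = 0x0400_0000;
pub const SA_RESTART: u64 = 0x1000_0000;

// Assumed kernel-side layout on x86_64 (hypothetical type name).
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct KernelSigaction {
    pub handler: u64,  // offset 0: signal handler (0 = SIG_DFL)
    pub flags: u64,    // offset 8: SA_* flags
    pub restorer: u64, // offset 16: address of the sigreturn stub
    pub mask: u64,     // offset 24: kernel sigset_t (assumed 8 bytes)
}

pub const KERNEL_SIGACTION_SIZE: usize = size_of::<KernelSigaction>();

// Name the two flags discussed in this thread; other bits are ignored.
pub fn decode_flags(flags: u64) -> Vec<&'static str> {
    let mut names = Vec::new();
    if flags & SA_RESTORER != 0 {
        names.push("SA_RESTORER");
    }
    if flags & SA_RESTART != 0 {
        names.push("SA_RESTART");
    }
    names
}

// Heuristic only: a flags word with bits above 32 is more likely a
// pointer (i.e. clobbered memory) than a real SA_* combination.
pub fn flags_plausible(flags: u64) -> bool {
    flags <= u64::from(u32::MAX)
}
```

With the values logged above, `decode_flags(0x14000000)` names both flags, while the 16.04 flags word 0x7ffff799e4b5 fails `flags_plausible`, which matches the observation that it looks like an address rather than a flag set.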

wangbj commented 5 years ago

Actually I started to think this may have something to do with glibc, though I haven't found anything suspicious enough even after checking the changes between glibc-2.23 and glibc-2.27. I wrote a simple test that calls rt_sigaction directly, and it doesn't show this weird behavior:

On 18.04 (glibc-2.27):

$ ./target/debug/systrace tests/signal1
[echotool] 28808 calling SYS_rt_sigaction
[echotool] signo: Ok(SIGALRM), [400740, 14000000, 400760, 0]
[echotool] 28808 calling SYS_rt_sigaction
[echotool] signo: Ok(SIGALRM), []
[echotool] 28808 calling SYS_alarm
[echotool] 28808 calling SYS_fstat
[echotool] 28808 calling SYS_write
[OK] received signal 14
[echotool] 28808 calling SYS_rt_sigreturn

On 16.04 (glibc-2.23):

# ./target/debug/systrace ./tests/signal1
[echotool] 24 calling SYS_rt_sigaction
[echotool] signo: Ok(SIGALRM), [400770, 14000000, 400790, 0]
[echotool] 24 calling SYS_rt_sigaction
[echotool] signo: Ok(SIGALRM), []
[echotool] 24 calling SYS_alarm
[echotool] 24 calling SYS_fstat
[echotool] 24 calling SYS_write
[OK] received signal 14
[echotool] 24 calling SYS_rt_sigreturn

The tests can be found in the master branch, or in commit 954dac1.

wangbj commented 5 years ago

I think I have found the reason. GCC/Clang use -mred-zone on x86_64 as an optimization, especially for leaf functions such as __libc_sigaction in glibc. As a result, such a function allocates fewer bytes of stack than it otherwise would, and uses the 128-byte red zone below rsp instead. Our trampoline doesn't know the red zone could be in use by a leaf function: because we changed syscall => callq xxxx, a function like __libc_sigaction is no longer a leaf function, so the stack allocation done by our trampoline clobbers the red zone, which __libc_sigaction uses for local variables like kact and koact.

We could allocate an extra 128 bytes of stack in our trampoline to avoid clobbering the red zone, or we could switch stacks (rsp), but the latter would be harder to implement because we would need to keep track of a per-thread stack pointer.

Even after we allocate 128 bytes of extra stack space, the callq xxxx instruction we insert still pushes the return address into the red zone. Changing this would require getting rid of call (0xe8) as the trampoline completely, and I'm yet to be convinced this is absolutely necessary.
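To make the effect concrete, here is a toy model of the stack in Rust. It is a simulation, not the actual trampoline: `Stack`, `clobbered_red_zone_bytes` and the number of saved registers are all made up for illustration. It counts how many red-zone bytes get overwritten when the patched call site pushes a return address and the trampoline then pushes registers, with and without first skipping 128 bytes.

```rust
const RED_ZONE: usize = 128;
const WORD: usize = 8;

// Toy downward-growing stack; `rsp` is an index into `mem`.
struct Stack {
    mem: Vec<u8>,
    rsp: usize,
}

impl Stack {
    fn push(&mut self, value: u64) {
        self.rsp -= WORD;
        self.mem[self.rsp..self.rsp + WORD].copy_from_slice(&value.to_le_bytes());
    }
}

// Returns how many bytes of the leaf function's red zone were overwritten.
// `saved_regs` is a hypothetical count of registers the trampoline pushes.
pub fn clobbered_red_zone_bytes(skip_red_zone: bool, saved_regs: usize) -> usize {
    let mut stack = Stack { mem: vec![0; 1024], rsp: 512 };
    let base = stack.rsp;

    // Leaf function: keeps locals below rsp without moving rsp.
    stack.mem[base - RED_ZONE..base].fill(0xAA);

    // Patched syscall site: callq pushes the return address at rsp - 8.
    stack.push(0x1111_1111_1111_1111);

    // Trampoline prologue: optionally step over the red zone first.
    if skip_red_zone {
        stack.rsp -= RED_ZONE;
    }
    for _ in 0..saved_regs {
        stack.push(0x2222_2222_2222_2222);
    }

    stack.mem[base - RED_ZONE..base]
        .iter()
        .filter(|&&b| b != 0xAA)
        .count()
}
```

In this model, pushing six registers without the skip overwrites 56 bytes of the red zone (8 for the return address plus 48 for the registers), while skipping 128 bytes first leaves only the 8 bytes written by callq itself, which is the residual problem mentioned above.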

After all of that, this issue has something to do with #19

reverie-rs / reverie

different result on ubuntu 16.04 for intercepted `sigaction` #20
