rr-debugger / rr

Record and Replay Framework
http://rr-project.org/

Assertion failure after stackoverflow. #1985

Open yuyichao opened 7 years ago

yuyichao commented 7 years ago

The following simple program triggers a stack overflow, assuming a stack limit is set (the default).

void *global;

__attribute__((noinline)) void f(void)
{
    void *buff[1024];
    buff[0] = global;
    global = buff;
    asm volatile ("" :: "r"(buff) : "memory");
    f();
}

int main()
{
    f();
    return 0;
}
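Whether the reproducer overflows quickly depends on the stack limit mentioned above. The following is a minimal sketch, not part of the original report, of how that limit could be checked from C with `getrlimit(RLIMIT_STACK)`. The helper names `stack_limit_bytes` and `max_frames` are made up for illustration, and the frame estimate assumes each call to `f()` consumes at least the size of `buff`.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>

/* Query the soft stack limit of the current process.
 * Returns 1 and stores the limit in *bytes if it is finite,
 * 0 if the stack is unlimited, and -1 if getrlimit() fails. */
int stack_limit_bytes(unsigned long long *bytes)
{
    struct rlimit lim;

    assert(bytes != NULL);
    if (getrlimit(RLIMIT_STACK, &lim) != 0)
        return -1;
    if (lim.rlim_cur == RLIM_INFINITY)
        return 0;
    *bytes = (unsigned long long)lim.rlim_cur;
    return 1;
}

/* Rough upper bound on how many frames of frame_bytes fit under the limit.
 * The real number is smaller, since each frame also holds a return address
 * and the stack is already partly used when recursion starts. */
unsigned long long max_frames(unsigned long long limit_bytes,
                              unsigned long long frame_bytes)
{
    return frame_bytes == 0 ? 0 : limit_bytes / frame_bytes;
}
```

For example, with an 8 MiB limit (a common Linux default, assumed here) and the 8 KiB buffer of 1024 pointers on x86-64, `max_frames(8388608, 8192)` gives 1024, so the recursion should fault within roughly a thousand calls. If `stack_limit_bytes` returns 0 (unlimited stack), the program would run far longer before faulting.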

During replay of this, if I access $_siginfo right after the segfault happens, rr fails with an assertion error.

Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
0x00007f7596254d70 in _start () from /lib64/ld-linux-x86-64.so.2
(rr) c
Continuing.
warning: Could not load shared library symbols for linux-vdso.so.1.
Do you need "set solib-search-path" or "set sysroot"?

Program received signal SIGSEGV, Segmentation fault.
0x00000000004004d5 in f ()
(rr) p $_siginfo
$1 = {si_signo = 11, si_errno = 0, si_code = 1, _sifields = {_pad = {-1090681984, 32765,
      0 <repeats 26 times>}, _kill = {si_pid = -1090681984, si_uid = 32765}, _timer = {
[FATAL /build/rr-git/src/rr/src/Task.cc:2103:read_bytes_helper() errno: EIO]
 (task 15897 (rec:15865) at time 0)
 -> Assertion `false' failed to hold. Should have read 15 bytes from 0x7ffdbefd8371, but only read 0
      si_tid = -1090681984, Launch gdb with
  gdbsi_overrun = 32765, si_sigval = { '-l' '10000' '-ex' 'target extended-remote :15897' /home/yuyichao/.local/share/rr/latest-trace/mmap_clone_3_segfault
sival_int = 0,
        sival_ptr = 0x0}}, _rt = {si_pid = -1090681984, si_uid = 32765, si_sigval = {
        sival_int = 0, sival_ptr = 0x0}}, _sigchld = {si_pid = -1090681984, si_uid = 32765,
      si_status = 0, si_utime = 0, si_stime = 0}, _sigfault = {si_addr = 0x7ffdbefd8380,
      _addr_lsb = 0, _addr_bnd = {_lower = 0x0, _upper = 0x0}}, _sigpoll = {
      si_band = 140727807738752, si_fd = 0}}}

c.c. @Keno

rocallahan commented 7 years ago

This is kinda low priority because accessing $_siginfo already ruins the rr session even when rr doesn't crash.

rocallahan commented 7 years ago

(because of the way we use a request for siginfo to trigger a DiversionSession, and we don't have a good alternative to that)

yuyichao commented 7 years ago

That's useful to know, since I use $_siginfo a lot when I don't want to use the disassembly and register values to figure out the faulting address.

Would it be easier to at least document this and maybe add a custom command to do this?

rocallahan commented 7 years ago

That sounds like a good idea although I don't think people who use $_siginfo will discover it easily. Then again, $_siginfo itself is a very obscure feature.

yuyichao commented 7 years ago

> That sounds like a good idea although I don't think people who use $_siginfo will discover it easily.

I agree it'll be hard to discover if normal use of $_siginfo does not issue a warning/error. (At least I'll use it if it exists ;-p ) Still worth documenting though (unless it is already?). I've probably messed up a dozen replay sessions this way without realizing it.

> Then again, $_siginfo itself is a very obscure feature.

Any alternative? This still seems to be the simplest way to tell the segfaulting address. I don't know why gdb doesn't print the faulting address just like lldb does by default.

rocallahan commented 7 years ago

I can't see a way to get gdb to do something automatically when a SIGSEGV is reported :-(.