paulfloyd closed this 2 years ago
I can't reproduce this for a 32-bit binary running on an amd64 kernel.
Further, this isn't related to issue #122.
Debugging this a bit more, I saw the following:
amd64 does the same for the first four points, but for the last there is no jump.
ktrace provides some interesting information. Running standalone I get (from the thr_kill call onwards):
92188 pth_self_kill CALL thr_kill(0x189a7,SIGTERM)
92188 pth_self_kill RET thr_kill 0
92188 pth_self_kill CALL sigprocmask(SIG_SETMASK,0x2809058c,0xffbfeae8)
92188 pth_self_kill RET sigprocmask 0
92188 pth_self_kill CALL sigaction(SIGTERM,0xffbfeab8,0xffbfeaa0)
92188 pth_self_kill RET sigaction 0
92188 pth_self_kill CALL sigprocmask(SIG_SETMASK,0xffbfeae8,0)
92188 pth_self_kill RET sigprocmask 0
92188 pth_self_kill CALL mmap(0,0x20000,0x3<PROT_READ|PROT_WRITE>,0x1002<MAP_PRIVATE|MAP_ANON>,0xffffffff,0,0)
92188 pth_self_kill RET mmap 673558528/0x2825b000
92188 pth_self_kill CALL exit(0)
92188 pth_self_kill RET nanosleep -1 errno 4 Interrupted system call
92188 pth_self_kill PSIG SIGTERM SIG_DFL code=SI_LWP
For 32on64 I get (after deleting a lot of noise such as sigprocmask and thr_self calls):
5899 none-x86-freebsd CALL thr_kill(0x18dbe,SIGTERM)
5899 none-x86-freebsd RET thr_kill 0
5899 none-x86-freebsd PSIG SIGTERM caught handler=0x380d9d10 mask=0x0 code=SI_LWP
5899 none-x86-freebsd CALL mmap(0x60ef000,0x20000,0x3<PROT_READ|PROT_WRITE>,0x1012<MAP_PRIVATE|MAP_FIXED|MAP_ANON>,0xffffffff,0,0)
5899 none-x86-freebsd RET mmap 101642240/0x60ef000
5899 none-x86-freebsd CALL thr_kill(0x18e18,SIG 128)
5899 none-x86-freebsd RET thr_kill 0
5899 none-x86-freebsd RET nanosleep -1 errno 4 Interrupted system call
5899 none-x86-freebsd CALL thr_self(0x4eacd9c)
5899 none-x86-freebsd PSIG SIG -128 caught handler=0x380d9f00 mask=0x0 code=SI_LWP
5899 none-x86-freebsd CALL thr_exit(0x2)
5899 none-x86-freebsd CALL exit(0x2)
And for pure x86:
92213 none-x86-freebsd CALL thr_kill(0x18710,SIGTERM)
92213 none-x86-freebsd RET thr_kill 0
92213 none-x86-freebsd RET nanosleep -1 errno 4 Interrupted system call
92213 none-x86-freebsd PSIG SIGTERM caught handler=0x38049a50 mask=0x0 code=SI_LWP
92213 none-x86-freebsd PSIG SIGSEGV caught handler=0x3804a440 mask=0xfffef067 code=SEGV_MAPERR
92213 none-x86-freebsd CALL kill(0x16835,SIGSEGV)
92213 none-x86-freebsd RET kill 0
92213 none-x86-freebsd PSIG SIGSEGV SIG_DFL code=SI_USER
To summarize:
Standalone
32on64
Pure x86
There are some similarities with issue #136.
I do see these from ktrace:
8166 none-x86-freebsd RET sigtimedwait -1 errno 35 Resource temporarily unavailable
https://stackoverflow.com/questions/17012206/catching-sigchld-using-sigtimedwait-on-bsd
Here is a quick and dirty attempt based on that, but it doesn't seem to fix anything:
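Int VG_(sigtimedwait_zero)( const vki_sigset_t *set,
                            vki_siginfo_t *info )
{
   /* Original implementation, disabled for this experiment:
   static const struct vki_timespec zero = { 0, 0 };
   SysRes res = VG_(do_syscall3)(__NR_sigtimedwait, (UWord)set, (UWord)info,
                                 (UWord)&zero);
   return sr_isError(res) ? -1 : sr_Res(res);
   */

   /* Experiment: poll for the signal through kqueue/EVFILT_SIGNAL instead. */
   SysRes res = VG_(do_syscall0)(__NR_kqueue);
   if (sr_isError(res))
      return -1;
   int kq = sr_Res(res);
   struct kevent ke;
   struct timespec zero = { 0, 0 };

   /* Caveat: EVFILT_SIGNAL expects a signal number as the ident, whereas
      set->sig[0] is the first word of the signal mask. Also, 'info' is
      never filled in on this path. */
   EV_SET(&ke, set->sig[0], EVFILT_SIGNAL, EV_ADD, 0, 0, NULL);

   /* Register the filter, then poll once with a zero timeout. */
   VG_(do_syscall6)(__NR_kevent, kq, (UWord)&ke, 1, (UWord)NULL, 0, (UWord)NULL);
   res = VG_(do_syscall6)(__NR_kevent, kq, (UWord)NULL, 0, (UWord)&ke, 1, (UWord)&zero);
   VG_(do_syscall1)(__NR_close, kq);
   return sr_isError(res) ? -1 : sr_Res(res);
}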
Also looks good with https://bugs.kde.org/show_bug.cgi?id=445032
On amd64 with --trace-syscalls=yes I see:
But on i386 this is:
On i386 in gdb, if I put a breakpoint on async_signalhandler, then the callstack is:
This is not easy to debug. I don't see problems when running under gdb (or lldb). Also, 32on64 works OK.
My impressions so far are: