paulfloyd / freebsd_valgrind

Git repo used to Upstream the FreeBSD Port of Valgrind

GNU General Public License v2.0

15 stars 4 forks source link

gdbserver_tests/nlpasssigalrm generates unexpected "sigreturn mc_flags", though test passes #137

Closed nbriggs closed 2 years ago

nbriggs commented 4 years ago

I don't recall it printing these "sigreturn mc_flags 3b" messages in the past. I'm not sure what's generating them.

$ perl tests/vg_regtest gdbserver_tests/nlpasssigalrm
nlpasssigalrm:   valgrind   --tool=none --vgdb=yes --vgdb-error=0 --vgdb-prefix=./vgdb-prefix-nlpasssigalrm ./passsigalrm  (progB: ./gdb --quiet -l 60 --nx ./passsigalrm)
pid 32479 (none-x86-freebsd): sigreturn mc_flags 3b
pid 32479 (none-x86-freebsd): sigreturn mc_flags 3b
pid 32479 (none-x86-freebsd): sigreturn mc_flags 3b

== 1 test, 0 stderr failures, 0 stdout failures, 0 stderrB failures, 0 stdoutB failures, 0 post failures ==

paulfloyd commented 4 years ago

This test was failing until July 12th, so that's why you hadn't seen these messages before.

If I run the test by hand then I see in the Valgrind terminal

pid 1722 (none-amd64-freebsd): sigreturn rflags = 0x4
ok: 1st SIGALRM received
pid 1722 (none-amd64-freebsd): sigreturn rflags = 0x0
ok: 2nd SIGALRM received
pid 1722 (none-amd64-freebsd): sigreturn rflags = 0x44
ok: SIGRTMIN received

(not the same flags or values)

After poking around a bit, it seems as if this message is coming from the kernel,

/usr/src/sys/amd64/amd64/machdep.c and /usr/src/sys/i386/i386/machdep.c

For instance

    if (!EFL_SECURE(rflags, regs->tf_rflags)) {
        uprintf("pid %d (%s): sigreturn rflags = 0x%lx\n", p->p_pid,
            td->td_name, rflags);
        return (EINVAL);
    }

Update: I got the above block of code wrong (mixed up rflags and mc_flags)

What is EFL_SECURE ???

define EFL_SECURE(ef, oef) ((((ef) ^ (oef)) & ~PSL_USERCHANGE) == 0)

These are

define PSL_C 0x00000001 / carry bit /

define PSL_PF 0x00000004 / parity bit /

define PSL_AF 0x00000010 / bcd carry bit /

define PSL_Z 0x00000040 / zero bit /

define PSL_N 0x00000080 / negative bit /

define PSL_T 0x00000100 / trace enable bit /

define PSL_I 0x00000200 / interrupt enable bit /

define PSL_D 0x00000400 / string instruction direction bit /

define PSL_V 0x00000800 / overflow bit /

define PSL_IOPL 0x00003000 / i/o privilege level /

define PSL_NT 0x00004000 / nested task bit /

define PSL_RF 0x00010000 / resume flag bit /

define PSL_VM 0x00020000 / virtual 8086 mode bit /

define PSL_AC 0x00040000 / alignment checking /

define PSL_VIF 0x00080000 / virtual interrupt enable /

define PSL_VIP 0x00100000 / virtual interrupt pending /

define PSL_ID 0x00200000 / identification bit /

The bits that aren't in PSL_USERCHANGE are::

interrupt enable 0x200
io priv 0x3000 (two bits)
virt x86 0x20000
virt int en 0x80000
virt int pend 0x100000

that is PSL_USERCHANGE is sensitive to 12 bits Valgrind only has special handling for 9 bits

0 (C carry) 2 (P parity) 4 (A aux carry) 6 (Z zero) 7 (S sign) 10 (D direction) 11 (O overflow) 18 (AD align/access ctrl) 21 (ID ID)

@todo which bits are in PSL_USERCHANGE but have no specific handling in Valgrind?

So one (or more) of these must have changed between the flags and the old flags (and have been in the old flags)

This is worth looking at in more detail because the testcase was failing because 'build_sigframe' was making a badly formed signal frame. I fixed it half by trial and error and half by looking at the FreeBSD source. The 'mcontext' which contains these flags is part of the signal frame, so there may still be problems.

Update: Looking at the uprintf values for rflags there are rflags = 0x4 - this is parity, either this was turned on or one or more other bits turned off rflags = 0x0 - this means one or more bits must have been turned off rflags = 0x44 - zero and parity

paulfloyd commented 4 years ago

I added some VG_(printfs) to the code that reads rflags on amd64 and saw the same values read from VEX as were generated by the kernel message above.

Also this test in DRD has similar behaviour sigalrm: valgrind ./sigalrm pid 43587 (drd-amd64-freebsd): sigreturn rflags = 0x5

paulfloyd commented 3 years ago

On i386 we have, in sigframe-x86-freebsd.c

struct sigframe {
   /* Sig handler's return address */
   Addr retaddr;

   /*
    * The following 7 members are roughly the same as
    * 'stuct sigframe' in x86/sigrame.h
    */
   Int  sigNo;
   Addr psigInfo;      /* code or pointer to sigContext */
   Addr puContext;     /* points to uContext */
   Addr addr;          /* "secret" 4th argument */
   Addr phandler;      /* "action" or "handler" */

   /* pointed to by puContext */
   struct vki_ucontext uContext;

   vki_siginfo_t sigInfo;

   struct _vki_fpstate fpstate;

   struct vg_sigframe vg;
};

and then

static Addr build_sigframe(ThreadState *tst,
                           Addr esp_top_of_frame,
                           const vki_siginfo_t *siginfo,
                           const struct vki_ucontext *siguc,
                           void *handler, UInt flags,
                           const vki_sigset_t *mask,
                           void *restorer)
{
   struct sigframe *frame;
   Addr esp = esp_top_of_frame;
   Int  sigNo = siginfo->si_signo;
   UWord trapno;
   UWord err;

#if defined(__clang__)
   esp -= 4;
   esp = VG_ROUNDDN(esp, 16);
   esp -= sizeof(*frame) + 4;
#else
   esp -= sizeof(*frame);
   esp = VG_ROUNDDN(esp, 16);
#endif

   frame = (struct sigframe *)esp;

(I think that I arrived at the clang figures more or less by trial and error. My guess is that the result is sometimes correct, but not always.

Comparing this to FreeBSD, the headers have a similar but not identical sigframe

struct sigframe {
        /*
         * The first four members may be used by applications.
         *
         * NOTE: The 4th argument is undocumented, ill commented
         * on and seems to be somewhat BSD "standard".  Handlers
         * installed with sigvec may be using it.
         */
        register_t      sf_signum;
        register_t      sf_siginfo;     /* code or pointer to sf_si */
        register_t      sf_ucontext;    /* points to sf_uc */
        register_t      sf_addr;        /* undocumented 4th arg */

        union {
                __siginfohandler_t      *sf_action;
                __sighandler_t          *sf_handler;
        } sf_ahu;
        ucontext_t      sf_uc;          /* = *sf_ucontext */
        siginfo_t       sf_si;          /* = *sf_siginfo (SA_SIGINFO case) */
};

So roughly, VG siginfo = addr + FreeBSD siginfo + fpstate + vg_sigframe

In the kernel i386 machdep.c sendsig does the following:

    sp -= sizeof(struct sigframe);
    /* Align to 16 bytes. */
    sfp = (struct sigframe *)((unsigned long)sp & ~0xFul);

On amd64, things work better (probably luck).

In Valgrind we have the same struct sigframe (with changed underlying types). build_sigframe() looks like

   rsp -= sizeof(*frame);
   rsp = VG_ROUNDDN(rsp, 16) - 8;
   frame = (struct sigframe *)rsp;

Back over in FreeBSD, struct sigframe is quite different

struct sigframe {
        union {
                __siginfohandler_t      *sf_action;
                __sighandler_t          *sf_handler;
        } sf_ahu;
        ucontext_t      sf_uc;          /* = *sf_ucontext */
        siginfo_t       sf_si;          /* = *sf_siginfo (SA_SIGINFO case) */
};

We have several more fields in Valgrind, after the address. I've noticed that the mcontext part of ucontext has a 16byte aligned array in FreeBSD but mnot Valgrind. Last sendsig`does exactly the same address calculation as i386.

There may be different requirements for sigaltstack (see #151).

The offset in mtrampoline.S VG(x86_freebsd_SUBST_FOR_sigreturn) (22) 0x1c for i386 and 40 (0x28) for amd64 (not sure why one is hex and the other decimal) also probably need to be adjusted if struct sigframe is altered.

UPDATE: I removed some of the unnecessary amd64 struct sigframe fields, and reduced the offset in the trampoline to 0x8. No new regression failures.

paulfloyd commented 2 years ago

It's a while since I looked at this, but it's been in the back of my mind. It's also a 'top 3' outstanding bug (along with lld sections and OMP),

I'm still somewhat assuming that the problem is stack alignment differences between what Valgrind synthesizes and what the FreeBSD kernel is expecting.

One problem with the rflags message is that it doesn't indicate which bit has changed. Finding that out would require either kernel debugging or building a kernel. Looking again at the source for FreeBSD sys_sigreturn, the XOR is being done between rflags in struct thread->strct td_frame->tf_rflags and the rflags in ucontext, I presume that the second of these comes from Valgrind synth_ucontext/LibVEX_GuestAMD64_get_rflags. I can probably add Valgrind printfs to compare what we synthesize and what we see in syswrap sys_sigreturn, which just takes a pointer to ucontext, and the FreeBSD machdep.c sys_sigreturn which takes a pointer to thread and a sigreturn_args which is

struct sigreturn_args { char sigcntxpl[PADL_(const struct ucontext *)]; const struct ucontext sigcntxp; char sigcntxpr[PADR_(const struct __ucontext )]; };

which looks like a pointer to ucontext, but with padding to ensure that there is some sort of alignment. Hmm. What is PADL and PADR? And what if, when synthesizing, we're not getting this padding right.

Update: after an excchange on IRC, the padding is to make sure that the argument has the same alignment as a pointer. In this case both should be 0 so this is not the source of the problem.

I'm also not too sure what does on between the syscall entry for sys_sigreturn

I also think that I need to either look at what the stack contains in the two cases of running under Valgrind and standalone. I see two ways to do that. Either in gdb, by reading $RSP and then examining the memory. Alternatively in the program itself, getting the address of a local variable and then printing the contents of the memory before it in 16 byte aligned chunks.

nbriggs commented 2 years ago

Thanks for following up on this. I've recently returned to using valgrind on FreeBSD against a piece of X11-using code. Are you still taking valgrind/FreeBSD issues in this repo or is there a different place for bug reports now that you're integrating with the main line valgrind releases?

nbriggs commented 2 years ago

I just realized that this is likely the cause of the weird SIGSEGVs I'm seeing when I try to apply valgrind/memcheck to code that has a periodic timer (SIGVTALRM).

paulfloyd commented 2 years ago

For me the best thing is to add new bug reports to the Valgrind bugzilla: https://bugs.kde.org I might continue to track things here for minor issues. I guess that people coming from FreeBSD will continue to use the FreeBSD bugzilla

Also I no longer work on the freebsd branch (this repo). Now I work on the main sourceware master branch git://sourceware.org/git/valgrind.git and occasionally merge back to this branch.

nbriggs commented 2 years ago

Thanks. Works for me. I'll switch to building from the main sourceware branch if I am having trouble with the pre-built binary package.

I just submitted https://bugs.kde.org/show_bug.cgi?id=445032 which I suspect is related to this bug, but seems to be provoked by the combination of libthr.so being associated when code uses a SIGVTALRM handler (even if there's no thread activity whatsoever). It e-mailed a lot of people but I didn't see you on the list.

paulfloyd commented 2 years ago

Back to the sigreturn debug. I just built a custom kernel on 12.2 amd64 to print the other half of the rflags xor that is failing:

pid 822 (none-amd64-freebsd): sigreturn rflags = 0x0 regs->tf_rflags = 0x216 pid 822 (none-amd64-freebsd): sigreturn rflags = 0x0 regs->tf_rflags = 0x216 pid 822 (none-amd64-freebsd): sigreturn rflags = 0x44 regs->tf_rflags = 0x216

(rflags on the left is what is trying to be restored from the ucontext, regs->tf_rflags on the right is what is in the thread frame)

0x200 'interrupt enable' is masked out, leaving 0x16. 0x2 is also masked, and it is expected to be 1 (according to https://wiki.osdev.org/CPU_Registers_x86-64#RFLAGS_Register)

That leaves 'bcd carry' and 'parity' that are trying to be turned off in the first two lines (just 'bcd carry' turned off in the third).

Also in the third 0x40 is being turned on, bit 6 the zero flag.

Debugging with vgdb, on the return from the first interrupt handler in thr_sighandler the context that gets sent to sig_sigreturn contains

(gdb) p uc2 $2 = {uc_sigmask = {bits = {0, 0, 0, 0}}, uc_mcontext = {mc_onstack = 0, mc_rdi = 1823, mc_rsi = 14, mc_rdx = 0, mc_rcx = 77326122, mc_r8 = 34292629576, mc_r9 = 0, mc_rax = 0, mc_rbx = 0, mc_rbp = 34292630400, mc_r10 = 77325738, mc_r11 = 16, mc_r12 = 34292630520, mc_r13 = 0, mc_r14 = 34292630504, mc_r15 = 1, mc_trapno = 0, mc_fs = 0, mc_gs = 0, mc_addr = 0, mc_flags = 0, mc_es = 0, mc_ds = 0, mc_err = 0, mc_rip = 77326122, mc_cs = 0, mc_rflags = 0, mc_rsp = 34292630328, mc_ss = 0, mc_len = 800, mc_fpformat = 65536, mc_ownedfp = 131072, mc_fpstate = {75666440, 34292627160, 34292627136, 67152842, 75640840, 75641864, 1063, 75665416, 76180221, 121387641, 226653584, 75637560, 30064771073, 75666440, 76097752, 34292627256, 2433, 76151332, 34292626800, 21474836485, 34292627080, 34292627256, 2113, 75666440, 0, 75637560, 34292627232, 67151720, 30064772135, 76180221, 121387641, 226653584, 75637560, 1, 88698944, 1, 88699348, 0, 34292627184, 75963612, 1, 67179413, 34292627232, 34292627504, 76099312, 76179888, 2178, 76150312, 34292627336, 75666440, 3, 34292627504, 34292627472, 67159717, 34292627504, 34292627456, 2098998, 1024, 67233816, 75666440, 895, 76099312, 0, 0}, mc_fsbase = 0, mc_gsbase = 0, mc_xfpustate = 0, mc_xfpustate_len = 0, mc_spare = {0, 0, 0, 0}}, uc_link = 0x0, uc_stack = {ss_sp = 0xdeadbeef, ss_size = 0, ss_flags = 4}, uc_flags = 0, spare__ = {0, 0, 0, 0}}

0 rflags there

And outside of Valgrind

(gdb) p uc2 $1 = {uc_sigmask = {bits = {0, 0, 0, 0}}, uc_mcontext = {mc_onstack = 0, mc_rdi = 1846, mc_rsi = 14, mc_rdx = 0, mc_rcx = 34363585962, mc_r8 = 0, mc_r9 = 0, mc_rax = 0, mc_rbx = 0, mc_rbp = 140737488347936, mc_r10 = 0, mc_r11 = 514, mc_r12 = 140737488348056, mc_r13 = 0, mc_r14 = 140737488348040, mc_r15 = 1, mc_trapno = 12, mc_fs = 19, mc_gs = 27, mc_addr = 34362262000, mc_flags = 1, mc_es = 59, mc_ds = 59, mc_err = 2, mc_rip = 34363586346, mc_cs = 67, mc_rflags = 518, mc_rsp = 140737488347864, mc_ss = 59, mc_len = 800, mc_fpformat = 65538, mc_ownedfp = 131073, mc_fpstate = {895, 0, 0, 281470681751424, 0 <repeats 60 times>}, mc_fsbase = 34362093856, mc_gsbase = 0, mc_xfpustate = 0, mc_xfpustate_len = 0, mc_spare = {0, 0, 0, 0}}, uc_link = 0x0, uc_stack = {ss_sp = 0x0, ss_size = 0, ss_flags = 4}, uc_flags = 0, spare__ = {0, 0, 0, 0}}

And 0x206 rflags there. Looks more like the 0x216 that the kernel prints out.

I added a VG_(printf) to synth_ucontext and the result is (FreeBSD 12.2 in a VM

DEBUG: synth_ucontext rflags 0 pid 1459 (none-amd64-freebsd): sigreturn rflags = 0x0 regs->tf_rflags = 0x216 ok: 1st SIGALRM received DEBUG: synth_ucontext rflags 0 pid 1459 (none-amd64-freebsd): sigreturn rflags = 0x0 regs->tf_rflags = 0x216 ok: 2nd SIGALRM received sh: ../tests/true: not found DEBUG: synth_ucontext rflags 68 pid 1459 (none-amd64-freebsd): sigreturn rflags = 0x44 regs->tf_rflags = 0x216 ok: SIGRTMIN received

What does this mean? Well at least things are consistent and the rflags values aren't junk, which was one thing thingt I considered possible.

Next?

For bit 0x200 the VEX amd64 model doesn't include this. Perhaps it is always set in signal handlers in which case I could fudge it by setting it in synth_sigreturn.

Otherwise I think that the problem is that the kernel is comparing rflags synthesized by Valgrind for the guest with rflags in the thread data for Valgrind.

Is there a way to hack them to make them the same?

paulfloyd commented 2 years ago

When I change the kernel (12.2 amd64) so that sigreturn ignores the rflags difference

I get another kernel message

pid 1092 (none-amd64-freebsd): sigreturn cs = 0x0

and then a Valgrind crash. The crash is to be expected as it is a consequence of the test that results in the above message.

cs = ucp->uc_mcontext.mc_cs;
if (!CS_SECURE(cs)) {
    uprintf("pid %d (%s): sigreturn cs = 0x%x\n", p->p_pid,
        td->td_name, cs);
    ksiginfo_init_trap(&ksi);
    ksi.ksi_signo = SIGBUS;
    ksi.ksi_code = BUS_OBJERR;
    ksi.ksi_trapno = T_PROTFLT;
    ksi.ksi_addr = (void *)regs->tf_rip;
    trapsignal(td, &ksi);
    return (EINVAL);
}

Valgrind puts nothing in cs so it is zero, whereas the above test ensures either bit 0 or 1 is set.

paulfloyd commented 2 years ago

Getting back to the i386 failure

It's checking that the bits other than _MC_FLAG_MASK are not set.

if ((ucp->uc_mcontext.mc_flags & ~_MC_FLAG_MASK) != 0) {
    uprintf("pid %d (%s): sigreturn mc_flags %x\n", p->p_pid,
        td->td_name, ucp->uc_mcontext.mc_flags);
    return (EINVAL);
}

define _MC_HASSEGS 0x1

define _MC_HASBASES 0x2

define _MC_HASFPXSTATE 0x4

../../x86/include/ucontext.h:#define _MC_FLAG_MASK (_MC_HASSEGS | _MC_HASBASES | _MC_HASFPXSTATE)

3b in the error message is 111011 binary but only the lower 3 bits can be 1s.

This looks wrong though. I see 0 in flags for synth_ucontext, so where is this 0x35 coming from?

paulfloyd commented 2 years ago

paulf@freebsd:~/freebsd_valgrind $ perl tests/vg_regtest none/tests/thread-exits thread-exits: valgrind ./thread-exits pid 96078 (none-x86-freebsd): sigreturn ucp ptr 0x1d567920 mc_flags 50a07

== 1 test, 0 stderr failures, 0 stdout failures, 0 stderrB failures, 0 stdoutB failures, 0 post failures ==

This is with everything build using GCC though.

With clang this gives pid 96157 (none-x86-freebsd): sigreturn eflags = 0x0 and it passes

paulfloyd commented 2 years ago

Back on 12.2 i386, the original testcase for this item still gives me

starting ... pid 27919 (none-x86-freebsd): sigreturn mc_flags 3b ok: 1st SIGALRM received pid 27919 (none-x86-freebsd): sigreturn mc_flags 3b ok: 2nd SIGALRM received pid 27919 (none-x86-freebsd): sigreturn mc_flags 3b ok: SIGRTMIN received

I'm fairly sure that in this case sys_sigreturn is getting the wrong address for the ucontext and hence reading junk.

This is what I believe ought to be happening:

Outside of valgrind

kill() (or an async event like a bus error) causes the FreeBSD signal handler to create a ucontext and call the client signal handler
the client signal handler runs and returns
The FreeBSD signal handler call sigreturn with the ucontext to resume at the call site

There are different signal handlers for threadless and pthread exes.

Under Valgrind

Valgrind intercepts kill() or synthesises a signal like sigbus
Valgrind synthesises the ucontext, this is phoney but needs to be good enough for the next steps
the client signal handler runs under Valgrind and returns
Still under Valgrind the FreeBSD signal handler calls sigreturn with our phoney ucontext
I'm a bit hazy here, but I think that we need sigreturn to fail cleanly here, because we're giving it the client RIP and we don't want the interrupt to resume running client code, it needs to resume running the Valgrind code.

Last thing for this morning, I did a bit of debugging on drd/tests/sigalrm, which I've probably debuggged before. This crashes somewhere in libthr.so whilst doing a pthread_join and it looks like the valgrind code is trying to do a write syscall but the stack is corrupt.

(gdb) p vgPlain_get_and_pp_StackTrace(1, 10) ==54317== at 0x71F8ACB: ??? (in /lib/libthr.so.3) ==54317== by 0x71EBE83: ??? (in /lib/libthr.so.3) ==54317== by 0x71F4814: ??? (in /lib/libthr.so.3) ==54317== by 0x71F44DE: pthread_join (in /lib/libthr.so.3) ==54317== by 0x401E2C: main (sigalrm.c:85) $5 = void (gdb)

I'll look later to see if I can match the hex address to the code in libthr.so

paulfloyd commented 2 years ago

Looking again on Linux, the return from a signal handler function is pretty simple.

mov 0xf, %rax
syscall

And this is pretty much the same on FreeBSD with libthr is not linked.

The kernel source is here

https://elixir.bootlin.com/linux/latest/source/arch/x86/kernel/signal.c

SYSCALL_DEFINE0(rt_sigreturn)
{
    struct pt_regs *regs = current_pt_regs();
    struct rt_sigframe __user *frame;
    sigset_t set;
    unsigned long uc_flags;

    frame = (struct rt_sigframe __user *)(regs->sp - sizeof(long));
    if (!access_ok(frame, sizeof(*frame)))
        goto badframe;
    if (__get_user(*(__u64 *)&set, (__u64 __user *)&frame->uc.uc_sigmask))
        goto badframe;
    if (__get_user(uc_flags, &frame->uc.uc_flags))
        goto badframe;

    set_current_blocked(&set);

    if (restore_sigcontext(regs, &frame->uc.uc_mcontext, uc_flags))
        goto badframe;

    if (restore_altstack(&frame->uc.uc_stack))
        goto badframe;

    return regs->ax;

badframe:
    signal_fault(regs, frame, "rt_sigreturn");
    return 0;
}

So that

checks that the memory containing the frame is OK
copies _uc_sigmask and uc_sigframe from user memory
blocks signals
restores registers from uc_mcontext anf flags
restores altstack (not sure what this does)

A big difference between Linux and FreeBSD is that Linux doesn't check so much. As long as it can read the stuff from user memory it will just mask or set the flags.

FreeBSD: if (!EFL_SECURE(rflags, regs->tf_rflags)) {

Linux: regs->flags = (regs->flags & ~FIX_EFLAGS) | (sc.flags & FIX_EFLAGS);

and again FreeBSD: if (!CS_SECURE(cs)) {

Linux: regs->cs = sc.cs | 0x03;

So on FreeBSD this all needs to be ship shape and Bristol fashion when done in build_sigframe whilst on Linux almost anything will do.

paulfloyd commented 2 years ago

Looking at the history of amd64 machdep, the mc_flags check was added in 2009. cs and rflags checks have been there since before Valgrind was ported to FreeBSD (about 2005), originating with the initial amd64 port.

nbriggs commented 2 years ago

In the i386 machdep.c the current code is

fbdb0b778af5 (John Baldwin 2014-11-25 12:52:00 983)       if ((ucp->uc_mcontext.mc_flags & ~_MC_FLAG_MASK) != 0) {
fbdb0b778af5 (John Baldwin 2014-11-25 12:52:00 984)               uprintf("pid %d (%s): sigreturn mc_flags %x\n", p->p_pid,
fbdb0b778af5 (John Baldwin 2014-11-25 12:52:00 985)                   td->td_name, ucp->uc_mcontext.mc_flags);
fbdb0b778af5 (John Baldwin 2014-11-25 12:52:00 986)               return (EINVAL);
fbdb0b778af5 (John Baldwin 2014-11-25 12:52:00 987)       }

and prior to that, there was no check at all of the mc_flags. The log entry associated with the change was "MFamd64: Check for invalid flags in the machine context in sigreturn() and setcontext()."

paulfloyd commented 2 years ago

At long last I believe that I fully understand what is happening.

We have two problems.

In non-threaded apps, when a signal is raised and there is a signal handler, there will be a switch directly to the signal handler and the return from the signal handler is to the trampoline call that calls syscall sigreturn. Under valgrind this all does the same except our trampoline calls syscall fake_sigreturn.

In threaded apps, the thread 'signal' function inserts 'thr_sighandler' before the user sighandler. So outside of valgrind the sequence should be

signal raised -> pth_sighandler -> user sighandler -> return to pth_sighandler -> call sigreturn

The first problem is that pth_sighandler makes a copy of the ucontext. Under valgrind, the ucp pointer passed to sighandler points to an extended structure that has extra data after the ucontext part. This gets lost by the copy.

The second problem is that we never want to call sigreturn. This is for transferring execution on the CPU from one place to another. That's not what we want to do. We want to transfer execution of the VEX CPU from one place to another. So we always want to call fake_sigreturn. Currently under Valgrind the pthread flow is

signal raised -> pth_sighandler -> user sighandler -> return to pth_sighandler -> call sigreturn -> sigreturn fails -> annoying kernel message -> return to Valgrind trampoline -> syscall fake_sigreturn -> back to VEX CPU

I see 3 possible avenues

Do nothing and live with the messages
redirect thr_sighandler. I'm starting to look at that, I hope that it doesn't need to access static file objects.
redirect _thr_sigaction (and _thr_signal_init). That's more work.

Update on point 2: This is not looking too good.

thr_sighandler does the following

saves errno
gets the current thread (using the private stuct pthread rather then the public pthread_t) could replace that with pthread_self()
gets a 'usigaction' (private struct) using __libc_sigaction_slot(int sig)
read locks the sigactio lock and copies the sigaction, unlocks
restores errno
clears the 'deferred_run' field of the pthread
checks if the thread is in a critical section && if the signal is deferred, if so updates the current thread and returns

Basically loads of libthr internals.

Next problem. pth_sighandler is not exported. So it can't be redirected.

So at the moment we're stuck with living with the messages.

paulfloyd commented 2 years ago

I may get back to this one day, but for the moment I see no fix that doesn't involve replacing a big chunk of libthr signal handling.