cloudius-systems / osv

OSv, a new operating system for the cloud.
osv.io
Other
4.05k stars 603 forks source link

Blocking signals may prevent execution of corresponding signal handler #1277

Open wkozaczuk opened 8 months ago

wkozaczuk commented 8 months ago

This behavior was observed when testing the tst-sigwait.cc with the Linux dynamic linker.

When this portion of the test runs:

void handler2(int sig) { siguser2_received = 1;}

int main(int ac, char** av)
{
...
    // Not in set, should receive this one
    std::thread thread2([] { kill(0, SIGUSR2); });
    while (siguser2_received == 0);
    thread2.join();

the underlying glibc implementation of pthread_create() invoked when creating thread2 would block all signals before calling the clone syscall and then unblock all of them using the rt_sigprocmask (see this). Depending on how thread2 and the parent thread get scheduled, the SIGUSR2 may get delivered between blocking and unblocking.

This causes the problem on OSv, because its sigprocmask would add the parent thread to the list of waiters for the signal SIGUSR2. And then when the kill(0, SIGUSR2) executes, it would actually try to wake the parent thread fruitlessly because it does not wait for a signal. To make matters worse, because of this logic in kill()below, the handler2() never gets executed and the test hangs forever waiting for the condition:

int wake_up_signal_waiters(int signo)
{
    SCOPE_LOCK(waiters_mutex);
    int woken = 0;

    unsigned sigidx = signo - 1;
    for (auto& t: waiters[sigidx]) {
        woken++;
        t->remote_thread_local_var<int>(thread_pending_signal) = signo;
        t->wake();
    }
    return woken;
}

int kill(pid_t pid, int sig)
{
...
        if ((pid == OSV_PID) || (pid == 0) || (pid == -1)) {
            // This semantically means signaling everybody. So we will signal
            // every thread that is waiting for this.
            //
            // The thread does not expect the signal handler to still be delivered,
            // so if we wake up some folks (usually just the one waiter), we should
            // not continue processing.
            if (wake_up_signal_waiters(sig)) {
                return 0;
            }
        }

        // User-defined signal handler. Run it in a new thread. This isn't
        // very Unix-like behavior, but if we assume that the program doesn't
        // care which of its threads handle the signal - why not just create
        // a completely new thread and run it there...
        // The newly created thread is tagged as an application one
        // to make sure that user provided signal handler code has access to all
        // the features like syscall stack which matters for Golang apps
        const auto sa = signal_actions[sigidx];
        auto t = sched::thread::make([=] {
            if (sa.sa_flags & SA_RESETHAND) {
                signal_actions[sigidx].sa_flags = 0;
                signal_actions[sigidx].sa_handler = SIG_DFL;
            }

BTW this test works fine on Linux.

I think the problem is, that the original implementation of sigwait() (see ee0e61834432def11b46631416f715a2eef2ae0b), assumed that thread blocking a signal would eventually call sigwait() to get the signal received but as the test running with glibc pthread implementation illustrates, blocking temporarily SIGUSR2 by __pthread_create_2_1 adds parent thread which then gets woken and the handler does not get executed.

I am not clear on what exactly this comment means:

            // The thread does not expect the signal handler to still be delivered,
            // so if we wake up some folks (usually just the one waiter), we should
            // not continue processing.
            if (wake_up_signal_waiters(sig)) {
                return 0;
            }

but maybe we should not stop processing if we know nobody was really woken. Maybe the waiters should be really potential_consumers or something like that. Then sigwait() should add the current thread to the list of real waiters and wake_up_signal_waiters() should be tweaked to count only consumers that were really sigwait()-ing, no?

I may be all wrong about it.

Adding relevant OSv strace output fragment:

...
0x0000400000d85040 /lib/x86_64-lin  0         0.150525815 syscall_rt_sigaction 0 <= 10 0x0000200000601470 0x0000200000601510 8
0x0000400000d85040 /lib/x86_64-lin  0         0.150526553 syscall_rt_sigaction 0 <= 12 0x0000200000601470 0x0000200000601510 8
0x0000400000d85040 /lib/x86_64-lin  0         0.150529633 syscall_rt_sigprocmask 0 <= 0 0x0000200000601740 0x00002000006017c0 8
0x0000400000d85040 /lib/x86_64-lin  0         0.150538411 syscall_rt_sigaction 0 <= 33 0x0000200000601400 0x0000000000000000 8
0x0000400000d85040 /lib/x86_64-lin  0         0.150538967 syscall_rt_sigprocmask 0 <= 1 0x0000200000601628 0x0000000000000000 8
0x0000400000d85040 /lib/x86_64-lin  0         0.150554733 syscall_long_mmap    0x20007f62c000 <= 0x0 1052672 0 131106 -1 0
0x0000400000d85040 /lib/x86_64-lin  0         0.150558881 syscall_mprotect     0 <= 0x20007f62d000 1048576 3
0x0000400000d85040 /lib/x86_64-lin  0         0.151848290 syscall_rt_sigprocmask 0 <= 0 0x000020007f9d29e0 0x0000200000601620 8
0x00004000011a4040 >/lib/x86_64-li  0         0.151875952 syscall_sys_set_robust_list 0 <= 0x000020007f72c920 24
0x00004000011a4040 >/lib/x86_64-li  0         0.151878863 syscall_rt_sigprocmask 0 <= 2 0x000020007f72cf30 0x0000000000000000 8
0x00004000011a4040 >/lib/x86_64-li  0         0.151880552 syscall_kill         0 <= 0 10
0x00004000011a4040 >/lib/x86_64-li  0         0.151881276 syscall_rt_sigprocmask 0 <= 0 0x0000200000601740 0x0000000000000000 8
0x00004000011a4040 >/lib/x86_64-li  0         0.151889036 syscall_long_mmap    0x20007fc00000 <= 0x0 134217728 0 16418 -1 0
0x00004000011a4040 >/lib/x86_64-li  0         0.151892878 syscall_munmap       0 <= 0x20007fc00000 4194304
0x00004000011a4040 >/lib/x86_64-li  0         0.151894757 syscall_munmap       0 <= 0x200084000000 62914560
0x00004000011a4040 >/lib/x86_64-li  0         0.151895876 syscall_mprotect     0 <= 0x200080000000 135168 3
0x0000400000d85040 /lib/x86_64-lin  0         0.151911535 syscall_sys_clone3   27 <= 0x00002000006014c0 88
0x0000400000d85040 /lib/x86_64-lin  0         0.151915640 syscall_rt_sigprocmask 0 <= 2 0x0000200000601620 0x0000000000000000 8
0x00004000011a4040 >/lib/x86_64-li  0         0.153334256 syscall_rt_sigprocmask 0 <= 0 0x000020007f72cf30 0x0000000000000000 8
0x0000400000d85040 /lib/x86_64-lin  0         0.153345681 syscall_rt_sigtimedwait 10 <= 0x0000200000601740 0x0000200000601660 0x0000000000000000 8
0x00004000011a4040 >/lib/x86_64-li  0         0.153354951 syscall_madvise      0 <= 0x20007f62c000 1028096 4
0x0000400000d85040 /lib/x86_64-lin  0         0.153380073 syscall_rt_sigprocmask 0 <= 0 0x000020007f9d29e0 0x0000200000601620 8
0x00004000011db040 >/lib/x86_64-li  0         0.153412277 syscall_sys_set_robust_list 0 <= 0x000020007f72c920 24
0x00004000011db040 >/lib/x86_64-li  0         0.153415870 syscall_rt_sigprocmask 0 <= 2 0x000020007f72cf30 0x0000000000000000 8
0x00004000011db040 >/lib/x86_64-li  0         0.153417092 syscall_kill         0 <= 0 10
0x00004000011db040 >/lib/x86_64-li  0         0.153423258 syscall_rt_sigprocmask 0 <= 0 0x000020007f72cf30 0x0000000000000000 8
0x00004000011db040 >/lib/x86_64-li  0         0.153425587 syscall_madvise      0 <= 0x20007f62c000 1028096 4
0x0000400000d85040 /lib/x86_64-lin  0         0.153431464 syscall_sys_clone3   28 <= 0x00002000006014c0 88
0x0000400000d85040 /lib/x86_64-lin  0         0.153436399 syscall_rt_sigprocmask 0 <= 2 0x0000200000601620 0x0000000000000000 8
0x0000400000d85040 /lib/x86_64-lin  0         0.153437191 syscall_rt_sigtimedwait 10 <= 0x0000200000601740 0x0000200000601660 0x0000000000000000 8
0x0000400000d85040 /lib/x86_64-lin  0         0.153443093 syscall_rt_sigprocmask 0 <= 0 0x000020007f9d29e0 0x0000200000601620 8
0x00004000011f7040 >/lib/x86_64-li  0         0.153819651 syscall_sys_set_robust_list 0 <= 0x000020007f72c920 24
0x00004000011f7040 >/lib/x86_64-li  0         0.153823044 syscall_rt_sigprocmask 0 <= 2 0x000020007f72cf30 0x0000000000000000 8
0x00004000011f7040 >/lib/x86_64-li  0         0.153823830 syscall_kill         0 <= 0 12
0x00004000011f7040 >/lib/x86_64-li  0         0.153829063 syscall_rt_sigprocmask 0 <= 0 0x000020007f72cf30 0x0000000000000000 8
0x00004000011f7040 >/lib/x86_64-li  0         0.153831318 syscall_madvise      0 <= 0x20007f62c000 1028096 4
0x0000400000d85040 /lib/x86_64-lin  0         0.153836961 syscall_sys_clone3   29 <= 0x00002000006014c0 88
0x0000400000d85040 /lib/x86_64-lin  0         0.153841732 syscall_rt_sigprocmask 0 <= 2 0x0000200000601620