I have a process which has been behaving oddly wrt. signals, and apparently only when run through umockdev-run. Specifically, my process has two threads. One thread blocks a specific signal right after being spawned, which is not blocked in the parent thread. Then, a moment later, the signal is seemingly unblocked again without me having touched it in that thread. I've written a minimum reproduction and pushed a PR which seems to fix the problem for me, with the caveat that I'm new to the codebase.
Reproduction
Minimal reproduction code is pasted below, built with clang++ main.cpp -o main -std=c++17 -Werror -fno-omit-frame-pointer -O2 -g . Example output, showing a signal being mysteriously unblocked in the spawned thread (the thread ID is prepended to each log output):
$ umockdev-run -- ./main
448903 --- start new thread
448903 --- Write to file
448906 +++ myThread
448906 +++ Signal SIGRTMAX is blocked: 0
448906 +++ Blocking signal
448906 +++ Signal SIGRTMAX is blocked: 1
448906 +++ Write to file
448903 --- Write to file
448906 +++ Signal SIGRTMAX is blocked: 0
448906 +++ Signal is not blocked!
448903 --- Write to file
^[[Aterminate called after throwing an instance of 'std::system_error'
what(): Invalid argument
#include <cstring>
#include <fstream>
#include <iostream>
#include <thread>
#include <signal.h>
#include <unistd.h>
const char *FILE_ONE = "one";
const char *FILE_TWO = "two";
bool writeEntireFile(const std::string &path, const std::string &data)
{
std::ofstream file(path, std::ios::binary);
file << data;
file.close();
return !file.fail();
}
int signal_is_blocked(int sig)
{
sigset_t ss;
int r;
r = pthread_sigmask(SIG_SETMASK, NULL, &ss);
if (r != 0) {
return -r;
}
return sigismember(&ss, sig);
}
void myThread()
{
std::cout << gettid() << " +++ myThread" << std::endl;
int is_blocked = signal_is_blocked(SIGRTMAX);
std::cout << gettid() << " +++ Signal SIGRTMAX is blocked: " << is_blocked << std::endl;
std::cout << gettid() << " +++ Blocking signal" << std::endl;
sigset_t ss{};
int ret = 0;
ret |= sigemptyset(&ss);
ret |= sigaddset(&ss, SIGRTMAX);
if (ret < 0) {
std::cout << gettid() << " +++ Failed to set up sigset: " << strerror(errno) << std::endl;
throw std::system_error(errno, std::generic_category());
}
ret = pthread_sigmask(SIG_BLOCK, &ss, nullptr);
if (ret < 0) {
std::cout << gettid() << " +++ Failed to block signals: " << strerror(errno) << std::endl;
throw std::system_error(errno, std::generic_category());
}
is_blocked = signal_is_blocked(SIGRTMAX);
std::cout << gettid() << " +++ Signal SIGRTMAX is blocked: " << is_blocked << std::endl;
std::cout << gettid() << " +++ Write to file" << std::endl;
if (!writeEntireFile(FILE_TWO, std::to_string(gettid()))) {
std::cout << gettid() << " +++ Failed to write to file" << std::endl;
throw std::system_error(EINVAL, std::generic_category());
}
is_blocked = signal_is_blocked(SIGRTMAX);
std::cout << gettid() << " +++ Signal SIGRTMAX is blocked: " << is_blocked << std::endl;
if (is_blocked != 1) {
std::cout << gettid() << " +++ Signal is not blocked!" << std::endl;
std::this_thread ::sleep_for(std::chrono::seconds(1));
throw std::system_error(EINVAL, std::generic_category());
}
std::cout << gettid() << " +++ Exit thread" << std::endl;
}
int main()
{
std::cout << gettid() << " --- start new thread" << std::endl;
auto t = std::thread{myThread};
for (int i = 0; i < 3; ++i) {
std::cout << gettid() << " --- Write to file" << std::endl;
if (!writeEntireFile(FILE_ONE, std::to_string(gettid()))) {
std::cout << gettid() << " --- Failed to write to file" << std::endl;
throw std::system_error(EINVAL, std::generic_category());
}
}
t.join();
return 0;
}
Analysis
AFAICT, the issue occurrs when both threads try to use one of the calls intercepted by umockdev's preload.so, specifically those which use the TRAP_PATH_LOCK and TRAP_PATH_UNLOCK macros.
My hypothesis is that there's a race condition which causes one thread's signal mask to be stored in trap_path_sig_restore, and then before it can use this to restore the thread's signal mask again, another thread does the same, overwriting trap_path_sig_restore. Then, when the first thread wants to restore its signal mask, the value in trap_path_sig_restore is actually that of the second thread, thus messing up its signal mask and "loosing" a blocked signal.
I traced this using perf, tracing syscalls:sys_enter_rt_sigprocmask, which seems to confirm this behavior.
Note that thread 1489965 stores its current signal mask in the sigset at 0x7f4819eaef00
1489976 then restores the sigset stored at the same address, even though it belonged to the other thread
(you can tell its restored since oset: 0x00000000)
Hia and thanks for this awesome project :D
Context
I have a process which has been behaving oddly wrt. signals, and apparently only when run through
umockdev-run
. Specifically, my process has two threads. One thread blocks a specific signal right after being spawned, which is not blocked in the parent thread. Then, a moment later, the signal is seemingly unblocked again without me having touched it in that thread. I've written a minimum reproduction and pushed a PR which seems to fix the problem for me, with the caveat that I'm new to the codebase.Reproduction
Minimal reproduction code is pasted below, built with
clang++ main.cpp -o main -std=c++17 -Werror -fno-omit-frame-pointer -O2 -g
. Example output, showing a signal being mysteriously unblocked in the spawned thread (the thread ID is prepended to each log output):Analysis
AFAICT, the issue occurrs when both threads try to use one of the calls intercepted by umockdev's preload.so, specifically those which use the
TRAP_PATH_LOCK
andTRAP_PATH_UNLOCK
macros.https://github.com/martinpitt/umockdev/blob/main/src/libumockdev-preload.c#L174C9-L174C23
My hypothesis is that there's a race condition which causes one thread's signal mask to be stored in
trap_path_sig_restore
, and then before it can use this to restore the thread's signal mask again, another thread does the same, overwritingtrap_path_sig_restore
. Then, when the first thread wants to restore its signal mask, the value intrap_path_sig_restore
is actually that of the second thread, thus messing up its signal mask and "loosing" a blocked signal.I traced this using perf, tracing
syscalls:sys_enter_rt_sigprocmask
, which seems to confirm this behavior.1489965
stores its current signal mask in the sigset at0x7f4819eaef00
1489976
then restores the sigset stored at the same address, even though it belonged to the other threadoset: 0x00000000
)