Closed grawity closed 5 years ago
\<+jp> the nocache library internally uses pthreads() and "After a fork(2) in a multithreaded process returns in the child, the child should call only async-signal-safe functions (see signal-safety(7)) until such time as it calls execve(2) to execute a new program." so if the program preloaded with nocache forks, now you have every open/close call violating this
I think i'm noticing this too with rsync (used with backintime application)
It would be good to have a minimal example that reliably reproduces the issue.
The comment #2 about pthread vs. forks seems pertinent, and nocache already uses pthread_atfork
to re-initialize mutexes after forking. The man page shows that you can actually register three handlers: one to prepare the fork, and one each that gets executed after the fork for parent and child.
I currently don’t have time to work on nocache, but I’d review PRs. Perhaps it’s enough to add a prepare function pointer that resets the syscall wrapper functions to the _original_*
versions, then resets them in the parent after the fork is done.
Again, once you have a reliable way to reproduce the hang this should be fairly straightforward.
I can always reproduce with running strace someapp
under nocache. On my system these git
subcommands also work (probably due to being external processes and also starting yet more subprocesses, like a pager):
nocache strace /bin/true
nocache git log
nocache git annex
(Doing something like strace nocache git log
shows it hanging while trying to read() from a pipe.)
Okay, I can reproduce your problem now. The program hangs with a loop of these syscalls:
rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
…
I’m a bit out of my depth here, but will take a stab at reading the man pages to understand why this is happening.
I’ve looked at this for a while but haven’t found a good solution. At least in the example I can reproduce, I can delete line 158 which calls the async signal unsafe functions in the destructor of the shared library, and then nocache strace /bin/true
no longer hangs. Can you please test and confirm your problem goes away after commenting out the line?
I can of course introduce an environment variable to make it configurable, but I’d like to understand what exactly is going wrong here. I’ve experimented with various ideas, but to no avail:
sigprocmask(SIG_SET, NULL, &old_mask)
, then compare if it is the full mask (via memcmp
) (which seems to be a thing you can do); don’t do anything fancy in syscall wrappers if all signals are currently blocked.Out of curiosity, where did you get the advice mentioned in comment 2? Could you perhaps point them to this issue to help explain what’s happening?
Can you please test and confirm your problem goes away after commenting out the line?
Unfortunately no.
…I kept deleting stuff until it stopped hanging. This is what's left of the entire nocache.c, and LD_PRELOAD=… strace /bin/true
still hangs with it – judging from the fprintfs, after two or three invocations pthread_atfork()
never returns anymore.
#define _GNU_SOURCE
#include <unistd.h>
#include <stdio.h>
#include <pthread.h>
static void init(void) __attribute__((constructor));
static void init_mutexes(void);
FILE *debugfp;
static void init(void)
{
init_mutexes();
}
static void init_mutexes(void)
{
fprintf(stderr, "%d: <<init_mutexes\n", getpid());
/* make sure to re-initialize mutex if forked */
pthread_atfork(NULL, NULL, init_mutexes);
fprintf(stderr, "%d: atfork set\n", getpid());
fprintf(stderr, "%d: init_mutexes>>\n", getpid());
}
Out of curiosity, where did you get the advice mentioned in comment 2?
@alyptik
@alyptik – could you please elaborate on https://github.com/Feh/nocache/issues/34#issuecomment-410509252? I think I understand the issue, but I don’t understand if there is a way to fix it. I don’t understand enough of the control flow during a fork to figure out where exactly nocache
would need to relinquish control and where to pick it up again. Thanks!
@Feh I was talking about http://patches-tcwg.linaro.org/patch/3340/ and cases like your usage of functions like pthread_mutex_lock
in your init_mutexes
child handler.
From the fork
manpage:
After a fork() in a multithreaded program, the child can safely call only async-signal-safe functions (see signal-safety(7)) until such time as it calls execve(2).
So when you fork
, your init_mutexes
handler is called and all of the non-async-signal-safe functions from that point until an exec are technically undefined behavior.
Now, I don't know if this is the actual issue, but it's the only problem that stuck out I looked through your code (which is both a compliment to you and not very helpful :). As to how to fix it, since you do kind of need that function, and taking into consideration the fact that it's not even clear that that function is the problem, I can't immediately offer any fixes.
I'll look through the code a bit more when I have some spare time and will submit a PR if I end up proving it's the cause and thinking of a sane workaround.
@Feh The final v3 patch that went into glibc 2.28 can be found here https://patchwork.sourceware.org/patch/27689/
The upstream glibc commit which broke nocache is the following:
commit 27761a1042daf01987e7d79636d0c41511c6df3c
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date: Thu Feb 1 17:57:56 2018 -0200
Refactor atfork handlers
Current implementation (sysdeps/nptl/fork.c) replicates the atfork
handlers list backward to invoke the child handlers after fork/clone
syscall.
The internal atfork handlers is implemented as a single-linked list
so a lock-free algorithm can be used, trading fork mulithread call
performance for some code complexity and dynamic stack allocation
(since the backwards list should not fail).
This patch refactor it to use a dynarary instead of a linked list.
It simplifies the external variables need to be exported and also
the internal atfork handler member definition.
The downside is a serialization of fork call in multithread, since to
operate on the dynarray the internal lock should be used. However
as noted by Florian, it already acquires external locks for malloc
and libio so it is already hitting some lock contention. Besides,
posix_spawn should be faster and more scalable to run external programs
in multithread environments.
Checked on x86_64-linux-gnu.
* nptl/Makefile (routines): Remove unregister-atfork.
* nptl/register-atfork.c (fork_handler_pool): Remove variable.
(fork_handler_alloc): Remove function.
(fork_handlers, fork_handler_init): New variables.
(__fork_lock): Rename to atfork_lock.
(__register_atfork, __unregister_atfork, libc_freeres_fn): Rewrite
to use a dynamic array to add/remove atfork handlers.
* sysdeps/nptl/fork.c (__libc_fork): Likewise.
* sysdeps/nptl/fork.h (__fork_lock, __fork_handlers, __linkin_atfork):
Remove declaration.
(fork_handler): Remove next, refcntr, and need_signal member.
(__run_fork_handler_type): New enum.
(__run_fork_handlers): New prototype.
* sysdeps/nptl/libc-lockP.h (__libc_atfork): Remove declaration.
This:
static void init_mutexes(void)
{
fprintf(stderr, "%d: <<init_mutexes\n", getpid());
/* make sure to re-initialize mutex if forked */
pthread_atfork(NULL, NULL, init_mutexes);
fprintf(stderr, "%d: atfork set\n", getpid());
fprintf(stderr, "%d: init_mutexes>>\n", getpid());
}
eventually calls pthread_atfork
from a fork handler. With the new lock in glibc commit 27761a1042daf01987e7d79636d0c41511c6df3c, that results in a deadlock.
I don't know how common such code is. It has always resulted in a memory leak. If the malloc
implementation uses a fork handler itself and acquires locks around fork, that would likely result in a deadlock, too. So I'm not sure if it is worth fixing this in glibc, considering the fork handler is just buggy as written.
I’m happy to change the implementation in whichever way is best suited to avoid a deadlock, but I can’t find good documentation around the precise semantics of these steps.
It seems to me that if a pthread_atfork
cannot call pthread_atfork
again, then any double-fork would leave mutexes uninitialized (if that’s the thing you use the handler for). In other words, how would you guarantee cleanly initialized mutexes after a fork without this construct?
It has always resulted in a memory leak.
Could you elaborate?
considering the fork handler is just buggy as written.
Could you make a suggestion how to write this in a non-buggy way?
Thank you!
The handlers are not dropped on fork
, so you just keep adding the same handler again on every fork, requiring a new allocation and so on.
It would be sufficient to register the handler just once.
Thanks, that’s good to know.
[
init_mutexes
] eventually callspthread_atfork
from a fork handler. With the new lock in glibc commit 27761a1042daf01987e7d79636d0c41511c6df3c, that results in a deadlock.
However, when I remove all invocations of pthread_atfork
, it still results in a deadlock…
For me the following patch seems to fix the issue, at least it allows the apt show nocache
command to work again. The strace /bin/true
command does not deadlock anymore, but goes into an endless rt_sigprogmask
loop. That said that matches the behaviour of previous glibc versions.
diff --git a/nocache.c b/nocache.c
index 9562aef..009fa3a 100644
--- a/nocache.c
+++ b/nocache.c
@@ -92,6 +92,10 @@ static void init(void)
getrlimit(RLIMIT_NOFILE, &rlim);
max_fds = rlim.rlim_max;
init_mutexes();
+
+ /* make sure to re-initialize mutex if forked */
+ pthread_atfork(NULL, NULL, init_mutexes);
+
fds = malloc(max_fds * sizeof(*fds));
assert(fds != NULL);
@@ -146,8 +150,6 @@ static void init_mutexes(void)
pthread_mutex_init(&fds_lock[i], NULL);
}
pthread_mutex_unlock(&fds_iter_lock);
- /* make sure to re-initialize mutex if forked */
- pthread_atfork(NULL, NULL, init_mutexes);
}
static void init_debugging(void)
You can reproduce this bug on ubuntu 18.10 for example
@aurel32 thanks, in particular because I thought “endless rt_sigprogmask
loop” was the same as the deadlock described before (so even removing the pthread_atfork
call entirely didn’t solve the problem). Do you want to submit a pull request?
@Feh I think you can commit it like that, at the end I just translated the words from @fweimer into a patch. But if you prefer I can submit a PR.
@aurel32 Turns out I had a commit with essentially that lying around already; pushed this one just now. Other people on this issue: Could you test if this fixes the problem? Thnaks!
I can confirm that my test case nocache rsync /home/someusername/somedirectory2/ /home/someusername/somedirectory2/ is fine now 👍
Thanks – I’ve tagged the current HEAD commit with v1.1
to make this easier for distributions to disambiguate when they want to repackage the newest version. (xref Debian bug)
I’m closing this issue as I believe it is fixed – please reopen if this is not the case.
I think i'm noticing this too with rsync (used with backintime application)
Still noticing this. However, per here it seems rsync
hasn't crashed; it's just working much more slowly. I'm doing a full, ~14 GB /
backintime run on my RPi3B+ right now; I'll see how long it takes. If anything, I might have to adjust my backup cadence to weekly instead of daily.
FWIW without nocache
my daily backintime
run crashes by Pi and forces a reboot.
If the backup takes longer than a day (I'm patient and the Pi runs 25/7/52 anyway) I'll probably switch from backintime
to restic
.
With the latest glibc release, nocache appears to completely freeze within certain programs (examples: strace, git-annex). Still looking for the cause.