Feh / nocache

minimize caching effects
BSD 2-Clause "Simplified" License
554 stars 53 forks source link

Incompatibility with glibc 2.28 #34

Closed grawity closed 5 years ago

grawity commented 6 years ago

With the latest glibc release, nocache appears to completely freeze within certain programs (examples: strace, git-annex). Still looking for the cause.

grawity commented 6 years ago

\<+jp> the nocache library internally uses pthreads() and "After a fork(2) in a multithreaded process returns in the child, the child should call only async-signal-safe functions (see signal-safety(7)) until such time as it calls execve(2) to execute a new program." so if the program preloaded with nocache forks, now you have every open/close call violating this

kokoko3k commented 5 years ago

I think i'm noticing this too with rsync (used with backintime application)

Feh commented 5 years ago

It would be good to have a minimal example that reliably reproduces the issue.

The comment #2 about pthread vs. forks seems pertinent, and nocache already uses pthread_atfork to re-initialize mutexes after forking. The man page shows that you can actually register three handlers: one to prepare the fork, and one each that gets executed after the fork for parent and child.

I currently don’t have time to work on nocache, but I’d review PRs. Perhaps it’s enough to add a prepare function pointer that resets the syscall wrapper functions to the _original_* versions, then resets them in the parent after the fork is done.

Again, once you have a reliable way to reproduce the hang this should be fairly straightforward.

grawity commented 5 years ago

I can always reproduce with running strace someapp under nocache. On my system these git subcommands also work (probably due to being external processes and also starting yet more subprocesses, like a pager):

nocache strace /bin/true
nocache git log
nocache git annex

(Doing something like strace nocache git log shows it hanging while trying to read() from a pipe.)

Feh commented 5 years ago

Okay, I can reproduce your problem now. The program hangs with a loop of these syscalls:

rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
…

I’m a bit out of my depth here, but will take a stab at reading the man pages to understand why this is happening.

Feh commented 5 years ago

I’ve looked at this for a while but haven’t found a good solution. At least in the example I can reproduce, I can delete line 158 which calls the async signal unsafe functions in the destructor of the shared library, and then nocache strace /bin/true no longer hangs. Can you please test and confirm your problem goes away after commenting out the line?

I can of course introduce an environment variable to make it configurable, but I’d like to understand what exactly is going wrong here. I’ve experimented with various ideas, but to no avail:

Out of curiosity, where did you get the advice mentioned in comment 2? Could you perhaps point them to this issue to help explain what’s happening?

grawity commented 5 years ago

Can you please test and confirm your problem goes away after commenting out the line?

Unfortunately no.

…I kept deleting stuff until it stopped hanging. This is what's left of the entire nocache.c, and LD_PRELOAD=… strace /bin/true still hangs with it – judging from the fprintfs, after two or three invocations pthread_atfork() never returns anymore.

#define _GNU_SOURCE
#include <unistd.h>
#include <stdio.h>
#include <pthread.h>

static void init(void) __attribute__((constructor));
static void init_mutexes(void);

FILE *debugfp;

static void init(void)
{
    init_mutexes();
}

static void init_mutexes(void)
{
    fprintf(stderr, "%d: <<init_mutexes\n", getpid());
    /* make sure to re-initialize mutex if forked */
    pthread_atfork(NULL, NULL, init_mutexes);
    fprintf(stderr, "%d: atfork set\n", getpid());
    fprintf(stderr, "%d: init_mutexes>>\n", getpid());
}

Out of curiosity, where did you get the advice mentioned in comment 2?

@alyptik

Feh commented 5 years ago

@alyptik – could you please elaborate on https://github.com/Feh/nocache/issues/34#issuecomment-410509252? I think I understand the issue, but I don’t understand if there is a way to fix it. I don’t understand enough of the control flow during a fork to figure out where exactly nocache would need to relinquish control and where to pick it up again. Thanks!

alyptik commented 5 years ago

@Feh I was talking about http://patches-tcwg.linaro.org/patch/3340/ and cases like your usage of functions like pthread_mutex_lock in your init_mutexes child handler.

From the fork manpage:

After a fork() in a multithreaded program, the child can safely call only async-signal-safe functions (see signal-safety(7)) until such time as it calls execve(2).

So when you fork, your init_mutexes handler is called and all of the non-async-signal-safe functions from that point until an exec are technically undefined behavior.

Now, I don't know if this is the actual issue, but it's the only problem that stuck out I looked through your code (which is both a compliment to you and not very helpful :). As to how to fix it, since you do kind of need that function, and taking into consideration the fact that it's not even clear that that function is the problem, I can't immediately offer any fixes.

I'll look through the code a bit more when I have some spare time and will submit a PR if I end up proving it's the cause and thinking of a sane workaround.

alyptik commented 5 years ago

@Feh The final v3 patch that went into glibc 2.28 can be found here https://patchwork.sourceware.org/patch/27689/

aurel32 commented 5 years ago

The upstream glibc commit which broke nocache is the following:

commit 27761a1042daf01987e7d79636d0c41511c6df3c                                                                                                                                               
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>                                                                                                                                    
Date:   Thu Feb 1 17:57:56 2018 -0200

    Refactor atfork handlers

    Current implementation (sysdeps/nptl/fork.c) replicates the atfork
    handlers list backward to invoke the child handlers after fork/clone
    syscall.

    The internal atfork handlers is implemented as a single-linked list
    so a lock-free algorithm can be used, trading fork mulithread call
    performance for some code complexity and dynamic stack allocation
    (since the backwards list should not fail).

    This patch refactor it to use a dynarary instead of a linked list.
    It simplifies the external variables need to be exported and also
    the internal atfork handler member definition.

    The downside is a serialization of fork call in multithread, since to
    operate on the dynarray the internal lock should be used.  However
    as noted by Florian, it already acquires external locks for malloc
    and libio so it is already hitting some lock contention.  Besides,
    posix_spawn should be faster and more scalable to run external programs
    in multithread environments.

    Checked on x86_64-linux-gnu.

            * nptl/Makefile (routines): Remove unregister-atfork.
            * nptl/register-atfork.c (fork_handler_pool): Remove variable.
            (fork_handler_alloc): Remove function.
            (fork_handlers, fork_handler_init): New variables.
            (__fork_lock): Rename to atfork_lock.
            (__register_atfork, __unregister_atfork, libc_freeres_fn): Rewrite
            to use a dynamic array to add/remove atfork handlers.
            * sysdeps/nptl/fork.c (__libc_fork): Likewise.
            * sysdeps/nptl/fork.h (__fork_lock, __fork_handlers, __linkin_atfork):
            Remove declaration.
            (fork_handler): Remove next, refcntr, and need_signal member.
            (__run_fork_handler_type): New enum.
            (__run_fork_handlers): New prototype.
            * sysdeps/nptl/libc-lockP.h (__libc_atfork): Remove declaration.
fweimer commented 5 years ago

This:

static void init_mutexes(void)
{
    fprintf(stderr, "%d: <<init_mutexes\n", getpid());
    /* make sure to re-initialize mutex if forked */
    pthread_atfork(NULL, NULL, init_mutexes);
    fprintf(stderr, "%d: atfork set\n", getpid());
    fprintf(stderr, "%d: init_mutexes>>\n", getpid());
}

eventually calls pthread_atfork from a fork handler. With the new lock in glibc commit 27761a1042daf01987e7d79636d0c41511c6df3c, that results in a deadlock.

I don't know how common such code is. It has always resulted in a memory leak. If the malloc implementation uses a fork handler itself and acquires locks around fork, that would likely result in a deadlock, too. So I'm not sure if it is worth fixing this in glibc, considering the fork handler is just buggy as written.

Feh commented 5 years ago

I’m happy to change the implementation in whichever way is best suited to avoid a deadlock, but I can’t find good documentation around the precise semantics of these steps.

It seems to me that if a pthread_atfork cannot call pthread_atfork again, then any double-fork would leave mutexes uninitialized (if that’s the thing you use the handler for). In other words, how would you guarantee cleanly initialized mutexes after a fork without this construct?

It has always resulted in a memory leak.

Could you elaborate?

considering the fork handler is just buggy as written.

Could you make a suggestion how to write this in a non-buggy way?

Thank you!

fweimer commented 5 years ago

The handlers are not dropped on fork, so you just keep adding the same handler again on every fork, requiring a new allocation and so on. It would be sufficient to register the handler just once.

Feh commented 5 years ago

Thanks, that’s good to know.

[init_mutexes] eventually calls pthread_atfork from a fork handler. With the new lock in glibc commit 27761a1042daf01987e7d79636d0c41511c6df3c, that results in a deadlock.

However, when I remove all invocations of pthread_atfork, it still results in a deadlock…

aurel32 commented 5 years ago

For me the following patch seems to fix the issue, at least it allows the apt show nocache command to work again. The strace /bin/true command does not deadlock anymore, but goes into an endless rt_sigprogmask loop. That said that matches the behaviour of previous glibc versions.

diff --git a/nocache.c b/nocache.c
index 9562aef..009fa3a 100644
--- a/nocache.c
+++ b/nocache.c
@@ -92,6 +92,10 @@ static void init(void)
     getrlimit(RLIMIT_NOFILE, &rlim);
     max_fds = rlim.rlim_max;
     init_mutexes();
+
+    /* make sure to re-initialize mutex if forked */
+    pthread_atfork(NULL, NULL, init_mutexes);
+
     fds = malloc(max_fds * sizeof(*fds));
     assert(fds != NULL);

@@ -146,8 +150,6 @@ static void init_mutexes(void)
         pthread_mutex_init(&fds_lock[i], NULL);
     }
     pthread_mutex_unlock(&fds_iter_lock);
-    /* make sure to re-initialize mutex if forked */
-    pthread_atfork(NULL, NULL, init_mutexes);
 }

 static void init_debugging(void)
xlazom00 commented 5 years ago

You can reproduce this bug on ubuntu 18.10 for example

Feh commented 5 years ago

@aurel32 thanks, in particular because I thought “endless rt_sigprogmask loop” was the same as the deadlock described before (so even removing the pthread_atfork call entirely didn’t solve the problem). Do you want to submit a pull request?

aurel32 commented 5 years ago

@Feh I think you can commit it like that, at the end I just translated the words from @fweimer into a patch. But if you prefer I can submit a PR.

Feh commented 5 years ago

@aurel32 Turns out I had a commit with essentially that lying around already; pushed this one just now. Other people on this issue: Could you test if this fixes the problem? Thnaks!

xlazom00 commented 5 years ago

I can confirm that my test case nocache rsync /home/someusername/somedirectory2/ /home/someusername/somedirectory2/ is fine now 👍

Feh commented 5 years ago

Thanks – I’ve tagged the current HEAD commit with v1.1 to make this easier for distributions to disambiguate when they want to repackage the newest version. (xref Debian bug)

I’m closing this issue as I believe it is fixed – please reopen if this is not the case.

jdrch commented 4 years ago

I think i'm noticing this too with rsync (used with backintime application)

Still noticing this. However, per here it seems rsync hasn't crashed; it's just working much more slowly. I'm doing a full, ~14 GB / backintime run on my RPi3B+ right now; I'll see how long it takes. If anything, I might have to adjust my backup cadence to weekly instead of daily.

FWIW without nocache my daily backintime run crashes by Pi and forces a reboot.

If the backup takes longer than a day (I'm patient and the Pi runs 25/7/52 anyway) I'll probably switch from backintime to restic.