
Lock ordering issue; deadlock in malloc()/Perl_atfork_lock() #13687

Open p5pRT opened 10 years ago

p5pRT commented 10 years ago

Migrated from rt.perl.org#121490 (status was 'open')

Searchable as RT121490$

p5pRT commented 10 years ago

From prumpf@gmail.com

Created by prumpf@gmail.com

This is a bug report for perl from prumpf@gmail.com, generated with the help of perlbug 1.40 running under perl 5.19.11.

-----------------------------------------------------------------

Hi! I've run into a deadlock situation with the current git versions of perl (5.19.11) and glibc (2.19), on x86_64-pc-linux-gnu with ithreads and MY_MALLOC, though I've run into it with other setups (recent Debian versions of Perl and glibc, no MY_MALLOC) as well. I believe I've been able to track down the issue and come up with a workaround, although I've not yet found the time to come up with a small reproducible test case. Please feel free to ask me for one if it's absolutely required, though, or ask for other information, and I'll do my best.

In summary, the problem is inconsistent lock ordering between Perl's PL_malloc_mutex and the list_lock in glibc's malloc/arena.c. The situation arises when one thread tries to fork() at the same time that another thread calls malloc().

Perl calls pthread_atfork() before the first malloc() makes glibc install its own atfork handlers, so fork() calls ptmalloc_lock_all() first, then Perl_atfork_lock(). That means locking glibc's list_lock first, then PL_malloc_mutex. (The prepare handlers registered with pthread_atfork() run in LIFO order.)

However, Perl's malloc implementation locks PL_malloc_mutex first, then (sometimes) runs out of memory and calls the real malloc(), which tries to lock list_lock. We thus have a race condition and a deadlock, which I've seen in practice.
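The handler ordering described above can be checked in isolation. The following is a minimal sketch, not code from Perl or glibc: `prepare_a` and `prepare_b` are hypothetical handlers standing in for the one registered first (Perl's, in this report) and the one registered later (glibc's malloc handler), and `record_prepare_order` is an illustrative helper name.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Order in which the prepare handlers ran during the last fork(). */
static char order_log[8];
static size_t order_len;

static void log_handler(char tag) {
    if (order_len < sizeof order_log - 1)
        order_log[order_len++] = tag;
}

/* Hypothetical handlers: 'A' is registered first, 'B' second. */
static void prepare_a(void) { log_handler('A'); }
static void prepare_b(void) { log_handler('B'); }

/* Registers both handlers once, forks, and copies the order in which the
 * prepare handlers ran into out. Returns 0 on success, -1 on failure. */
int record_prepare_order(char *out, size_t out_size) {
    static int registered;

    assert(out != NULL && out_size >= 3);
    if (!registered) {
        if (pthread_atfork(prepare_a, NULL, NULL) != 0) return -1;
        if (pthread_atfork(prepare_b, NULL, NULL) != 0) return -1;
        registered = 1;
    }

    order_len = 0;
    pid_t pid = fork();            /* prepare handlers run in the parent here */
    if (pid < 0) return -1;
    if (pid == 0) _exit(0);        /* child: nothing to do */
    if (waitpid(pid, NULL, 0) < 0) return -1;

    memcpy(out, order_log, order_len);
    out[order_len] = '\0';
    return 0;
}
```

POSIX specifies that prepare handlers are called in the reverse order of registration, so the helper reports "BA": the handler registered second runs first. This is the ordering that lets glibc's handler take list_lock before Perl_atfork_lock() runs.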

I believe this is fundamentally a glibc bug: its implementation of pthread_atfork() behaves erratically depending on whether malloc() is first called before or after pthread_atfork(). However, since the broken versions of glibc are out there and multiplying, we should also work around the issue in Perl itself.

The workaround should be as easy as including an extra PerlMem_free(PerlMem_malloc(1024)) call before calling PTHREAD_ATFORK, but gcc has started "optimizing" such (otherwise) useless calls. I've found a deliberately duplicate call to perl_alloc() works, but that's both a one-time memory leak and horribly ugly, and most likely breaks whatever code uses PL_do_undump.
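One way to keep such a warm-up allocation from being optimized away is to route the pointer through a volatile object. This is only a sketch under that assumption: `warm_up_allocator` is a hypothetical name, plain malloc()/free() stand in for PerlMem_malloc()/PerlMem_free(), and it has not been tested against the Perl build described here.

```c
#include <assert.h>
#include <stdlib.h>

/* Touch the system allocator once so that its lazy initialisation has
 * already happened by the time fork handlers are registered.
 * Returns 1 if the warm-up allocation succeeded, 0 otherwise. */
int warm_up_allocator(size_t nbytes) {
    assert(nbytes > 0);

    /* The store to a volatile object is an observable side effect, so the
     * compiler has to produce the pointer value rather than treat the
     * malloc()/free() pair as dead code, as it may for free(malloc(n)). */
    void *volatile probe = malloc(nbytes);
    int ok = (probe != NULL);

    free(probe);
    return ok;
}
```

Unlike the duplicate perl_alloc() call, this does not leak; whether it is robust across compilers and optimization levels would still need checking.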

Nevertheless, I'll include it here, because most of the work was probably in tracking down the bug, and fixing it should be easier, even if I cannot presently think of a good fix.

    diff --git a/ext/ExtUtils-Miniperl/lib/ExtUtils/Miniperl.pm b/ext/ExtUtils-Miniperl/lib/ExtUtils/Miniperl.pm
    index 730c565..a8092bf 100644
    --- a/ext/ExtUtils-Miniperl/lib/ExtUtils/Miniperl.pm
    +++ b/ext/ExtUtils-Miniperl/lib/ExtUtils/Miniperl.pm
    @@ -129,6 +129,19 @@ main(int argc, char **argv, char **env)
          * call PTHREAD_ATFORK() explicitly, but if and only if it hasn't
          * been called at least once before in the current process.
          * --GSAR 2001-07-20 */
    +    /* There's a nasty race condition with the current versions of Perl and
    +     * glibc: the call to PTHREAD_ATFORK in Perl's main() might be reached
    +     * before the first malloc happens, in which
    +     * case fork() locks malloc/arena.c's list_lock first, then tries to lock
    +     * PL_malloc_lock; another thread might have locked PL_malloc_lock first,
    +     * then tries to lock list_lock, resulting in a deadlock.
    +     *
    +     * A proper fix would be in glibc, ensuring that ptmalloc_init() is called
    +     * earlier, but a workaround is to make a malloc call ourselves. */
    +    /* This leaks memory, but works. */
    +    (void)perl_alloc();
    +    /* This doesn't leak memory, but is optimized away by gcc */
    +    PerlMem_free(PerlMem_malloc(1024));
         PTHREAD_ATFORK(Perl_atfork_lock,
                        Perl_atfork_unlock,
                        Perl_atfork_unlock);

Perl Info ``` Flags: category=core severity=medium Site configuration information for perl 5.19.11: Configured by pip at Sat Mar 22 10:40:51 UTC 2014. Summary of my perl5 (revision 5 version 19 subversion 11) configuration: Derived from: b51c3e77dbb7e510319342a73163b3fbb59baf5a Platform: osname=linux, osvers=3.12-1-amd64, archname=x86_64-linux-thread-multi uname='linux philadelphia 3.12-1-amd64 #1 smp debian 3.12.8-1 (2014-01-19) x86_64 gnulinux ' config_args='-er' hint=previous, useposix=true, d_sigaction=define useithreads=define, usemultiplicity=define use64bitint=define, use64bitall=define, uselongdouble=undef usemymalloc=y, bincompat5005=undef Compiler: cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64', optimize='-O2 -g', cppflags='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -DPERL_POISON -D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64' ccversion='', gccversion='4.8.2', gccosandvers='' intsize=4, longsize=8, ptrsize=8, 
doublesize=8, byteorder=12345678 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16 ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8 alignbytes=8, prototype=define Linker and Libraries: ld='cc', ldflags =' -fstack-protector -L/usr/local/lib' libpth=/usr/local/lib /usr/lib/gcc/x86_64-linux-gnu/4.8/include-fixed /usr/include/x86_64-linux-gnu /usr/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib /usr/local/lib /usr/lib/gcc/x86_64-linux-gnu/4.8/include-fixed /usr/include/x86_64-linux-gnu /usr/lib libs=-lnsl -ldl -lm -lcrypt -lutil -lpthread -lc perllibs=-lnsl -ldl -lm -lcrypt -lutil -lpthread -lc libc=libc-2.17.so, so=so, useshrplib=false, libperl=libperl.a gnulibc_version='2.18' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E' cccdlflags='-fPIC', lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector' Locally applied patches: uncommitted-changes @INC for perl 5.19.11: /usr/local/lib/perl5/site_perl/5.19.10/x86_64-linux-thread-multi /usr/local/lib/perl5/site_perl/5.19.10 /usr/local/lib/perl5/5.19.10/x86_64-linux-thread-multi /usr/local/lib/perl5/5.19.10 . Environment for perl 5.19.11: HOME=/home/pip LANG=en_US.UTF-8 LANGUAGE (unset) LC_ALL=en_US.utf8 LC_CTYPE= LD_LIBRARY_PATH (unset) LOGDIR (unset) PATH=/bin:/usr/bin:/sbin:/usr/sbin:/usr/local/bin:/usr/local/sbin PERL_BADLANG (unset) SHELL=/bin/zsh-beta ```
p5pRT commented 10 years ago

From prumpf@gmail.com

perl-deadlock-workaround.diff
p5pRT commented 10 years ago

From @tonycoz

On Sat Mar 22 09:53:21 2014, prumpf@gmail.com wrote:

> Hi! I've run into a deadlock situation with the current git versions of perl (5.19.11) and glibc (2.19), on x86_64-pc-linux-gnu with ithreads and MY_MALLOC, though I've run into it with other setups (recent Debian versions of Perl and glibc, no MY_MALLOC) as well. I believe I've been able to track down the issue and come up with a workaround, although I've not yet found the time to come up with a small reproducible test case. Please feel free to ask me for one if it's absolutely required, though, or ask for other information, and I'll do my best.

Have you reported the glibc part of the problem to your vendor (Debian?)

Since this seems to be a glibc-specific issue, I wonder if there's a glibc-specific way of forcing initialization.

In any case, the workaround would need to be protected by #ifdef __GLIBC__
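A guarded version of the workaround might look like the sketch below. It is illustrative only: `demo_lock` and `demo_unlock` are hypothetical stand-ins for Perl_atfork_lock() and Perl_atfork_unlock(), `register_fork_handlers` is an invented name, and the direct pthread_atfork() call mirrors the argument shape of the PTHREAD_ATFORK invocation in the patch rather than reproducing Perl's macro.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical handlers; real ones would take and release the
 * application's own mutexes. */
static void demo_lock(void)   { }
static void demo_unlock(void) { }

/* Registers the fork handlers, forcing one allocation first when built
 * against glibc. Returns 1 if the glibc-specific warm-up ran, 0 if it was
 * compiled out, and -1 if registration failed. */
int register_fork_handlers(void) {
    int warmed = 0;

#ifdef __GLIBC__
    /* Only glibc is affected according to the report, so other C libraries
     * skip the extra allocation entirely. */
    void *volatile probe = malloc(16);
    free(probe);
    warmed = 1;
#endif

    if (pthread_atfork(demo_lock, demo_unlock, demo_unlock) != 0)
        return -1;
    return warmed;
}
```

The point of the ordering is that the allocation happens before the handlers are registered, so that any handlers the allocator installs on first use are already in place.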

https://bugzilla.redhat.com/show_bug.cgi?id=906468

seems like a different but related issue; unfortunately, his post to the glibc mailing list:

https://sourceware.org/ml/libc-alpha/2013-01/msg01051.html

seems to have been ignored.

Tony

p5pRT commented 10 years ago

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented 10 years ago

From @Leont

On Sat, Mar 22, 2014 at 5:53 PM, Philipp Rumpf <perlbug-followup@perl.org> wrote:

> However, Perl's malloc implementation locks PL_malloc_mutex first, then (sometimes) runs out of memory and calls the real malloc(), which tries to lock list_lock. We thus have a race condition and a deadlock, which I've seen in practice.

This doesn't make sense. Perl's malloc should only use the system's malloc if both USE_PERL_SBRK and PERL_SBRK_VIA_MALLOC are set, which is not that likely. I'm not sure what's going on here exactly.

Leon

p5pRT commented 10 years ago

From @Leont

On Sat, Mar 29, 2014 at 3:46 PM, Philipp Rumpf <prumpf@gmail.com> wrote:

> Here's a much simpler fix/workaround, to metaconfig, that we can use until fixed glibcs start appearing:

Yet I don't think pretending that at_fork is helpful at all. That will only create new deadlocks.

> As you can see in this rather long GDB transcript, the bug is what I described: thread 2 is trying to malloc() with perlio_mutex held, thread 1 is trying to fork, is already holding glibc's malloc mutex, and is waiting on perlio_mutex.

Yes that makes sense. I guess that means your original proposed solution (calling malloc early if necessary) is warranted.

Leon

p5pRT commented 10 years ago

From prumpf@gmail.com

Thanks for your response!

On Wed, Mar 26, 2014 at 6:47 AM, Tony Cook via RT <perlbug-followup@perl.org> wrote:

> Have you reported the glibc part of the problem to your vendor (Debian?)

I confirmed the problem is present in the git version of glibc and reported it there:

https://sourceware.org/bugzilla/show_bug.cgi?id=16742

I'll file a bug against the Debian package if I don't hear from them.

> Since this seems to be a glibc specific issue, I wonder if there's a glibc specific way of forcing initialization.
>
> In any case, the workaround would need to be protected by #ifdef __GLIBC__

How about simply forcing HAS_PTHREAD_ATFORK to undef if __GLIBC__ is defined? That should be a little cleaner than the malloc workaround, at least.

Ideally, there would be a test case to determine at configuration time whether our pthread_atfork() is broken. However, that's a little unpredictable, even with appropriate sleep() statements, since our system might be too busy.

Here's what I've come up with, as a patch against metaconfig:

Inline Patch

```diff
diff --git a/U/threads/d_pthread_atfork.U b/U/threads/d_pthread_atfork.U
index 77a8b43..9f0332a 100644
--- a/U/threads/d_pthread_atfork.U
+++ b/U/threads/d_pthread_atfork.U
@@ -5,7 +5,7 @@
 ?RCS: You may distribute under the terms of either the GNU General Public
 ?RCS: License or the Artistic License, as specified in the README file.
 ?RCS:
-?MAKE:d_pthread_atfork: Inlibc cat Compile usethreads Setvar
+?MAKE:d_pthread_atfork: Inlibc cat Compile usethreads Setvar d_gnulibc
 ?MAKE:	-pick add $@ %<
 ?S:d_pthread_atfork:
 ?S:	This variable conditionally defines the HAS_PTHREAD_ATFORK symbol,
@@ -37,6 +37,12 @@ if eval $compile; then
 else
 	val="$undef"
 fi
+case "$d_gnulibc" in
+*)
+	echo "Assuming pthread_atfork is broken, since this is glibc."
+	val="$undef"
+	;;
+esac
 case "$usethreads" in
 $define)
 	case "$val" in
```

> https://bugzilla.redhat.com/show_bug.cgi?id=906468
>
> seems like a different but related issue; unfortunately, his post to the glibc mailing list:
>
> https://sourceware.org/ml/libc-alpha/2013-01/msg01051.html
>
> seems to have been ignored.

I don't fully understand that report; it sounds like malloc_atfork() shouldn't be performing I/O, but looking at the source it appears not to be. I suspect that the original bug might have involved pthread_atfork handlers running in the wrong order, though; maybe fork() should call _IO_list_lock() before calling ptmalloc_lock_all()?

Anyway, I think that's a different issue, though it's a pity if it hasn't been fixed.

p5pRT commented 10 years ago

From prumpf@gmail.com

metaconfig-broken-pthread_atfork.diff
p5pRT commented 10 years ago

From prumpf@gmail.com

Sorry, I hadn't noticed that PURIFY was still set in my configuration, which does indeed set PERL_SBRK and PERL_SBRK_VIA_MALLOC. My guess is the issue doesn't appear for MYMALLOC && !PURIFY, but it's still valid (and I'm pretty sure "what's going on here" is what I've described) for !MYMALLOC and MYMALLOC && PURIFY.

Hope that helps you make sense of it, and sorry for the confusion.

On Wed, Mar 26, 2014 at 10:51 AM, Leon Timmermans via RT <perlbug-followup@perl.org> wrote:

> This doesn't make sense. Perl's malloc should only use the system's malloc if both USE_PERL_SBRK and PERL_SBRK_VIA_MALLOC are set, which is not that likely. I'm not sure what's going on here exactly.
>
> Leon

p5pRT commented 10 years ago

From prumpf@gmail.com

Hello, I tried responding via the perlbug system, but that appears to be broken. Thank you for your responses so far!

As a reminder, the bug is specific to glibc/nptl-based systems with ithreads, such as x86_64-pc-linux-gnu.

I've reported the issue on the glibc bugzilla after verifying it's not Debian-specific.

Here's a much simpler fix/workaround, to metaconfig, that we can use until fixed glibcs start appearing:


Inline Patch

```diff
diff --git a/U/threads/d_pthread_atfork.U b/U/threads/d_pthread_atfork.U
index 77a8b43..9f0332a 100644
--- a/U/threads/d_pthread_atfork.U
+++ b/U/threads/d_pthread_atfork.U
@@ -5,7 +5,7 @@
 ?RCS: You may distribute under the terms of either the GNU General Public
 ?RCS: License or the Artistic License, as specified in the README file.
 ?RCS:
-?MAKE:d_pthread_atfork: Inlibc cat Compile usethreads Setvar
+?MAKE:d_pthread_atfork: Inlibc cat Compile usethreads Setvar d_gnulibc
 ?MAKE:	-pick add $@ %<
 ?S:d_pthread_atfork:
 ?S:	This variable conditionally defines the HAS_PTHREAD_ATFORK symbol,
@@ -37,6 +37,12 @@ if eval $compile; then
 else
 	val="$undef"
 fi
+case "$d_gnulibc" in
+*)
+	echo "Assuming pthread_atfork is broken, since this is glibc."
+	val="$undef"
+	;;
+esac
 case "$usethreads" in
 $define)
 	case "$val" in
```

And here's a test case for reproducing the bug (Leon was right to point out that without -DPURIFY, which I had set but forgotten about, it's not Perl's malloc that calls the real malloc(), but S_more_refcounted_fds. However, it's the same bug). This program should terminate (and would probably exhaust file descriptors without a breakpoint), but by merely setting the right breakpoint and attempting to continue once it's hit, we can get it to deadlock (after opening a mere 16 file descriptors).


    #!/usr/bin/perl
    # set a breakpoint in S_more_refcounted_fds before running this

    use threads;

    async {
        my @fh;

        for (my $i = 0; ; $i++) {
            open($fh[$i], "</dev/zero");
        }
    };

    sleep(1);
    fork();


To force the deadlock, set a breakpoint in S_more_refcounted_fds, then wait for a while (for the sleep(1) to finish) before continuing after the breakpoint is hit for the second time (the first time will be before the second thread is spawned).
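Why the breakpoint is hit at descriptor 0 and again at descriptor 16 can be pictured with a small model of a reference-count table that grows while its mutex is held. This is a sketch and not Perl's perlio.c: the names are hypothetical, and the growth step of 16 slots is an assumption inferred from the transcript in this report (hits at fd=0 and fd=16, and a realloc() of 128 bytes).

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

static pthread_mutex_t table_mutex = PTHREAD_MUTEX_INITIALIZER;
static int *refcnt;        /* per-descriptor reference counts */
static int  refcnt_size;   /* number of slots currently allocated */
static int  grow_calls;    /* how often the table had to be reallocated */

/* Grow the table in steps of 16 slots (assumed policy) so that fd fits.
 * Must be called with table_mutex held. */
static int grow_table(int fd) {
    int new_size = 16 + (fd & ~15);
    int *bigger = realloc(refcnt, (size_t)new_size * sizeof *bigger);

    if (bigger == NULL) return -1;
    /* Zero only the newly added slots. */
    memset(bigger + refcnt_size, 0,
           (size_t)(new_size - refcnt_size) * sizeof *bigger);
    refcnt = bigger;
    refcnt_size = new_size;
    grow_calls++;
    return 0;
}

/* Increment the count for fd; returns the new count, or -1 on failure.
 * Note that realloc() runs while table_mutex is held. */
int refcnt_inc(int fd) {
    int result = -1;

    assert(fd >= 0);
    pthread_mutex_lock(&table_mutex);
    if (fd < refcnt_size || grow_table(fd) == 0)
        result = ++refcnt[fd];
    pthread_mutex_unlock(&table_mutex);
    return result;
}

int refcnt_grow_calls(void) { return grow_calls; }
```

In this model the first 16 descriptors trigger a single allocation and descriptor 16 triggers the second one, so a thread can sit inside the system allocator with the table mutex held at exactly the moment another thread forks.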

As you can see in this rather long GDB transcript, the bug is what I described: thread 2 is trying to malloc() with perlio_mutex held, thread 1 is trying to fork, is already holding glibc's malloc mutex, and is waiting on perlio_mutex.

Sorry again for the -DPURIFY confusion.

Philipp Rumpf


GDB transcript:

    % gdb --args perl glibc-bug.pl
    GNU gdb (GDB) 7.6.2 (Debian 7.6.2-1)
    Copyright (C) 2013 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law. Type "show copying"
    and "show warranty" for details.
    This GDB was configured as "x86_64-linux-gnu".
    For bug reporting instructions, please see:
    <http://www.gnu.org/software/gdb/bugs/>...
    Reading symbols from /usr/bin/perl...Reading symbols from /usr/lib/debug/usr/bin/perl...done.
    done.
    (gdb) r
    Starting program: /usr/bin/perl glibc-bug.pl
    warning: Could not load shared library symbols for linux-vdso.so.1.
    Do you need "set solib-search-path" or "set sysroot"?
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
    [New Thread 0x7ffff6b3c700 (LWP 19617)]
    Perl exited with active threads:
      1 running and unjoined
      0 finished and unjoined
      0 running and detached
    Perl exited with active threads:
      1 running and unjoined
      0 finished and unjoined
      0 running and detached
    [Thread 0x7ffff6b3c700 (LWP 19617) exited]
    [Inferior 1 (process 19613) exited normally]
    (gdb) b S_more_refcounted_fds
    Breakpoint 1 at 0x7ffff7b83060: file perlio.c, line 2320.
    (gdb) set target-async 1
    (gdb) set non-stop on
    (gdb) r
    Starting program: /usr/bin/perl glibc-bug.pl
    warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7ffff7ffa000
    warning: Could not load shared library symbols for linux-vdso.so.1.
    Do you need "set solib-search-path" or "set sysroot"?
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

    Breakpoint 1, PerlIOUnix_refcnt_inc (fd=0) at perlio.c:2372
    2372 perlio.c: No such file or directory.
    (gdb) shell sleep 5
    (gdb) c
    Continuing.
    [New Thread 0x7ffff6b3c700 (LWP 19621)]

    Breakpoint 1, PerlIOUnix_refcnt_inc (fd=16) at perlio.c:2372
    2372 in perlio.c
    (gdb) shell sleep 5
    (gdb) c
    Continuing.
    Cannot execute this command while the selected thread is running.
    (gdb) i thr
      Id   Target Id         Frame
      2    Thread 0x7ffff6b3c700 (LWP 19621) "perl" PerlIOUnix_refcnt_inc (fd=16)
        at perlio.c:2372
    * 1    Thread 0x7ffff7fd3700 (LWP 19619) "perl" (running)
    (gdb) thr 2
    [Switching to thread 2 (Thread 0x7ffff6b3c700 (LWP 19621))]
    #0  PerlIOUnix_refcnt_inc (fd=16) at perlio.c:2372
    2372 in perlio.c
    (gdb) c
    Continuing.
      C-c C-c^C
    Program received signal SIGINT, Interrupt.
    __lll_lock_wait_private ()
        at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
    95 ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S: No such file or directory.
    (gdb) interrupt -a
    (gdb) [Thread 0x7ffff7fd3700 (LWP 19619)] #1 stopped.
    __lll_lock_wait ()
        at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
    135 ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S: No such file or directory.

    (gdb) thr app all bt

    Thread 2 (Thread 0x7ffff6b3c700 (LWP 19621)):
    #0  __lll_lock_wait_private ()
        at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
    #1  0x00007ffff6ffc527 in _L_lock_10982 () at malloc.c:5154
    #2  0x00007ffff6ffa198 in __GI___libc_realloc (
        oldmem=0x7ffff7321620 <main_arena>, bytes=128) at malloc.c:2975
    #3  0x00007ffff7b83098 in S_more_refcounted_fds (my_perl=0x6bfef0, new_fd=16)
        at perlio.c:2334
    #4  PerlIOUnix_refcnt_inc (fd=16) at perlio.c:2372
    #5  0x00007ffff7b839c4 in PerlIOUnix_setfd (my_perl=0x6bfef0, f=0x6d8710,
        imode=0, fd=<optimized out>) at perlio.c:2655
    #6  PerlIOUnix_open (my_perl=0x6bfef0, self=0x7ffff7ddc820 <PerlIO_unix>,
        layers=0x6d84b0, n=0, mode=0x7ffff6b3ba70 "r", fd=<optimized out>,
        imode=0, perm=438, f=0x6d8710, narg=1, args=0x7ffff6b3ba68)
        at perlio.c:2736
    #7  0x00007ffff7b82c06 in PerlIOBuf_open (my_perl=0x6bfef0,
        self=0x7ffff7ddc660 <PerlIO_perlio>, layers=0x6d84b0, n=1,
        mode=0x7ffff6b3ba70 "r", fd=-1, imode=0, perm=0, f=0x0, narg=1,
        args=0x7ffff6b3ba68) at perlio.c:3862
    #8  0x00007ffff7b84b2b in PerlIO_openn (my_perl=my_perl@entry=0x6bfef0,
        layers=layers@entry=0x0, mode=mode@entry=0x7ffff6b3ba70 "r",
        fd=fd@entry=-1, imode=imode@entry=0, perm=perm@entry=0, f=f@entry=0x0,
        narg=narg@entry=1, args=args@entry=0x7ffff6b3ba68) at perlio.c:1648
    #9  0x00007ffff7b5d83e in Perl_do_openn (my_perl=my_perl@entry=0x6bfef0,
        gv=gv@entry=0x7362f8, oname=0x724830 "</dev/zero", len=<optimized out>,
        as_raw=as_raw@entry=0, rawmode=rawmode@entry=0, rawperm=rawperm@entry=0,
        supplied_fp=supplied_fp@entry=0x0, svp=0x7ffff6b3ba68, num_svs=1,
        num_svs@entry=0) at doio.c:453
    #10 0x00007ffff7b4c36e in Perl_pp_open (my_perl=0x6bfef0) at pp_sys.c:640
    #11 0x00007ffff7b05326 in Perl_runops_standard (my_perl=0x6bfef0) at run.c:42
    #12 0x00007ffff7a96930 in Perl_call_sv (my_perl=my_perl@entry=0x6bfef0,
        sv=0x736058, flags=<optimized out>) at perl.c:2766
    #13 0x00007ffff6b43589 in S_ithread_run (arg=0x630020) at threads.xs:517
    #14 0x00007ffff732f062 in start_thread (arg=0x7ffff6b3c700)
        at pthread_create.c:312
    #15 0x00007ffff7063a3d in clone ()
        at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

    Thread 1 (Thread 0x7ffff7fd3700 (LWP 19619)):
    #0  __lll_lock_wait ()
        at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
    #1  0x00007ffff7331467 in _L_lock_913 ()
        from /lib/x86_64-linux-gnu/libpthread.so.0
    #2  0x00007ffff7331290 in __GI___pthread_mutex_lock (
        mutex=0x7ffff7ddce20 <PL_perlio_mutex>) at ../nptl/pthread_mutex_lock.c:79
    #3  0x00007ffff7ae9c70 in Perl_atfork_lock () at util.c:2811
    #4  0x00007ffff7035122 in __libc_fork ()
        at ../nptl/sysdeps/unix/sysv/linux/x86_64/../fork.c:95
    #5  0x00007ffff7338305 in __fork ()
        at ../nptl/sysdeps/unix/sysv/linux/pt-fork.c:25
    #6  0x00007ffff7ae9d05 in Perl_my_fork () at util.c:2849
    #7  0x00007ffff7b556bc in Perl_pp_fork (my_perl=0x603010) at pp_sys.c:4022
    #8  0x00007ffff7b05326 in Perl_runops_standard (my_perl=0x603010) at run.c:42
    #9  0x00007ffff7a9dce4 in S_run_body (oldscope=1, my_perl=0x603010)
        at perl.c:2467
    #10 perl_run (my_perl=0x603010) at perl.c:2383
    #11 0x0000000000400e19 in main (argc=2, argv=0x7fffffffeaf8,
        env=0x7fffffffeb10) at perlmain.c:114
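The two backtraces above show each thread holding one lock while waiting for the other. The same inversion can be reproduced safely with two ordinary mutexes by using pthread_mutex_trylock(), which reports EBUSY instead of blocking. This is a minimal sketch under stated assumptions: `allocator_lock` and `io_lock` are hypothetical stand-ins for glibc's malloc lock and PL_perlio_mutex, and `lock_orders_deadlock` is an illustrative name; it uses POSIX barriers, which the Linux systems discussed here provide.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t allocator_lock = PTHREAD_MUTEX_INITIALIZER; /* stand-in for glibc's malloc lock */
static pthread_mutex_t io_lock = PTHREAD_MUTEX_INITIALIZER;        /* stand-in for PL_perlio_mutex */
static pthread_barrier_t rendezvous;

struct path {
    pthread_mutex_t *first;   /* lock this path takes first */
    pthread_mutex_t *second;  /* lock it wants next */
    int second_result;        /* result of trying to take the second lock */
};

static void *take_in_order(void *arg) {
    struct path *p = arg;

    pthread_mutex_lock(p->first);
    pthread_barrier_wait(&rendezvous);   /* both threads now hold their first lock */
    p->second_result = pthread_mutex_trylock(p->second);
    pthread_barrier_wait(&rendezvous);   /* both attempts are done before any unlock */
    if (p->second_result == 0)
        pthread_mutex_unlock(p->second);
    pthread_mutex_unlock(p->first);
    return NULL;
}

/* Returns 1 if each path found its second lock already held by the other,
 * which is the state in which blocking lock calls would deadlock. */
int lock_orders_deadlock(void) {
    struct path fork_path = { &allocator_lock, &io_lock, 0 };  /* like thread 1 */
    struct path io_path   = { &io_lock, &allocator_lock, 0 };  /* like thread 2 */
    pthread_t t1, t2;
    int rc;

    rc = pthread_barrier_init(&rendezvous, NULL, 2);
    assert(rc == 0);
    rc = pthread_create(&t1, NULL, take_in_order, &fork_path);
    assert(rc == 0);
    rc = pthread_create(&t2, NULL, take_in_order, &io_path);
    assert(rc == 0);

    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    pthread_barrier_destroy(&rendezvous);

    return fork_path.second_result == EBUSY && io_path.second_result == EBUSY;
}
```

The barrier makes the interleaving deterministic, where the real bug depends on timing (hence the breakpoint and the sleep in the reproduction). Taking both locks in the same order on every path would make the second trylock attempt unnecessary, which is what the proposed early-malloc workaround aims for.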


p5pRT commented 10 years ago

From prumpf@gmail.com

metaconfig-broken-pthread_atfork.diff
p5pRT commented 10 years ago

From prumpf@gmail.com

glibc-bug.pl

p5pRT commented 10 years ago

From @tux

On Sat, 29 Mar 2014 14:46:08 +0000, Philipp Rumpf <prumpf@gmail.com> wrote:

Hello, I tried responding via the perlbug system, but that appears to be broken. Thank you for your responses so far!

As a reminder, the bug is specific to glibc/nptl-based systems with ithreads, such as x86_64-pc-linux-gnu.

I admire the fact that this is a genuine patch to the meta-system, but looking at the scope, I wonder if it would be better located in hints/linux.sh

I've reported the issue on the glibc bugzilla after verifying it's not Debian-specific.

Here's a much simpler fix/workaround, to metaconfig, that we can use until fixed glibcs start appearing:

---------------------------------------
diff --git a/U/threads/d_pthread_atfork.U b/U/threads/d_pthread_atfork.U
index 77a8b43..9f0332a 100644
--- a/U/threads/d_pthread_atfork.U
+++ b/U/threads/d_pthread_atfork.U
@@ -5,7 +5,7 @@
 ?RCS: You may distribute under the terms of either the GNU General Public
 ?RCS: License or the Artistic License, as specified in the README file.
 ?RCS:
-?MAKE:d_pthread_atfork: Inlibc cat Compile usethreads Setvar
+?MAKE:d_pthread_atfork: Inlibc cat Compile usethreads Setvar d_gnulibc
 ?MAKE: -pick add $@ %<
 ?S:d_pthread_atfork:
 ?S: This variable conditionally defines the HAS_PTHREAD_ATFORK symbol,
@@ -37,6 +37,12 @@ if eval $compile; then
 else
     val="$undef"
 fi
+case "$d_gnulibc" in
+*)
+    echo "Assuming pthread_atfork is broken, since this is glibc."
+    val="$undef"
+    ;;
+esac
 case "$usethreads" in
 $define)
     case "$val" in
-------------------------------------------

And here's a test case for reproducing the bug (Leon was right to point out that without -DPURIFY, which I had set but forgotten about, it's not Perl's malloc that calls the real malloc(), but S_more_refcounted_fds. However, it's the same bug). This program should terminate (and would probably exhaust file descriptors without a breakpoint), but by merely setting the right breakpoint and attempting to continue once it's hit, we can get it to deadlock (after opening a mere 16 file descriptors).

------------------------------------------
#!/usr/bin/perl
# set a breakpoint in S_more_refcounted_fds before running this

use threads;

async {
    my @fh;

    for (my $i = 0; ; $i++) {
        open($fh[$i], "</dev/zero");
    }
};

sleep(1);
fork();
--------------------------------------

To force the deadlock, set a breakpoint in S_more_refcounted_fds, then wait for a while (for the sleep(1) to finish) before continuing after the breakpoint is hit for the second time (the first time will be before the second thread is spawned).

As you can see in this rather long GDB transcript, the bug is what I described: thread 2 is trying to malloc() with perlio_mutex held, thread 1 is trying to fork, is already holding glibc's malloc mutex, and is waiting on perlio_mutex.

Sorry again for the -DPURIFY confusion.

Philipp Rumpf

--------------------------------------
GDB transcript:
% gdb --args perl glibc-bug.pl
gdb --args perl glibc-bug.pl
GNU gdb (GDB) 7.6.2 (Debian 7.6.2-1)
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/bin/perl...Reading symbols from /usr/lib/debug/usr/bin/perl...done.
done.
(gdb) r
r
Starting program: /usr/bin/perl glibc-bug.pl
warning: Could not load shared library symbols for linux-vdso.so.1.
Do you need "set solib-search-path" or "set sysroot"?
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff6b3c700 (LWP 19617)]
Perl exited with active threads:
    1 running and unjoined
    0 finished and unjoined
    0 running and detached
Perl exited with active threads:
    1 running and unjoined
    0 finished and unjoined
    0 running and detached
[Thread 0x7ffff6b3c700 (LWP 19617) exited]
[Inferior 1 (process 19613) exited normally]
(gdb) b S_more_refcounted_fds
b S_more_refcounted_fds
Breakpoint 1 at 0x7ffff7b83060: file perlio.c, line 2320.
(gdb) set target-async 1
set target-async 1
(gdb) set non-stop on
set non-stop on
(gdb) r
r
Starting program: /usr/bin/perl glibc-bug.pl
warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7ffff7ffa000
warning: Could not load shared library symbols for linux-vdso.so.1.
Do you need "set solib-search-path" or "set sysroot"?
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Breakpoint 1, PerlIOUnix_refcnt_inc (fd=0) at perlio.c:2372
2372 perlio.c: No such file or directory.
(gdb) shell sleep 5
shell sleep 5
(gdb) c
c
Continuing.
[New Thread 0x7ffff6b3c700 (LWP 19621)]

Breakpoint 1, PerlIOUnix_refcnt_inc (fd=16) at perlio.c:2372
2372 in perlio.c
(gdb) shell sleep 5
shell sleep 5
(gdb) c
c
Continuing.
Cannot execute this command while the selected thread is running.
(gdb) i thr
i thr
  Id   Target Id                                  Frame
  2    Thread 0x7ffff6b3c700 (LWP 19621) "perl"   PerlIOUnix_refcnt_inc (fd=16) at perlio.c:2372
* 1    Thread 0x7ffff7fd3700 (LWP 19619) "perl"   (running)
(gdb) thr 2
thr 2
[Switching to thread 2 (Thread 0x7ffff6b3c700 (LWP 19621))]
#0 PerlIOUnix_refcnt_inc (fd=16) at perlio.c:2372
2372 in perlio.c
(gdb) c
c
Continuing.
C-c C-c^C
Program received signal SIGINT, Interrupt.
__lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
95 ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S: No such file or directory.
(gdb) interrupt -a
interrupt -a
(gdb)
[Thread 0x7ffff7fd3700 (LWP 19619)] #1 stopped.
__lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
135 ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S: No such file or directory.

(gdb) thr app all bt
thr app all bt

Thread 2 (Thread 0x7ffff6b3c700 (LWP 19621)):
#0 __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
#1 0x00007ffff6ffc527 in _L_lock_10982 () at malloc.c:5154
#2 0x00007ffff6ffa198 in __GI___libc_realloc (oldmem=0x7ffff7321620 <main_arena>, bytes=128) at malloc.c:2975
#3 0x00007ffff7b83098 in S_more_refcounted_fds (my_perl=0x6bfef0, new_fd=16) at perlio.c:2334
#4 PerlIOUnix_refcnt_inc (fd=16) at perlio.c:2372
#5 0x00007ffff7b839c4 in PerlIOUnix_setfd (my_perl=0x6bfef0, f=0x6d8710, imode=0, fd=<optimized out>) at perlio.c:2655
#6 PerlIOUnix_open (my_perl=0x6bfef0, self=0x7ffff7ddc820 <PerlIO_unix>, layers=0x6d84b0, n=0, mode=0x7ffff6b3ba70 "r", fd=<optimized out>, imode=0, perm=438, f=0x6d8710, narg=1, args=0x7ffff6b3ba68) at perlio.c:2736
#7 0x00007ffff7b82c06 in PerlIOBuf_open (my_perl=0x6bfef0, self=0x7ffff7ddc660 <PerlIO_perlio>, layers=0x6d84b0, n=1, mode=0x7ffff6b3ba70 "r", fd=-1, imode=0, perm=0, f=0x0, narg=1, args=0x7ffff6b3ba68) at perlio.c:3862
#8 0x00007ffff7b84b2b in PerlIO_openn (my_perl=my_perl@entry=0x6bfef0, layers=layers@entry=0x0, mode=mode@entry=0x7ffff6b3ba70 "r", fd=fd@entry=-1, imode=imode@entry=0, perm=perm@entry=0, f=f@entry=0x0, narg=narg@entry=1, args=args@entry=0x7ffff6b3ba68) at perlio.c:1648
#9 0x00007ffff7b5d83e in Perl_do_openn (my_perl=my_perl@entry=0x6bfef0, gv=gv@entry=0x7362f8, oname=0x724830 "</dev/zero", len=<optimized out>, as_raw=as_raw@entry=0, rawmode=rawmode@entry=0, rawperm=rawperm@entry=0, supplied_fp=supplied_fp@entry=0x0, svp=0x7ffff6b3ba68, num_svs=1, num_svs@entry=0) at doio.c:453
#10 0x00007ffff7b4c36e in Perl_pp_open (my_perl=0x6bfef0) at pp_sys.c:640
#11 0x00007ffff7b05326 in Perl_runops_standard (my_perl=0x6bfef0) at run.c:42
#12 0x00007ffff7a96930 in Perl_call_sv (my_perl=my_perl@entry=0x6bfef0, sv=0x736058, flags=<optimized out>) at perl.c:2766
#13 0x00007ffff6b43589 in S_ithread_run (arg=0x630020) at threads.xs:517
#14 0x00007ffff732f062 in start_thread (arg=0x7ffff6b3c700) at pthread_create.c:312
#15 0x00007ffff7063a3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 1 (Thread 0x7ffff7fd3700 (LWP 19619)):
#0 __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1 0x00007ffff7331467 in _L_lock_913 () from /lib/x86_64-linux-gnu/libpthread.so.0
#2 0x00007ffff7331290 in __GI___pthread_mutex_lock (mutex=0x7ffff7ddce20 <PL_perlio_mutex>) at ../nptl/pthread_mutex_lock.c:79
#3 0x00007ffff7ae9c70 in Perl_atfork_lock () at util.c:2811
#4 0x00007ffff7035122 in __libc_fork () at ../nptl/sysdeps/unix/sysv/linux/x86_64/../fork.c:95
#5 0x00007ffff7338305 in __fork () at ../nptl/sysdeps/unix/sysv/linux/pt-fork.c:25
#6 0x00007ffff7ae9d05 in Perl_my_fork () at util.c:2849
#7 0x00007ffff7b556bc in Perl_pp_fork (my_perl=0x603010) at pp_sys.c:4022
#8 0x00007ffff7b05326 in Perl_runops_standard (my_perl=0x603010) at run.c:42
#9 0x00007ffff7a9dce4 in S_run_body (oldscope=1, my_perl=0x603010) at perl.c:2467
#10 perl_run (my_perl=0x603010) at perl.c:2383
#11 0x0000000000400e19 in main (argc=2, argv=0x7fffffffeaf8, env=0x7fffffffeb10) at perlmain.c:114
----------------------------------------------

--
H.Merijn Brand http://tux.nl Perl Monger http://amsterdam.pm.org/
using perl5.00307 .. 5.19 porting perl5 on HP-UX, AIX, and openSUSE
http://mirrors.develooper.com/hpux/ http://www.test-smoke.org/
http://qa.perl.org http://www.goldmark.org/jeff/stupid-disclaimers/

p5pRT commented 10 years ago

From prumpf@gmail.com

On Mon, Mar 31, 2014 at 6:28 AM, H. Merijn Brand via RT <perlbug-followup@perl.org> wrote:

I admire the fact that this is a genuine patch to the meta-system, but looking at the scope, I wonder if it would be better located in hints/linux.sh

I don't know. The build system is a bit of a mystery to me (I'm not sure, but I think the first patch was broken in the non-glibc case).

There are four options here: the test can go in metaconfig or in the hints file, and it can either check version numbers or run a test program. Testing by version numbers seems to be discouraged, and while I have a test program, the only easy way to tell whether it deadlocked is to wait for a timeout. I'm paranoid about that approach reporting false failures on very busy systems with fixed glibcs. In the failure case, it also delays the build while it waits for the timeout; I chose two seconds, but we could probably get away with one second.

I'd argue that the code with the test program might well go into metaconfig: pthread_atfork() is broken for all users, not just Perl. The test isn't specific to glibc or Linux; it should work on all POSIX systems, and if it fails on a non-glibc system we definitely don't want to use pthread_atfork() there.

So I've attached the two test-program-based versions, as patches to metaconfig and perl. Either one appears to work, and installing both also appears to work.

Philipp


p5pRT commented 10 years ago

From prumpf@gmail.com

perl-hints-002.diff

```diff
diff --git a/hints/linux.sh b/hints/linux.sh
index 956adfc..d1e2737 100644
--- a/hints/linux.sh
+++ b/hints/linux.sh
@@ -516,3 +516,104 @@ case "$libdb_needs_pthread" in
     libswanted="$libswanted pthread"
     ;;
 esac
+
+cat >try.c <<'EOM'
+#include
+#include
+#include
+#include
+#include
+
+pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
+
+int pipe_fd[4];
+
+void lock(void)
+{
+    if (write(pipe_fd[3], "\n", 1) <= 0) {
+        _exit(1);
+    }
+    pthread_mutex_lock(&mutex);
+}
+
+void *lock_then_malloc(void *dummy)
+{
+    char c;
+
+    pthread_mutex_lock(&mutex);
+    if (write(pipe_fd[1], "\n", 1) <= 0) {
+        _exit(1);
+    }
+
+    if (read(pipe_fd[2], &c, 1) <= 0) {
+        _exit(1);
+    }
+    volatile void *throwaway = malloc(1024);
+    pthread_mutex_unlock(&mutex);
+
+    return NULL;
+}
+
+void alarm_handler(int dummy)
+{
+    _exit(1);
+}
+
+struct sigaction sa;
+
+int main(int argc, char **argv)
+{
+    pthread_attr_t attr;
+    pthread_t tid;
+
+    if (pthread_atfork(lock, NULL, NULL)) {
+        return 1;
+    }
+    volatile void *throwaway = malloc(1024);
+
+    if (pipe(pipe_fd)) {
+        return 1;
+    }
+
+    if (pipe(pipe_fd+2)) {
+        return 1;
+    }
+
+    if (pthread_attr_init(&attr)) {
+        return 1;
+    }
+    if (pthread_create(&tid, &attr, lock_then_malloc, NULL)) {
+        return 1;
+    }
+
+    char c;
+    if (read(pipe_fd[0], &c, 1) <= 0) {
+        return 1;
+    }
+
+    sa.sa_handler = alarm_handler;
+    sigemptyset(&sa.sa_mask);
+    if (sigaction(SIGALRM, &sa, NULL)) {
+        return 1;
+    }
+    alarm(2);
+
+    if (fork() < 0)
+        return 1;
+
+    return 0;
+}
+EOM
+
+if ${cc:-gcc} $ccflags $ldflags try.c -lpthread >/dev/null 2>&1 && $run ./a.out; then
+    cat <<'EOM' >&4
+
+You appear to have a working pthread_atfork().
+EOM
+else
+    cat <<'EOM' >&4
+
+Your pthread_atfork() might be broken, not using it.
+EOM
+    d_pthread_atfork='undef'
+fi
```
p5pRT commented 10 years ago

From prumpf@gmail.com

metaconfig-pthread-002.diff

```diff
diff --git a/U/threads/d_pthread_atfork.U b/U/threads/d_pthread_atfork.U
index 77a8b43..2a84eac 100644
--- a/U/threads/d_pthread_atfork.U
+++ b/U/threads/d_pthread_atfork.U
@@ -1,11 +1,13 @@
 ?RCS: $Id$
 ?RCS:
 ?RCS: Copyright (c) 2001 Jarkko Hietaniemi
+?RCS: Parts taken from d_pthreadj.U, which is:
+?RCS: Copyright (c) 1998 Andy Dougherty
 ?RCS:
 ?RCS: You may distribute under the terms of either the GNU General Public
 ?RCS: License or the Artistic License, as specified in the README file.
 ?RCS:
-?MAKE:d_pthread_atfork: Inlibc cat Compile usethreads Setvar
+?MAKE:d_pthread_atfork: Inlibc cat Compile Setvar run rm
 ?MAKE: -pick add $@ %<
 ?S:d_pthread_atfork:
 ?S: This variable conditionally defines the HAS_PTHREAD_ATFORK symbol,
@@ -19,30 +21,112 @@
 ?H:#$d_pthread_atfork HAS_PTHREAD_ATFORK /**/
 ?H:.
 ?LINT:set d_pthread_atfork
-: see whether the pthread_atfork exists
-$cat >try.c <
+?T:yyy
+?F:!try
+: see whether pthread_atfork exists and works
+echo "Checking whether pthread_atfork is usable..." >&4
+$cat >try.c <<'EOP'
 #include
-int main() {
-#ifdef PTHREAD_ATFORK
-    pthread_atfork(NULL,NULL,NULL);
-#endif
+#include
+#include
+#include
+#include
+
+pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
+
+int pipe_fd[4];
+
+void lock(void)
+{
+    if (write(pipe_fd[3], "\n", 1) <= 0) {
+        _exit(1);
+    }
+    pthread_mutex_lock(&mutex);
+}
+
+void *lock_then_malloc(void *dummy)
+{
+    char c;
+
+    pthread_mutex_lock(&mutex);
+    if (write(pipe_fd[1], "\n", 1) <= 0) {
+        _exit(1);
+    }
+
+    if (read(pipe_fd[2], &c, 1) <= 0) {
+        _exit(1);
+    }
+    volatile void *throwaway = malloc(1024);
+    pthread_mutex_unlock(&mutex);
+
+    return NULL;
+}
+
+void alarm_handler(int dummy)
+{
+    _exit(1);
+}
+
+struct sigaction sa;
+
+int main(int argc, char **argv)
+{
+    pthread_attr_t attr;
+    pthread_t tid;
+
+    if (pthread_atfork(lock, NULL, NULL)) {
+        return 1;
+    }
+    volatile void *throwaway = malloc(1024);
+
+    if (pipe(pipe_fd)) {
+        return 1;
+    }
+
+    if (pipe(pipe_fd+2)) {
+        return 1;
+    }
+
+    if (pthread_attr_init(&attr)) {
+        return 1;
+    }
+    if (pthread_create(&tid, &attr, lock_then_malloc, NULL)) {
+        return 1;
+    }
+
+    char c;
+    if (read(pipe_fd[0], &c, 1) <= 0) {
+        return 1;
+    }
+
+    sa.sa_handler = alarm_handler;
+    sigemptyset(&sa.sa_mask);
+    if (sigaction(SIGALRM, &sa, NULL)) {
+        return 1;
+    }
+    alarm(2);
+
+    int ret = fork();
+    if (ret < 0)
+        return 1;
+
+    if (ret == 0)
+        printf("success\n");
+    return 0;
 }
 EOP
-: see if pthread_atfork exists
-set try -DPTHREAD_ATFORK
+: see if pthread_atfork exists and works
+set try
 if eval $compile; then
-    val="$define"
+    yyy=`$run ./try`
 else
     val="$undef"
 fi
-case "$usethreads" in
-$define)
-    case "$val" in
-    $define) echo 'pthread_atfork found.' >&4 ;;
-    *) echo 'pthread_atfork NOT found.' >&4 ;;
-    esac
+$rm -f try try.*
+case "$yyy" in
+    success) echo "It does work." >&4; val="$define" ;;
+    *) echo "Doesn't work." >&4; val="$undef" ;;
 esac
 set d_pthread_atfork
 eval $setvar
```
p5pRT commented 10 years ago

From @Leont

On Tue, Apr 1, 2014 at 5:12 PM, Philipp Rumpf <prumpf@gmail.com> wrote:


So I've attached the two test-program-based versions, as patches to metaconfig and perl. Either one appears to work, and installing both also appears to work.

But you have now introduced exactly the deadlock that the use of pthread_atfork was supposed to fix: if thread 1 forks while thread 2 holds a perl mutex, the new process will deadlock as soon as it tries to acquire that mutex.

This is not a solution in any way.

Leon

p5pRT commented 10 years ago

From prumpf@gmail.com

If HAS_PTHREAD_ATFORK is undefined, Perl_my_fork() calls the same handlers that would otherwise have been installed by pthread_atfork(). So the deadlock you describe would only happen if someone called fork() (the C function, not the Perl function) directly, rather than going through Perl_my_fork(). Is that the case you're worrying about?

If the malloc hack is considered the better workaround, we can do that, of course.


p5pRT commented 10 years ago

From prumpf@gmail.com

If it's possible to add a configuration variable in hints/linux.sh, I haven't figured out how. So here's the version that changes metaconfig, but uses the malloc() hack. Applications that embed Perl are likely to copy-and-paste the code that calls PTHREAD_ATFORK, so I've exported the Perl_atfork_fix symbol; they're very likely not to need the workaround, anyway.


p5pRT commented 10 years ago

From prumpf@gmail.com

perl-deadlock-workaround-004.diff

```diff
diff --git a/cpan/Devel-PPPort/parts/embed.fnc b/cpan/Devel-PPPort/parts/embed.fnc
index e076893..b8ba5c0 100644
--- a/cpan/Devel-PPPort/parts/embed.fnc
+++ b/cpan/Devel-PPPort/parts/embed.fnc
@@ -877,6 +877,7 @@ Apr	|void	|my_exit	|U32 status
 Apr	|void	|my_failure_exit
 Ap	|I32	|my_fflush_all
 Anp	|Pid_t	|my_fork
+np	|void	|atfork_fix
 Anp	|void	|atfork_lock
 Anp	|void	|atfork_unlock
 Apmb	|I32	|my_lstat
diff --git a/embed.fnc b/embed.fnc
index 567e587..16615e8 100644
--- a/embed.fnc
+++ b/embed.fnc
@@ -898,6 +898,7 @@ Apr	|void	|my_exit	|U32 status
 Apr	|void	|my_failure_exit
 Ap	|I32	|my_fflush_all
 Anp	|Pid_t	|my_fork
+np	|void	|atfork_fix
 Anp	|void	|atfork_lock
 Anp	|void	|atfork_unlock
 Apmb	|I32	|my_lstat
diff --git a/embed.h b/embed.h
index 0ddaca7..18c02f1 100644
--- a/embed.h
+++ b/embed.h
@@ -1027,6 +1027,7 @@
 #define allocmy(a,b,c)		Perl_allocmy(aTHX_ a,b,c)
 #define amagic_is_enabled(a)	Perl_amagic_is_enabled(aTHX_ a)
 #define apply(a,b,c)		Perl_apply(aTHX_ a,b,c)
+#define atfork_fix		Perl_atfork_fix
 #define av_extend_guts(a,b,c,d,e)	Perl_av_extend_guts(aTHX_ a,b,c,d,e)
 #define bind_match(a,b,c)	Perl_bind_match(aTHX_ a,b,c)
 #define block_end(a,b)		Perl_block_end(aTHX_ a,b)
diff --git a/ext/ExtUtils-Miniperl/lib/ExtUtils/Miniperl.pm b/ext/ExtUtils-Miniperl/lib/ExtUtils/Miniperl.pm
index 730c565..b486e20 100644
--- a/ext/ExtUtils-Miniperl/lib/ExtUtils/Miniperl.pm
+++ b/ext/ExtUtils-Miniperl/lib/ExtUtils/Miniperl.pm
@@ -129,6 +129,9 @@ main(int argc, char **argv, char **env)
      * call PTHREAD_ATFORK() explicitly, but if and only if it hasn't
      * been called at least once before in the current process.
      * --GSAR 2001-07-20 */
+#ifdef USE_PTHREAD_ATFORK_MALLOC_HACK
+    Perl_atfork_fix();
+#endif
     PTHREAD_ATFORK(Perl_atfork_lock,
                    Perl_atfork_unlock,
                    Perl_atfork_unlock);
diff --git a/proto.h b/proto.h
index dd5edde..bbba40a 100644
--- a/proto.h
+++ b/proto.h
@@ -140,6 +140,7 @@ PERL_CALLCONV void	Perl_apply_attrs_string(pTHX_ const char *stashpv, CV *cv, co
 #define PERL_ARGS_ASSERT_APPLY_ATTRS_STRING	\
 	assert(stashpv); assert(cv); assert(attrstr)
 
+PERL_CALLCONV void	Perl_atfork_fix(void);
 PERL_CALLCONV void	Perl_atfork_lock(void);
 PERL_CALLCONV void	Perl_atfork_unlock(void);
 PERL_CALLCONV SV**	Perl_av_arylen_p(pTHX_ AV *av)
diff --git a/util.c b/util.c
index a5451c1..df26259 100644
--- a/util.c
+++ b/util.c
@@ -2569,6 +2569,19 @@ Perl_my_popen(pTHX_ const char *cmd, const char *mode)
 
 #endif /* !DOSISH */
 
+#ifdef USE_PTHREAD_ATFORK_MALLOC_HACK
+/* needs to be global so GCC doesn't optimize away the malloc() */
+void *pthread_atfork_fix_pointer;
+
+void Perl_atfork_fix(void)
+{
+    /* To avoid a deadlock situation, glibc's malloc must be initialized
+     * before we call pthread_atfork. We can't just use (void)malloc(0)
+     * because GCC removes such calls. */
+    pthread_atfork_fix_pointer = malloc(0);
+}
+#endif
+
 /* this is called in parent before the fork() */
 void
 Perl_atfork_lock(void)
```
p5pRT commented 10 years ago

From prumpf@gmail.com

metaconfig-pthread-004.diff

```diff
diff --git a/U/threads/d_pthread_atfork.U b/U/threads/d_pthread_atfork.U
index 77a8b43..2cc9eca 100644
--- a/U/threads/d_pthread_atfork.U
+++ b/U/threads/d_pthread_atfork.U
@@ -1,24 +1,38 @@
 ?RCS: $Id$
 ?RCS:
 ?RCS: Copyright (c) 2001 Jarkko Hietaniemi
+?RCS: Parts taken from d_pthreadj.U, which is:
+?RCS: Copyright (c) 1998 Andy Dougherty
 ?RCS:
 ?RCS: You may distribute under the terms of either the GNU General Public
 ?RCS: License or the Artistic License, as specified in the README file.
 ?RCS:
-?MAKE:d_pthread_atfork: Inlibc cat Compile usethreads Setvar
+?MAKE:d_pthread_atfork d_pthread_atfork_malloc_hack: Inlibc cat Compile Setvar run rm usethreads
 ?MAKE:	-pick add $@ %<
 ?S:d_pthread_atfork:
 ?S:	This variable conditionally defines the HAS_PTHREAD_ATFORK symbol,
 ?S:	which indicates to the C program that the pthread_atfork()
 ?S:	routine is available.
 ?S:.
+?S:d_pthread_atfork_malloc_hack:
+?S:	This variable conditionally defines the USE_PTHREAD_ATFORK_MALLOC_HACK
+?S:	symbol, which indicates to the C program that malloc() needs to be
+?S:	called before pthread_atfork() is.
+?S:.
 ?C:HAS_PTHREAD_ATFORK:
 ?C:	This symbol, if defined, indicates that the pthread_atfork routine
 ?C:	is available to setup fork handlers.
 ?C:.
+?C:USE_PTHREAD_ATFORK_MALLOC_HACK:
+?C:	This symbol, if defined, indicates that pthread_atfork is broken
+?C:	unless malloc is called before it.
+?C:.
 ?H:#$d_pthread_atfork HAS_PTHREAD_ATFORK /**/
+?H:#$d_pthread_atfork_malloc_hack USE_PTHREAD_ATFORK_MALLOC_HACK /**/
 ?H:.
-?LINT:set d_pthread_atfork
+?LINT:set d_pthread_atfork d_pthread_atfork_malloc_hack
+?T:yyy
+?F:!try
 : see whether the pthread_atfork exists
 $cat >try.c <
@@ -47,3 +61,113 @@
 esac
 set d_pthread_atfork
 eval $setvar
+: see whether pthread_atfork exists and works
+echo "Checking whether pthread_atfork requires a workaround..." >&4
+$cat >try.c <<'EOP'
+#include <pthread.h>
+#include <signal.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+
+pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
+
+int pipe_fd[4];
+
+void lock(void)
+{
+    if (write(pipe_fd[3], "\n", 1) <= 0) {
+        _exit(1);
+    }
+    pthread_mutex_lock(&mutex);
+}
+
+/* This needs to be a global variable, or GCC gets clever on us
+ * and throws out the malloc() call. */
+volatile void *throwaway;
+
+void *lock_then_malloc(void *dummy)
+{
+    char c;
+
+    pthread_mutex_lock(&mutex);
+    if (write(pipe_fd[1], "\n", 1) <= 0) {
+        _exit(1);
+    }
+
+    if (read(pipe_fd[2], &c, 1) <= 0) {
+        _exit(1);
+    }
+    throwaway = malloc(1024);
+    pthread_mutex_unlock(&mutex);
+
+    return NULL;
+}
+
+void alarm_handler(int dummy)
+{
+    _exit(1);
+}
+
+struct sigaction sa;
+
+int main(int argc, char **argv)
+{
+    pthread_attr_t attr;
+    pthread_t tid;
+    char c;
+    int ret;
+
+    if (pthread_atfork(lock, NULL, NULL)) {
+        return 1;
+    }
+
+    if (pipe(pipe_fd)) {
+        return 1;
+    }
+
+    if (pipe(pipe_fd+2)) {
+        return 1;
+    }
+
+    if (pthread_attr_init(&attr)) {
+        return 1;
+    }
+    if (pthread_create(&tid, &attr, lock_then_malloc, NULL)) {
+        return 1;
+    }
+
+    if (read(pipe_fd[0], &c, 1) <= 0) {
+        return 1;
+    }
+
+    sa.sa_handler = alarm_handler;
+    sigemptyset(&sa.sa_mask);
+    if (sigaction(SIGALRM, &sa, NULL)) {
+        return 1;
+    }
+    alarm(2);
+
+    ret = fork();
+    if (ret < 0)
+        return 1;
+
+    if (ret == 0)
+        printf("success\n");
+    return 0;
+}
+EOP
+
+: see if pthread_atfork exists and works
+set try
+if eval $compile; then
+    yyy=`$run ./try`
+fi
+$rm -f try try.*
+case "$yyy" in
+    success) echo "It does work without a workaround." >&4; val="$undef" ;;
+    *) echo "Workaround required." >&4; val="$define" ;;
+esac
+set d_pthread_atfork_malloc_hack
+eval $setvar
+
```