Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/
Other
1.93k stars 552 forks source link

malloc deadlock in unsafe signal handler #15014

Open p5pRT opened 8 years ago

p5pRT commented 8 years ago

Migrated from rt.perl.org#126474 (status was 'open')

Searchable as RT126474$

p5pRT commented 8 years ago

From kazuhooku@gmail.com

Created by kazuhooku@gmail.com

This is a bug report for perl from kazuhooku@​gmail.com\, generated with the help of perlbug 1.40 running under perl 5.23.5.

----------------------------------------------------------------- Unsafe signal handler calls malloc (via newSVsv)\, therefore causes deadlock when signal is received while malloc is running.

When I run the following script\, and repeatedly send SIGUSR1 to the script (e.g. while 1 ; do kill -USR1 proc-id ; sleep 1; done)\, then the script enters a dead lock within the signal handler.

  use strict;   use warnings;   use POSIX qw(SIGUSR1);

  warn "pid​:$$\n";

  POSIX​::sigaction(SIGUSR1\, POSIX​::SigAction->new(sub {}))   or die "POSIX​::sigaction failed​:$!";

  while (1) {   my $a = [];   for my $i (1..1000000) {   push @​$a\, "​:$i";   }   print "hello\n";   }

Looking at gdb backtrace\, it is likely that the unsafe signal handler (triggered while malloc is running) is calling malloc (which is not a reentrant function).

  (gdb) bt   #0 __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S​:95   #1 0x00007f51774bfdca in _L_lock_12779 () at malloc.c​:5206   #2 0x00007f51774bd7a5 in __GI___libc_malloc (bytes=10) at malloc.c​:2887   #3 0x000000000048f775 in Perl_safesysmalloc ()   #4 0x00000000004bd230 in Perl_sv_grow ()   #5 0x00000000004bc1b8 in Perl_sv_setsv_flags ()   #6 0x00000000004c5f13 in Perl_newSVsv ()   #7 0x000000000049b383 in Perl_sighandler ()   #8 \   #9 malloc_consolidate (av=av@​entry=0x7f51777f9760 \<main_arena>) at malloc.c​:4173   #10 0x00007f51774ba56d in _int_free (av=0x7f51777f9760 \<main_arena>\, p=\\, have_lock=0) at malloc.c​:4057   #11 0x00000000004b5881 in Perl_sv_clear ()   #12 0x00000000004b5929 in Perl_sv_free2 ()   #13 0x00000000004dbb8c in Perl_leave_scope ()   #14 0x00000000004aa545 in Perl_pp_unstack ()   #15 0x00000000004a9b83 in Perl_runops_standard ()   #16 0x00000000004412d0 in perl_run ()   #17 0x000000000041fff5 in main ()

I believe this regression was introduced in 5.17.2 when #45173 (Arriving signals no longer clear $@​) got fixed.

Perl Info ``` Flags: category=core severity=low Site configuration information for perl 5.23.5: Configured by kazuho at Wed Oct 28 09:17:29 JST 2015. Summary of my perl5 (revision 5 version 23 subversion 5) configuration: Commit id: 6504068eb895d4fb6161ddbb0249e59c19afa707 Platform: osname=linux, osvers=3.13.0-65-generic, archname=x86_64-linux uname='linux ubuntu1404 3.13.0-65-generic #106-ubuntu smp fri oct 2 22:08:27 utc 2015 x86_64 x86_64 x86_64 gnulinux ' config_args='' hint=previous, useposix=true, d_sigaction=define useithreads=undef, usemultiplicity=undef use64bitint=define, use64bitall=define, uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc='cc', ccflags ='-fwrapv -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64', optimize='-O2', cppflags='-fwrapv -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include' ccversion='', gccversion='4.8.2', gccosandvers='' intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678, doublekind=3 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16, longdblkind=3 ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8 alignbytes=8, prototype=define Linker and Libraries: ld='cc', ldflags =' -fstack-protector -L/usr/local/lib' libpth=/usr/local/lib /usr/lib/gcc/x86_64-linux-gnu/4.8/include-fixed /usr/include/x86_64-linux-gnu /usr/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib /usr/local/lib /usr/lib/gcc/x86_64-linux-gnu/4.8/include-fixed /usr/include/x86_64-linux-gnu /usr/lib libs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc libc=libc-2.19.so, so=so, useshrplib=false, libperl=libperl.a gnulibc_version='2.19' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E' cccdlflags='-fPIC', lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector' @INC for perl 5.23.5: /usr/local/perl-bleed/lib/site_perl/5.23.5/x86_64-linux /usr/local/perl-bleed/lib/site_perl/5.23.5 /usr/local/perl-bleed/lib/5.23.5/x86_64-linux /usr/local/perl-bleed/lib/5.23.5 . Environment for perl 5.23.5: HOME=/home/kazuho LANG=en_US.UTF-8 LANGUAGE=en_US:en LD_LIBRARY_PATH (unset) LOGDIR (unset) PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games PERL_BADLANG (unset) SHELL=/bin/bash ```
p5pRT commented 8 years ago

From @tonycoz

On Wed Oct 28 03​:44​:27 2015\, kazuhooku@​gmail.com wrote​:

Unsafe signal handler calls malloc (via newSVsv)\, therefore causes deadlock when signal is received while malloc is running.

When I run the following script\, and repeatedly send SIGUSR1 to the script (e.g. while 1 ; do kill -USR1 proc-id ; sleep 1; done)\, then the script enters a dead lock within the signal handler.

I don't think this is a bug - it's the reason safe signal handlers were introduced.

Tony

p5pRT commented 8 years ago

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented 8 years ago

From kazuhooku@gmail.com

Thank you for your response.

I disagree. It is true that the use of unsafe signal is discouraged\, but people have written code relying on unsafe signal handlers and IMO they should continue to work.

My understanding is that the introduction of safe signals in Perl 5.8 was to protect people from writing corrupt signal handlers (causing deadlocks or segfaults). But even after that\, Perl has provided freedom to use unsafe signal handlers at one's own risk\, as stated in perldoc perlipc\, and cautiously-coded signal handlers (call only async-signal-safe C functions\, or do nothing at all) have continued to work.

However (as I believe) in Perl 5.17\, the implementation of the signal handler in the perl core has been altered to _always_ call malloc (via newSVsv).

In other words\, starting from 5.17\, it has become _impossible_ to write an unsafe signal handler without fear of deadlock. And that is what I see as a regression.

2015-10-29 7​:34 GMT+09​:00 Tony Cook via RT \perlbug\-followup@&#8203;perl\.org​:

On Wed Oct 28 03​:44​:27 2015\, kazuhooku@​gmail.com wrote​:

Unsafe signal handler calls malloc (via newSVsv)\, therefore causes deadlock when signal is received while malloc is running.

When I run the following script\, and repeatedly send SIGUSR1 to the script (e.g. while 1 ; do kill -USR1 proc-id ; sleep 1; done)\, then the script enters a dead lock within the signal handler.

I don't think this is a bug - it's the reason safe signal handlers were introduced.

Tony

-- Kazuho Oku

p5pRT commented 8 years ago

From @Leont

On Thu\, Oct 29\, 2015 at 12​:50 AM\, Kazuho Oku \kazuhooku@&#8203;gmail\.com wrote​:

I disagree. It is true that the use of unsafe signal is discouraged\, but people have written code relying on unsafe signal handlers and IMO they should continue to work.

There is a very limited set of circumstances in which they can be relied upon to work. It assumes the «*+ptr = foo» is atomic and requires the value\, mark and the save stacks all to be pre-allocated (not impossible\, not even unlikely\, but still far from a guarantee)\, as well as a signal handler doing just about nothing that may malloc.

Signal handlers in POSIX have this concept of an alternative stack (mainly useful in case of a stack overflow)\, it may be useful here too.

My understanding is that the introduction of safe signals in Perl 5.8 was to protect people from writing corrupt signal handlers (causing deadlocks or segfaults). But even after that\, Perl has provided freedom to use unsafe signal handlers at one's own risk\, as stated in perldoc perlipc\, and cautiously-coded signal handlers (call only async-signal-safe C functions\, or do nothing at all) have continued to work.

I have a hard time imagining async-signal-safe perl code\, given that stack manipulation isn't guaranteed to be (even if statistically the odds of a stack reallocation are low).

However (as I believe) in Perl 5.17\, the implementation of the signal handler in the perl core has been altered to _always_ call malloc (via newSVsv).

In other words\, starting from 5.17\, it has become _impossible_ to write an unsafe signal handler without fear of deadlock. And that is what I see as a regression.

I think you're referring to 100c03aa. Possibly an alternative $@​ is the solution here too.

Leon

p5pRT commented 8 years ago

From kazuhooku@gmail.com

Thank you for your response.

2015-10-29 18​:30 GMT+09​:00 Leon Timmermans via RT \perlbug\-followup@&#8203;perl\.org​:

On Thu\, Oct 29\, 2015 at 12​:50 AM\, Kazuho Oku \kazuhooku@&#8203;gmail\.com wrote​:

I disagree. It is true that the use of unsafe signal is discouraged\, but people have written code relying on unsafe signal handlers and IMO they should continue to work.

There is a very limited set of circumstances in which they can be relied upon to work. It assumes the «*+ptr = foo» is atomic and requires the value\, mark and the save stacks all to be pre-allocated (not impossible\, not even unlikely\, but still far from a guarantee)\, as well as a signal handler doing just about nothing that may malloc.

Thank you for the clarification.

Signal handlers in POSIX have this concept of an alternative stack (mainly useful in case of a stack overflow)\, it may be useful here too.

Agreed.

My understanding is that the introduction of safe signals in Perl 5.8 was to protect people from writing corrupt signal handlers (causing deadlocks or segfaults). But even after that\, Perl has provided freedom to use unsafe signal handlers at one's own risk\, as stated in perldoc perlipc\, and cautiously-coded signal handlers (call only async-signal-safe C functions\, or do nothing at all) have continued to work.

I have a hard time imagining async-signal-safe perl code\, given that stack manipulation isn't guaranteed to be (even if statistically the odds of a stack reallocation are low).

For example\, you can call POSIX​::write in the unsafe signal handler to wake up `select` reliably. The code will look like​:

  pipe(my $rfh\, $wfh);   my $wfd = fileno $wfh;   POSIX​::sigaction(SIGTERM\, POSIX​::SigAction->new(sub {   POSIX​::write $wfd\, "1"\, 1;   }));

  my $rbits = '';   vec($rbits\, fileno $rfh\, 1) = 1;   ...   select($rbits\, ...);

This is essentially same as using signaled provided by Linux\, but is more portable.

However (as I believe) in Perl 5.17\, the implementation of the signal handler in the perl core has been altered to _always_ call malloc (via newSVsv).

In other words\, starting from 5.17\, it has become _impossible_ to write an unsafe signal handler without fear of deadlock. And that is what I see as a regression.

I think you're referring to 100c03aa. Possibly an alternative $@​ is the solution here too.

Agreed.

And the fact is that the odds are far more high for malloc to deadlock (than to crash due to other race conditions such as stack relocation).

Leon

-- Kazuho Oku

p5pRT commented 8 years ago

From @Leont

On Fri\, Oct 30\, 2015 at 3​:26 AM\, Kazuho Oku \kazuhooku@&#8203;gmail\.com wrote​:

My understanding is that the introduction of safe signals in Perl 5.8 was to protect people from writing corrupt signal handlers (causing deadlocks or segfaults). But even after that\, Perl has provided freedom to use unsafe signal handlers at one's own risk\, as stated in perldoc perlipc\, and cautiously-coded signal handlers (call only async-signal-safe C functions\, or do nothing at all) have continued to work.

I have a hard time imagining async-signal-safe perl code\, given that stack manipulation isn't guaranteed to be (even if statistically the odds of a stack reallocation are low).

For example\, you can call POSIX​::write in the unsafe signal handler to wake up `select` reliably. The code will look like​:

pipe\(my $rfh\, $wfh\);
my $wfd = fileno $wfh;
POSIX&#8203;::sigaction\(SIGTERM\, POSIX&#8203;::SigAction\->new\(sub \{
    POSIX&#8203;::write $wfd\, "1"\, 1;
\}\)\);

my $rbits = '';
vec\($rbits\, fileno $rfh\, 1\) = 1;
\.\.\.
select\($rbits\, \.\.\.\);

This is essentially same as using signaled provided by Linux\, but is more portable.

A small amount of XS would be much more reliable IMO. You may want to check out my Signal-Pipe distribution at https://github.com/Leont/signal-pipe (not on CPAN yet for lack of documentation and tests\, patches welcome).

Leon

p5pRT commented 8 years ago

From @tonycoz

On Thu Oct 29 19​:27​:04 2015\, kazuhooku@​gmail.com wrote​:

For example\, you can call POSIX​::write in the unsafe signal handler to wake up `select` reliably. The code will look like​:

pipe(my $rfh\, $wfh); my $wfd = fileno $wfh; POSIX​::sigaction(SIGTERM\, POSIX​::SigAction->new(sub { POSIX​::write $wfd\, "1"\, 1;

POSIX​::write() calls sv_newmortal() for its return value\, so there's a small change of a call to malloc() there.

}));

my $rbits = ''; vec($rbits\, fileno $rfh\, 1) = 1; ... select($rbits\, ...);

This is essentially same as using signaled provided by Linux\, but is more portable.

However (as I believe) in Perl 5.17\, the implementation of the signal handler in the perl core has been altered to _always_ call malloc (via newSVsv).

In other words\, starting from 5.17\, it has become _impossible_ to write an unsafe signal handler without fear of deadlock. And that is what I see as a regression.

I think you're referring to 100c03aa. Possibly an alternative $@​ is the solution here too.

Agreed.

Possibly $@​ should be set to a sv_newmortal() instead of a copy of its current value.

Tony

p5pRT commented 5 years ago

From @tonycoz

On Sun\, 08 Nov 2015 19​:43​:06 -0800\, tonyc wrote​:

On Thu Oct 29 19​:27​:04 2015\, kazuhooku@​gmail.com wrote​:

For example\, you can call POSIX​::write in the unsafe signal handler to wake up `select` reliably. The code will look like​:

pipe(my $rfh\, $wfh); my $wfd = fileno $wfh; POSIX​::sigaction(SIGTERM\, POSIX​::SigAction->new(sub { POSIX​::write $wfd\, "1"\, 1;

POSIX​::write() calls sv_newmortal() for its return value\, so there's a small change of a call to malloc() there.

Both for allocating the SV itself and for expanding the tmps stack.

}));

my $rbits = ''; vec($rbits\, fileno $rfh\, 1) = 1; ... select($rbits\, ...);

This is essentially same as using signaled provided by Linux\, but is more portable.

However (as I believe) in Perl 5.17\, the implementation of the signal handler in the perl core has been altered to _always_ call malloc (via newSVsv).

In other words\, starting from 5.17\, it has become _impossible_ to write an unsafe signal handler without fear of deadlock. And that is what I see as a regression.

I think you're referring to 100c03aa. Possibly an alternative $@​ is the solution here too.

Agreed.

Possibly $@​ should be set to a sv_newmortal() instead of a copy of its current value.

That doesn't help\, the CLEAR_ERRSV() in Perl_call_sv() will still allocate a new PV (via SvPVCLEAR() which is a wrapper around Perl_sv_setpv_bufsize()).

Several other macros can end up allocating memory too\, depending on whether the stack involved has enough space or not.

We can't use G_KEEPERR to skip the CLEAR_ERRSV() since the signal handler uses the value of ERRSV to check if the sub died.

The only way I can see to make it work would be to pre-create a SV that we keep in an interpreter global and substitute that into GvSV(PL_errgv) when we call_sv() the signal handler.

This doesn't prevent all of the other potential allocations.

Tony

Tony

p5pRT commented 5 years ago

From @demerphq

On Wed\, 10 Jul 2019 at 03​:48\, Tony Cook via RT \perlbug\-followup@&#8203;perl\.org wrote​:

On Sun\, 08 Nov 2015 19​:43​:06 -0800\, tonyc wrote​:

On Thu Oct 29 19​:27​:04 2015\, kazuhooku@​gmail.com wrote​:

For example\, you can call POSIX​::write in the unsafe signal handler to wake up `select` reliably. The code will look like​:

pipe(my $rfh\, $wfh); my $wfd = fileno $wfh; POSIX​::sigaction(SIGTERM\, POSIX​::SigAction->new(sub { POSIX​::write $wfd\, "1"\, 1;

POSIX​::write() calls sv_newmortal() for its return value\, so there's a small change of a call to malloc() there.

Both for allocating the SV itself and for expanding the tmps stack.

}));

my $rbits = ''; vec($rbits\, fileno $rfh\, 1) = 1; ... select($rbits\, ...);

This is essentially same as using signaled provided by Linux\, but is more portable.

However (as I believe) in Perl 5.17\, the implementation of the signal handler in the perl core has been altered to _always_ call malloc (via newSVsv).

In other words\, starting from 5.17\, it has become _impossible_ to write an unsafe signal handler without fear of deadlock. And that is what I see as a regression.

I think you're referring to 100c03aa. Possibly an alternative $@​ is the solution here too.

Agreed.

Possibly $@​ should be set to a sv_newmortal() instead of a copy of its current value.

That doesn't help\, the CLEAR_ERRSV() in Perl_call_sv() will still allocate a new PV (via SvPVCLEAR() which is a wrapper around Perl_sv_setpv_bufsize()).

Several other macros can end up allocating memory too\, depending on whether the stack involved has enough space or not.

We can't use G_KEEPERR to skip the CLEAR_ERRSV() since the signal handler uses the value of ERRSV to check if the sub died.

The only way I can see to make it work would be to pre-create a SV that we keep in an interpreter global and substitute that into GvSV(PL_errgv) when we call_sv() the signal handler.

Surely this is a reasonable thing to do? Do we need one per thread for threaded builds?

Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 5 years ago

From @tonycoz

On Wed\, 10 Jul 2019 03​:38​:53 -0700\, demerphq wrote​:

On Wed\, 10 Jul 2019 at 03​:48\, Tony Cook via RT \perlbug\-followup@&#8203;perl\.org wrote​:

On Sun\, 08 Nov 2015 19​:43​:06 -0800\, tonyc wrote​:

On Thu Oct 29 19​:27​:04 2015\, kazuhooku@​gmail.com wrote​:

For example\, you can call POSIX​::write in the unsafe signal handler to wake up `select` reliably. The code will look like​:

pipe(my $rfh\, $wfh); my $wfd = fileno $wfh; POSIX​::sigaction(SIGTERM\, POSIX​::SigAction->new(sub { POSIX​::write $wfd\, "1"\, 1;

POSIX​::write() calls sv_newmortal() for its return value\, so there's a small change of a call to malloc() there.

Both for allocating the SV itself and for expanding the tmps stack.

}));

my $rbits = ''; vec($rbits\, fileno $rfh\, 1) = 1; ... select($rbits\, ...);

This is essentially same as using signaled provided by Linux\, but is more portable.

However (as I believe) in Perl 5.17\, the implementation of the signal handler in the perl core has been altered to _always_ call malloc (via newSVsv).

In other words\, starting from 5.17\, it has become _impossible_ to write an unsafe signal handler without fear of deadlock. And that is what I see as a regression.

I think you're referring to 100c03aa. Possibly an alternative $@​ is the solution here too.

Agreed.

Possibly $@​ should be set to a sv_newmortal() instead of a copy of its current value.

That doesn't help\, the CLEAR_ERRSV() in Perl_call_sv() will still allocate a new PV (via SvPVCLEAR() which is a wrapper around Perl_sv_setpv_bufsize()).

Several other macros can end up allocating memory too\, depending on whether the stack involved has enough space or not.

We can't use G_KEEPERR to skip the CLEAR_ERRSV() since the signal handler uses the value of ERRSV to check if the sub died.

The only way I can see to make it work would be to pre-create a SV that we keep in an interpreter global and substitute that into GvSV(PL_errgv) when we call_sv() the signal handler.

Surely this is a reasonable thing to do? Do we need one per thread for threaded builds?

Well\, it would fix one point on non-safety\, for it to actually be safe we'd also need to switch in at least​:

- a pool of SVs to allocate from - preallocated temp\, save and argument stacks

since all of these are used either by the signal -> perl sub delivery code\, or by the C\<\< POSIX​::write $wfd\, "1"\, 1; >> in the suggested signal handler​:

- POSIX​::write() allocates a SV and makes it mortal for its return value (temp stack) - the call to the signal handler sub and to POSIX​::write both use the argument stack - the signal handling code uses the save stack - the signal handling code pushes a new argument stack (possibly using an existing one\, but creating one if needed). This code may race if the non-signal handler code is also pushing a stack

Tony