Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/

Segmentation fault in Perl_csighandler3 when using XS with threads. #22487

Open kni opened 3 months ago

kni commented 3 months ago

Hello.

Description: I use perl with libcurl. libcurl uses a thread for its resolver.

When the libcurl resolver thread receives a signal, Perl_csighandler3 crashes because it cannot find the perl interpreter.

Program terminated with signal SIGSEGV, Segmentation fault.
Address not mapped to object.
#0  0x00000008218c77f6 in Perl_csighandler3 (sig=1, sip=0x0, uap=0x0) at mg.c:1567
1567               (PL_signals & PERL_SIGNALS_UNSAFE_FLAG))
[Current thread is 1 (LWP 121702)]
(gdb) info threads
  Id   Target Id         Frame 
* 1    LWP 121702        0x00000008218c77f6 in Perl_csighandler3 (sig=1, sip=0x0, uap=0x0)
    at mg.c:1567
  2    LWP 100464        0x00000008238c6368 in _write () from /lib/libc.so.7
(gdb) bt
#0  0x00000008218c77f6 in Perl_csighandler3 (sig=1, sip=0x0, uap=0x0) at mg.c:1567
#1  0x00000008218c779a in Perl_csighandler3 (sig=1, sip=0x0, uap=0x8218c779a <Perl_csighandler3+122>)
    at mg.c:1530
#2  0x00000008243b2b60 in ?? () from /lib/libthr.so.3
#3  0x00000008243b211f in ?? () from /lib/libthr.so.3
#4  <signal handler called>
#5  0x00000008238c60ca in _poll () from /lib/libc.so.7
#6  0x000000082389ec98 in __res_nsend () from /lib/libc.so.7
#7  0x000000082386a532 in ?? () from /lib/libc.so.7
#8  0x000000082386a780 in ?? () from /lib/libc.so.7
#9  0x00000008238691b0 in ?? () from /lib/libc.so.7
#10 0x0000000823879acd in nsdispatch () from /lib/libc.so.7
#11 0x00000008238675fd in ?? () from /lib/libc.so.7
#12 0x000000082386719b in getaddrinfo () from /lib/libc.so.7
#13 0x0000000832d29752 in ?? () from /usr/local/lib/libcurl.so.4
#14 0x0000000832d185f6 in ?? () from /usr/local/lib/libcurl.so.4
#15 0x0000000832d2ce0d in ?? () from /usr/local/lib/libcurl.so.4
#16 0x00000008243a9a7a in ?? () from /lib/libthr.so.3
#17 0x0000000000000000 in ?? ()
Backtrace stopped: Cannot access memory at address 0x849053000
(gdb) info local
my_perl = 0x0

To make sure of this, I added the following to Perl_csighandler3 after dTHX:

if (!my_perl) {
  my_perl = PL_curinterp;
  PERL_SET_THX(my_perl);
}

And the error went away.
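The `my_perl = 0x0` in the gdb output is what one would expect if the interpreter pointer lives in per-thread storage that only perl-created threads fill in. The following is a minimal C sketch of that effect, not perl's actual code: `ctx_key`, `fake_interpreter` and `foreign_thread_sees_null` are hypothetical names, and it assumes a pthread-style thread-specific slot.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins: a per-thread slot and the object stored in it. */
static pthread_key_t ctx_key;
static int fake_interpreter;

/* Thread body: report what this thread finds in the per-thread slot. */
static void *read_ctx(void *out) {
    *(void **)out = pthread_getspecific(ctx_key);
    return NULL;
}

/* Returns 1 if a freshly created thread sees NULL in a slot that the
 * creating thread has already filled in, 0 if not, -1 on error. */
int foreign_thread_sees_null(void) {
    pthread_t th;
    void *seen = &fake_interpreter;   /* overwritten by the new thread */

    if (pthread_key_create(&ctx_key, NULL) != 0) return -1;
    if (pthread_setspecific(ctx_key, &fake_interpreter) != 0) return -1;

    /* The creating thread can find its own context. */
    assert(pthread_getspecific(ctx_key) == &fake_interpreter);

    if (pthread_create(&th, NULL, read_ctx, &seen) != 0) return -1;
    if (pthread_join(th, NULL) != 0) return -1;

    /* A new thread starts with NULL for every thread-specific key. */
    return seen == NULL;
}
```

Calling `foreign_thread_sees_null()` from a test program should return 1: the value set by the creating thread is not visible in the new thread, so a signal handler that runs there and dereferences that slot reads through a NULL pointer.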

Perl configuration: Tested on Linux and FreeBSD: 5.32 (Linux), 5.34 (FreeBSD), 3.38 (FreeBSD)

Characteristics of this binary (from libperl): 
  Compile-time options:
    DEBUGGING
    HAS_TIMES
    MULTIPLICITY
    PERLIO_LAYERS
    PERL_COPY_ON_WRITE
    PERL_DONT_CREATE_GVSV
    PERL_MALLOC_WRAP
    PERL_OP_PARENT
    PERL_PRESERVE_IVUV
    PERL_TRACK_MEMPOOL
    USE_64_BIT_ALL
    USE_64_BIT_INT
    USE_ITHREADS
    USE_LARGE_FILES
    USE_LOCALE
    USE_LOCALE_COLLATE
    USE_LOCALE_CTYPE
    USE_LOCALE_NUMERIC
    USE_LOCALE_TIME
    USE_PERLIO
    USE_PERL_ATOF
    USE_REENTRANT_API
    USE_THREAD_SAFE_LOCALE
jkeenan commented 3 months ago

Description: I use perl with libcurl. libcurl uses a thread for its resolver.

When the libcurl resolver thread receives a signal, Perl_csighandler3 crashes because it cannot find the perl interpreter.

You will need to provide us with a short Perl program that interacts with libcurl and reproduces this segfault.

You will also need to provide us with the complete output of perl -V for the perl executable you are using to run that program. (You indicate that you have reproduced this problem on more than one operating system. The output of perl -V on any one of them will probably suffice.)

[snip]

To make sure of this, I added the following to Perl_csighandler3 after dTHX:

if (!my_perl) {
  my_perl = PL_curinterp;
  PERL_SET_THX(my_perl);
}

Am I correct in thinking that the above means you wrote a patch like this (working at HEAD of Perl's main development branch, blead), then configured and built a perl executable which you used to successfully run the test program?

diff --git a/mg.c b/mg.c
index d972781ff1..dcf1d98efc 100644
--- a/mg.c
+++ b/mg.c
@@ -1562,6 +1562,10 @@ Perl_csighandler3(int sig, Siginfo_t *sip PERL_UNUSED_DECL, void *uap PERL_UNUSE
     dTHXa(PERL_GET_SIG_CONTEXT);
 #else
     dTHX;
+    if (!my_perl) {
+        my_perl = PL_curinterp;
+        PERL_SET_THX(my_perl);
+    }
 #endif

 #ifdef PERL_USE_3ARG_SIGHANDLER

And the error went away.

Am I correct in thinking that when you patched your perl executable and used it to run your test program, the segfault no longer appeared?

Perl configuration Tested on linux and freebsd: 5.32 (linux), 5.34 (freebsd), 3.38 (freebsd)

I'm not sure what you mean by 3.38 (freebsd). Did you intend to say 5.38 (freebsd) (i.e., perl-5.38 on FreeBSD)?

kni commented 3 months ago

Yes, this patch helped. And yes, I meant perl-5.38 on FreeBSD.

kni commented 3 months ago
% perl -V
Summary of my perl5 (revision 5 version 36 subversion 3) configuration:

  Platform:
    osname=freebsd
    osvers=13.2-release-p8
    archname=amd64-freebsd-thread-multi
    uname='freebsd test-freebsd 13.2-release-p8 freebsd 13.2-release-p8 generic amd64 '
    config_args='-Accflags=-DUSE_THREAD_SAFE_LOCALE -Darchlib=/usr/local/lib/perl5/5.36/mach -Dcc=cc -Dcf_by=mat -Dcf_email=mat@FreeBSD.org -Dcf_time=Wed Nov 29 18:10:26 EET 2023 -Dinc_version_list=none -Dlibperl=libperl.so.5.36.3 -Dman1dir=/usr/local/lib/perl5/5.36/perl/man/man1 -Dman3dir=/usr/local/lib/perl5/5.36/perl/man/man3 -Dprefix=/usr/local -Dprivlib=/usr/local/lib/perl5/5.36 -Dscriptdir=/usr/local/bin -Dsitearch=/usr/local/lib/perl5/site_perl/mach/5.36 -Dsitelib=/usr/local/lib/perl5/site_perl -Dsiteman1dir=/usr/local/lib/perl5/site_perl/man/man1 -Dsiteman3dir=/usr/local/lib/perl5/site_perl/man/man3 -Dusenm=n -Duseshrplib -sde -Ui_iconv -Ui_malloc -Uinstallusrbinperl -Alddlflags=-L/usr/ports/lang/perl5.36/work/perl-5.36.3 -L/usr/local/lib/perl5/5.36/mach/CORE -lperl -Dshrpldflags=$(LDDLFLAGS:N-L/usr/ports/lang/perl5.36/work/perl-5.36.3:N-L/usr/local/lib/perl5/5.36/mach/CORE:N-lperl) -Wl,-soname,$(LIBPERL:R) -DDEBUGGING -Doptimize=-g -Dusedtrace -Ui_gdbm -Dusemultiplicity=y -Duse64bitint -Dusemymalloc=n -Dusethreads=y'
    hint=recommended
    useposix=true
    d_sigaction=define
    useithreads=define
    usemultiplicity=define
    use64bitint=define
    use64bitall=define
    uselongdouble=undef
    usemymalloc=n
    default_inc_excludes_dot=define
  Compiler:
    cc='cc'
    ccflags ='-DHAS_FPSETMASK -DHAS_FLOATINGPOINT_H -DUSE_THREAD_SAFE_LOCALE -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include'
    optimize='-g'
    cppflags='-DHAS_FPSETMASK -DHAS_FLOATINGPOINT_H -DUSE_THREAD_SAFE_LOCALE -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include'
    ccversion=''
    gccversion='FreeBSD Clang 14.0.5 (https://github.com/llvm/llvm-project.git llvmorg-14.0.5-0-gc12386ae247c)'
    gccosandvers=''
    intsize=4
    longsize=8
    ptrsize=8
    doublesize=8
    byteorder=12345678
    doublekind=3
    d_longlong=define
    longlongsize=8
    d_longdbl=define
    longdblsize=16
    longdblkind=3
    ivtype='long'
    ivsize=8
    nvtype='double'
    nvsize=8
    Off_t='off_t'
    lseeksize=8
    alignbytes=8
    prototype=define
  Linker and Libraries:
    ld='cc'
    ldflags ='-pthread -Wl,-E  -fstack-protector-strong -L/usr/local/lib'
    libpth=/usr/lib /usr/local/lib /usr/lib/clang/14.0.5/lib
    libs=-lgdbm -ldl -lm -lcrypt -lutil
    perllibs=-ldl -lm -lcrypt -lutil
    libc=
    so=so
    useshrplib=true
    libperl=libperl.so.5.36.3
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs
    dlext=so
    d_dlsymun=undef
    ccdlflags='  -Wl,-R/usr/local/lib/perl5/5.36/mach/CORE'
    cccdlflags='-DPIC -fPIC'
    lddlflags='-shared  -L/usr/local/lib/perl5/5.36/mach/CORE -lperl -L/usr/local/lib -fstack-protector-strong'

Characteristics of this binary (from libperl): 
  Compile-time options:
    DEBUGGING
    HAS_TIMES
    MULTIPLICITY
    PERLIO_LAYERS
    PERL_COPY_ON_WRITE
    PERL_DONT_CREATE_GVSV
    PERL_MALLOC_WRAP
    PERL_OP_PARENT
    PERL_PRESERVE_IVUV
    PERL_TRACK_MEMPOOL
    USE_64_BIT_ALL
    USE_64_BIT_INT
    USE_ITHREADS
    USE_LARGE_FILES
    USE_LOCALE
    USE_LOCALE_COLLATE
    USE_LOCALE_CTYPE
    USE_LOCALE_NUMERIC
    USE_LOCALE_TIME
    USE_PERLIO
    USE_PERL_ATOF
    USE_REENTRANT_API
    USE_THREAD_SAFE_LOCALE
  Built under freebsd
  %ENV:
    PERL5LIB="/home/nick/lib:/home/nick/perl5/lib/perl5"
  @INC:
    .
    opt
    /home/nick/lib
    /home/nick/perl5/lib/perl5/amd64-freebsd-thread-multi
    /home/nick/perl5/lib/perl5
    /usr/local/lib/perl5/site_perl/mach/5.36
    /usr/local/lib/perl5/site_perl
    /usr/local/lib/perl5/5.36/mach
    /usr/local/lib/perl5/5.36
jkeenan commented 3 months ago

% perl -V
Summary of my perl5 (revision 5 version 36 subversion 3) configuration:

  Platform:
    osname=freebsd
    osvers=13.2-release-p8
    archname=amd64-freebsd-thread-multi
    uname='freebsd test-freebsd 13.2-release-p8 freebsd 13.2-release-p8 generic amd64 '
    config_args='-Accflags=-DUSE_THREAD_SAFE_LOCALE -Darchlib=/usr/local/lib/perl5/5.36/mach -Dcc=cc -Dcf_by=mat -Dcf_email=mat@FreeBSD.org -Dcf_time=Wed Nov 29 18:10:26 EET 2023 -Dinc_version_list=none -Dlibperl=libperl.so.5.36.3 -Dman1dir=/usr/local/lib/perl5/5.36/perl/man/man1 -Dman3dir=/usr/local/lib/perl5/5.36/perl/man/man3 -Dprefix=/usr/local -Dprivlib=/usr/local/lib/perl5/5.36 -Dscriptdir=/usr/local/bin -Dsitearch=/usr/local/lib/perl5/site_perl/mach/5.36 -Dsitelib=/usr/local/lib/perl5/site_perl -Dsiteman1dir=/usr/local/lib/perl5/site_perl/man/man1 -Dsiteman3dir=/usr/local/lib/perl5/site_perl/man/man3 -Dusenm=n -Duseshrplib -sde -Ui_iconv -Ui_malloc -Uinstallusrbinperl -Alddlflags=-L/usr/ports/lang/perl5.36/work/perl-5.36.3 -L/usr/local/lib/perl5/5.36/mach/CORE -lperl -Dshrpldflags=$(LDDLFLAGS:N-L/usr/ports/lang/perl5.36/work/perl-5.36.3:N-L/usr/local/lib/perl5/5.36/mach/CORE:N-lperl) -Wl,-soname,$(LIBPERL:R) -DDEBUGGING -Doptimize=-g -Dusedtrace -Ui_gdbm -Dusemultiplicity=y -Duse64bitint -Dusemymalloc=n -Dusethreads=y'

So, am I correct in thinking that you were simply using the "vendor perl" (/usr/local/bin/perl) on FreeBSD in running your test program?

If so, then we really need to see a short perl program that interacts with libcurl and displays the segmentation fault you have reported.

Leont commented 3 months ago

The problem here is that signal handlers are process-wide, but the handler perl installs assumes that it will run in a Perl thread. Evidently it doesn't here.

The easiest way around this is to mask the signal (in this case SIGHUP) in the CURL thread.
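One way to get that mask into a thread you do not create yourself is to rely on the POSIX rule that a new thread inherits the signal mask of the thread that creates it. The C sketch below blocks SIGHUP in the caller, creates a thread, and then restores the caller's mask. `check_mask` and `spawn_with_sighup_blocked` are hypothetical names, and whether this is practical with libcurl depends on when its resolver thread is actually created, which is not confirmed here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <stddef.h>

/* Thread body: report whether SIGHUP is blocked in this thread. */
static void *check_mask(void *out) {
    sigset_t cur;
    pthread_sigmask(SIG_SETMASK, NULL, &cur);   /* NULL set: query only */
    *(int *)out = sigismember(&cur, SIGHUP);
    return NULL;
}

/* Creates a thread while SIGHUP is blocked in the caller, then restores
 * the caller's original mask. Returns 1 if the new thread had SIGHUP
 * blocked, 0 if it did not, -1 on error. */
int spawn_with_sighup_blocked(void) {
    sigset_t block, saved;
    pthread_t th;
    int blocked_in_thread = -1;
    int rc;

    sigemptyset(&block);
    sigaddset(&block, SIGHUP);
    if (pthread_sigmask(SIG_BLOCK, &block, &saved) != 0) return -1;

    /* The new thread inherits the creating thread's signal mask. */
    rc = pthread_create(&th, NULL, check_mask, &blocked_in_thread);

    /* Restore the caller's mask so it can still receive SIGHUP. */
    pthread_sigmask(SIG_SETMASK, &saved, NULL);

    if (rc != 0) return -1;
    if (pthread_join(th, NULL) != 0) return -1;
    assert(blocked_in_thread == 0 || blocked_in_thread == 1);
    return blocked_in_thread;
}
```

In a real program the `pthread_create` call would be replaced by whatever library call ends up spawning the thread, with the block and restore steps wrapped around it.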

kni commented 3 months ago

The problem here is that signal handlers are process-wide, but the handler perl installs assumes that it will run in a Perl thread. Evidently it doesn't here.

Yes.

The easiest way around this is to mask the signal (in this case SIGHUP) in the CURL thread.

But I do not have access to the CURL resolver thread. And this problem is not specific to CURL; it affects any library that uses threads.

jkeenan commented 3 months ago

The problem here is that signal handlers are process-wide, but the handler perl installs assumes that it will run in a Perl thread. Evidently it doesn't here.

Yes.

The easiest way around this is to mask the signal (in this case SIGHUP) in the CURL thread.

But I do not have access to the CURL resolver thread. And this problem is not specific to CURL; it affects any library that uses threads.

Well, can you give us any program which illustrates the problem? We can't begin to fix a segfault if we don't know how to reproduce it.

tonycoz commented 3 months ago

Well, can you give us any program which illustrates the problem? We can't begin to fix a segfault if we don't know how to reproduce it.

It's an interaction between perl's signal handler and a thread created by a library which doesn't have the thread local storage set up that the perl signal handler expects.

It won't be reproducible with just core perl.

I suspect the fix for this is "don't use signals", or at least use sigwait() or something similar to deal with them.
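As a rough illustration of the sigwait() approach in plain C (not something perl does today): block the signal before any threads exist so every thread inherits the mask, then let one dedicated thread collect it synchronously. No asynchronous handler runs, so it does not matter which thread the kernel picks. `signal_waiter` and `wait_for_sigusr1` are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

static sigset_t waited;   /* the set of signals the waiter thread collects */

/* Dedicated thread: waits synchronously instead of using a handler. */
static void *signal_waiter(void *out) {
    int sig = 0;
    if (sigwait(&waited, &sig) == 0)
        *(int *)out = sig;
    return NULL;
}

/* Returns the signal number collected by the waiter thread,
 * 0 if sigwait failed, -1 on setup error. */
int wait_for_sigusr1(void) {
    pthread_t th;
    sigset_t saved;
    int received = 0;

    sigemptyset(&waited);
    sigaddset(&waited, SIGUSR1);

    /* Block first, so threads created later inherit the blocked mask. */
    if (pthread_sigmask(SIG_BLOCK, &waited, &saved) != 0) return -1;
    if (pthread_create(&th, NULL, signal_waiter, &received) != 0) {
        pthread_sigmask(SIG_SETMASK, &saved, NULL);
        return -1;
    }

    /* Process-directed signal: it stays pending until sigwait takes it. */
    kill(getpid(), SIGUSR1);

    pthread_join(th, NULL);
    pthread_sigmask(SIG_SETMASK, &saved, NULL);
    assert(received == 0 || received == SIGUSR1);
    return received;
}
```

The cost of this design is that every thread in the process has to keep the signal blocked, which again depends on the mask being in place before a library creates its threads.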

The OP is seeing this from libcurl, but it could happen for any external library that uses threads, or for an XS module using its own threads, or using OpenMP for threading.

But if you use some C (requires Inline::C):

use v5.36;
use Inline C => <<'EOC';
#include <pthread.h>
#include <time.h>
#include <signal.h>

void *
do_thread(void *p) {
  raise(SIGUSR1);
  return NULL;
}

IV
start_thread() {
  pthread_t th;
  if (pthread_create(&th, NULL, do_thread, NULL) == 0) {
    /* not portable */
    return (IV)th;
  }
  else {
    return -1;
  }
}

void
join_thread(IV th) {
  void *ret = NULL;  /* pthread_join() takes a void **, so pass &ret */
  pthread_join((pthread_t)th, &ret);
}

EOC

$SIG{USR1} = sub { say "An interrupt"; };
my $th = start_thread();

join_thread($th);

Run it under the debugger (this is my system perl):

$ gdb --args `which perl` ../22487.pl
GNU gdb (Debian 13.1-3) 13.1
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/perl...
(No debugging symbols found in /usr/bin/perl)
(gdb) handle SIGUSR1 noprint nostop pass
Signal        Stop      Print   Pass to program Description
SIGUSR1       No        No      Yes             User defined signal 1
(gdb) r
Starting program: /usr/bin/perl ../22487.pl
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fffea3ff6c0 (LWP 3058992)]

Thread 2 "perl" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffea3ff6c0 (LWP 3058992)]
0x000055555565a4c0 in Perl_csighandler3 ()
(gdb)
kni commented 3 months ago

tonycoz, thank you very much! You recreated the problem so beautifully.

tonycoz commented 2 months ago

You recreated the problem so beautifully.

It took me a few tries to get the signal delivered to the new thread; my knowledge of POSIX signals is really basic.

I first tried to make it loop and catch SIGINT, but that would always get delivered to the parent thread (the signal didn't appear to be masked).

I then tried kill(getpid(), SIGINT) but that also got delivered only to the parent thread.

Then I noticed raise() and it worked.
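The difference comes from how POSIX directs the two calls: kill() sends a process-directed signal, which may be delivered to any thread that does not block it, while raise() in a multi-threaded program behaves like pthread_kill(pthread_self(), sig) and so is delivered to the calling thread. The plain C sketch below checks the raise() half; `on_usr1`, `raiser` and `raise_runs_handler_in_caller` are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

static pthread_t handler_thread;          /* thread the handler ran in */
static volatile sig_atomic_t handled;     /* set once the handler has run */

/* Handler: record which thread it is running in. */
static void on_usr1(int sig) {
    (void)sig;
    handler_thread = pthread_self();
    handled = 1;
}

/* Thread body: raise the signal, then check where the handler ran. */
static void *raiser(void *out) {
    assert(out != NULL);
    raise(SIGUSR1);   /* the handler runs before raise() returns */
    *(int *)out = handled && pthread_equal(handler_thread, pthread_self());
    return NULL;
}

/* Returns 1 if the handler ran in the thread that called raise(),
 * 0 if it did not, -1 on setup error. */
int raise_runs_handler_in_caller(void) {
    struct sigaction sa, old;
    pthread_t th;
    int same_thread = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, &old) != 0) return -1;

    handled = 0;
    if (pthread_create(&th, NULL, raiser, &same_thread) != 0) {
        sigaction(SIGUSR1, &old, NULL);
        return -1;
    }
    pthread_join(th, NULL);
    sigaction(SIGUSR1, &old, NULL);   /* put the previous handler back */
    return same_thread;
}
```

This matches the reproduction above: because raise() is called inside the non-perl thread, the process-wide handler is guaranteed to run in that thread.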