Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/

perl segfaults unpredictable with valid code (Cookbook:p570,571 cmd3sel) concerns presumably a r #4152

Closed p5pRT closed 20 years ago

p5pRT commented 22 years ago

Migrated from rt.perl.org#7205 (status was 'resolved')

Searchable as RT7205$

p5pRT commented 22 years ago

From herbert.wengatz@mchr2.siemens.de

We have a rather nasty problem here, which we could reproduce with different perl versions (5.6.1 and 5.00503) and on different Unix and Unix-like operating systems (HP-UX 11, Solaris 2.6 (worst), SunOS 4.1.3 and Linux with kernel 2.2.14).

We are developing administration tools for huge Unix networks, so we have a strong interest in our programs running stably and predictably (the faintest buglet may end up in expensive havoc).

We want to create a routine that can run external commands safely and communicate with them. As a basis we took the example from the Perl Cookbook (ORA), chapter 16.9, on pages 570 and 571 (cmd3sel).

We extended the example a little, and at first we noticed strange behaviour: sometimes it wrote back the information about the dead child (which it should) and sometimes it didn't. Since we have to rely on what we receive there, we investigated further and ended up with an example script which mostly works the way it should, but sometimes everything breaks with segmentation violations and even some other unexpected error messages from inside perl (see below or try it yourself).

We also found out that the errors occur more often when the system load is higher.

Here is our code (we tried to reduce it as much as we could, and you may quite well recognize the code from cmd3sel):

----------->8--- cut here ----8<--------
#!/usr/local/bin/perl -T -w

use IO::Select;
use IPC::Open3;

delete @ENV{qw{IFS CDPATH ENV BASH_ENV PATH}};

# repeat 500 times to really show the effect

for($i = 0 ; $i < 500 ; $i++)
{ @io_channel = ();

  # since we called this script 'bug', the line below
  # will produce output on both, STDOUT and STDERR ('xxx' doesn't
  # exist).

  &system_redirect(\@io_channel,"/bin/ls -l bug xxx");

  print "STDOUT was: ",$io_channel[1],"\n";
  print "STDERR was: ",$io_channel[2],"\n";
}

###############################################################################
# system_redirect()
###############################################################################
sub system_redirect()
{ my($ra_io_channel,@cmd) = @_;

  local $exitstatus = '?';

  my $pid = open3(*CMD_IN,*CMD_OUT,*CMD_ERR,@cmd);

  $SIG{CHLD} = sub
  { if(waitpid($pid,0) > 0)
    { printf("exitstatus of child: %d\n.",$?);
    }
    $exitstatus = $?;
  };

  if(defined $ra_io_channel->[0])
  { print CMD_IN $ra_io_channel->[0];
  }
  close(CMD_IN);

  my $selector = IO::Select->new();
  $selector->add(*CMD_ERR,*CMD_OUT);

  while(@ready = $selector->can_read)
  { foreach $filehandle (@ready)
    { if(fileno($filehandle) == fileno(CMD_ERR))
      { $ra_io_channel->[2] .= <CMD_ERR>;
      }
      else
      { $ra_io_channel->[1] .= <CMD_OUT>;
      }

      if(eof($filehandle))
      { $selector->remove($filehandle);
      }
    }
  }

  close(CMD_OUT);
  close(CMD_ERR);

  return($exitstatus);
}

__END__
#
# The code above, when run in a loop on the commandline (bourne-shell or bash)
# like this (remember, the script was called 'bug' here):

i=0 ; while [ $i -lt 50 ] ; do ./bug | grep STDERR | wc ; i=`expr $i + 1` ;done

#
# produces, for example, the following output:
#
  500 4500 26000
  99 891 5148
Segmentation fault
  500 4500 26000
  500 4500 26000
  500 4500 26000
  500 4500 26000
  500 4500 26000
  500 4500 26000
  211 1899 10972
Segmentation fault
  434 3906 22568
  26 234 1352
Segmentation fault
  48 432 2496
Segmentation fault
Use of uninitialized value in scalar assignment at ./bug line 30.
Use of uninitialized value in scalar assignment at ./bug line 30.
Unable to create sub named "" at ./bug line 30.
  274 2466 14248
  500 4500 26000
Attempt to free unreferenced scalar at ./bug line 30.
  289 2601 15028
Segmentation fault
  500 4500 26000
----------->8--- cut here ----8<--------

The example output was generated with perl 5.6.1 under Linux 2.2.14, but this was the *best* combination we have found so far. The machine is a Pentium III 650 MHz with 128 MB RAM and is otherwise running without any errors. Besides, we got almost the same behaviour on all systems we tested (see above). So this can be neither a CPU-, machine- nor OS-dependent bug, but must be something that lurks somewhere deep in perl.

We can only guess that the open3 child dies before the anonymous sub can catch the signal. And, to mention it again, the error rate increases dramatically when the system load increases. (This points towards a race condition somewhere between "open" and "waitpid".)
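One way to sidestep the suspected race is to install no SIGCHLD handler at all: read both pipes until EOF, then reap the child with a blocking waitpid. The following is a minimal sketch of that idea. It is not the Cookbook's code and not a confirmed fix for this ticket, and `run_capture` with its return convention is a hypothetical helper introduced only for illustration. It reads with sysread instead of <FH>, because perlfunc warns against mixing buffered reads with select, and it assumes the input passed to the child is small enough to fit in the pipe buffer.

```perl
use strict;
use warnings;
use IO::Select;
use IPC::Open3;
use Symbol qw(gensym);

# Hypothetical helper (not part of the original report): run @cmd, feed it
# optional input, and return (stdout, stderr, wait status).
sub run_capture {
    my ($input, @cmd) = @_;

    # open3 autovivifies the stdin/stdout handles but not the stderr one,
    # so the stderr handle is created explicitly.
    my ($in, $out);
    my $err = gensym;
    my $pid = open3($in, $out, $err, @cmd);

    # Remember the descriptors now; they serve as buffer keys below.
    my ($out_fd, $err_fd) = (fileno($out), fileno($err));
    my %buf = ($out_fd => '', $err_fd => '');

    {
        # Do not die from SIGPIPE if the child exits without reading.
        local $SIG{PIPE} = 'IGNORE';
        print {$in} $input if defined $input;
        close $in;
    }

    my $sel = IO::Select->new($out, $err);
    while ($sel->count) {
        for my $fh ($sel->can_read) {
            # Unbuffered read: returns 0 at EOF and undef on error.
            my $n = sysread($fh, my $chunk, 4096);
            if ($n) {
                $buf{ fileno($fh) } .= $chunk;
            }
            else {
                $sel->remove($fh);
            }
        }
    }

    close $out;
    close $err;

    # Reap synchronously; no signal handler runs at any point.
    waitpid($pid, 0);
    my $status = $?;

    return ($buf{$out_fd}, $buf{$err_fd}, $status);
}
```

Called as `my ($stdout, $stderr, $status) = run_capture(undef, '/bin/ls', '-l', 'bug', 'xxx');`, the child's exit code is `$status >> 8`. Passing the command as a list rather than a single string means no shell is involved.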

The script above may run quite fine on the above-mentioned PIII system, but as soon as I move the mouse or open another xterm, the rate of segfaults rises dramatically. Just try it yourself.

We would be very sad if we couldn't use IPC::Open3 because of this, but it is currently absolutely unreliable and thus unacceptable for sysadmin tasks. We will also have severe trouble finding something more reliable, since all we could do is re-implement IPC::Open3.

I guess Tom and Nathan will have a high interest in fixing this, because the basis for it is published in their Cookbook (which is otherwise excellent!) and everybody and his uncle may run into the same problem we did.

Do you know of this already? Is this something that can be found in an FAQ (I guess not, otherwise you wouldn't have published the basic code for this in the Cookbook...)?

Please let us know when you plan to fix it (if at all), and I hope you will inform us once it is fixed. BTW, we are willing to serve as beta testers for this.

Best regards,

  Herbert

PS: Please send special greetings to Tom Christiansen, whom I had the luck to meet in person during a perl training he held a couple of years ago in Munich (about 1996). :) I'm the guy who worked for TSR here in Germany, too. (I guess he can't remember, but that's no blame on him... :) )

--
Herbert Wengatz      Phone MchP: +49 (0)89 / 636 - 47677
I&S IT PS 8          Phone MchH: +49 (0)89 / 722 - 49296
Siemens AG           Mobile    : +49 (0)160 / 8 85 16 85
Otto Hahn Ring 6     Fax MchP  : +49 (0)89 / 636 - 47586
81738 Muenchen       mailto:herbert.wengatz@mchr2.siemens.de
                     http://www.mvn-services.com



Flags:
    category=core
    severity=critical

Site configuration information for perl v5.6.1:

Configured by hwe at Thu Jun 21 10:08:59 MEST 2001.

Summary of my perl5 (revision 5.0 version 6 subversion 1) configuration:
  Platform:
    osname=linux, osvers=2.2.14, archname=i686-linux
    uname='linux elrond 2.2.14 #3 mon jan 29 13:47:05 cet 2001 i686 unknown '
    config_args='-de'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
    useperlio=undef d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
  Compiler:
    cc='cc', ccflags ='-fno-strict-aliasing -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2',
    cppflags='-fno-strict-aliasing -I/usr/local/include'
    ccversion='', gccversion='2.95.2 19991024 (release)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, usemymalloc=n, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lnsl -lndbm -lgdbm -ldbm -ldb -ldl -lm -lc -lposix -lcrypt -lutil
    perllibs=-lnsl -ldl -lm -lc -lposix -lcrypt -lutil
    libc=, so=so, useshrplib=false, libperl=libperl.a
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynamic'
    cccdlflags='-fpic', lddlflags='-shared -L/usr/local/lib'

Locally applied patches:

@INC for perl v5.6.1:
    /usr/local/lib/perl5/5.6.1/i686-linux
    /usr/local/lib/perl5/5.6.1
    /usr/local/lib/perl5/site_perl/5.6.1/i686-linux
    /usr/local/lib/perl5/site_perl/5.6.1
    /usr/local/lib/perl5/site_perl
    .

Environment for perl v5.6.1:
    HOME=/home/hwe
    LANG=de_DE
    LANGUAGE (unset)
    LC_COLLATE=POSIX
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/home/hwe/bin:/usr/local/bin:/usr/bin/mh:/opt/kde/bin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/X11R6/bin
    PERL_BADLANG (unset)
    SHELL=/bin/bash

p5pRT commented 22 years ago

From @schwern

After trying with bleadperl 12768 on both machines, everything worked aok.

Bug closed.

PS I'm trying the mail interface for the first time, so this should be "add a note, make the bug severity fatal and close it"

--

Michael G. Schwern <schwern@pobox.com> http://www.pobox.com/~schwern/
Perl6 Quality Assurance <perl-qa@perl.org> Kwalitee Is Job One
Death? Its like being on holiday with a group of Germans.