Data::Dumper + threads + recursion = segfault #6782

Closed p5pRT closed 18 years ago

p5pRT commented 21 years ago

Migrated from rt.perl.org#23965 (status was 'resolved')

Searchable as RT23965$

p5pRT commented 21 years ago

From DavidBuckley@bigfoot.com

Created by DavidBuckley@bigfoot.com

(Apologies if this is duplicated; the mail daemon on my other machine isn't behaving itself. I don't have a confirmation of the original, and it's not up on the site.)

The following code illustrates the issue nicely:

    use threads;
    use threads::shared;
    use Data::Dumper;
    our $a : shared;
    $a = \$a;
    print Data::Dumper::Dumper( $a );

This produces a short pause, then a segfault. I'm guessing it's due to issues with the way references to shared objects are implemented (at least on Linux):

    use threads;
    use threads::shared;
    our $a : shared = &share( [] );
    print $a;
    print $a;

Gives:

    ARRAY(0x813bbc4)ARRAY(0x813babc)

Which, of course, is going to confuse anything that checks cross-references using the reference itself.
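
To make that last point concrete, here is a minimal sketch of the kind of address-based bookkeeping a dumper or deep-copy routine typically relies on. It runs on ordinary (non-shared) data and uses `refaddr` and `reftype` from Scalar::Util; the `walk` helper is a hypothetical name written only for illustration and is not Data::Dumper's actual code.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr reftype);

# Hypothetical helper: walk a structure, remembering each reference by its
# address so that a cycle is recognised instead of being followed forever.
# Returns the number of times an already-seen reference was encountered.
sub walk {
    my ($ref, $seen) = @_;
    $seen ||= {};
    return 0 unless ref $ref;                   # plain value: nothing to follow
    return 1 if $seen->{ refaddr($ref) }++;     # address already seen: a cycle
    my $type = reftype($ref);
    my @children = $type eq 'HASH'  ? values %$ref
                 : $type eq 'ARRAY' ? @$ref
                 : ($type eq 'SCALAR' || $type eq 'REF') ? ($$ref)
                 :                    ();
    my $cycles = 0;
    $cycles += walk($_, $seen) for @children;
    return $cycles;
}

my $node = [];
push @$node, $node;                             # an array that contains itself
my $cycles = walk($node);
print "cycles found: $cycles\n";                # prints "cycles found: 1"
```

The check only works if the same referent always yields the same address. With the shared reference shown above, two consecutive prints give two different addresses, so a lookup like `$seen->{ refaddr($ref) }` could miss the repeat.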

There appears to be no obvious way of checking equivalence of shared references; presumably they're implemented in such a way that they're not really references to a specific object, but floaty references to the shared version. I'm currently avoiding the segfault by specifying a maximum recursive depth.
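
The report does not say how the depth limit was applied. One documented way to do it is the `$Data::Dumper::Maxdepth` setting, sketched below on ordinary nested data (no threads involved). Whether this setting is the one the reporter used, and whether it is enough to avoid the crash on 5.8.0, are assumptions here.

```perl
use strict;
use warnings;
use Data::Dumper;

# Ordinary (non-shared) nested data, three levels deep.
my $data = { a => { b => { c => 1 } } };

my $out = do {
    # Stop descending after two levels; references below that depth are
    # emitted as their stringified form (e.g. 'HASH(0x...)') rather than
    # being walked.
    local $Data::Dumper::Maxdepth = 2;
    Dumper($data);
};
print $out;
```

Using `local` keeps the limit confined to this one call, so other users of Data::Dumper in the same program are unaffected.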

Perl Info

```
Flags:
    category=core
    severity=high

This perlbug was built using Perl v5.8.0 - Sun Jun 16 02:13:30 BST 2002
It is being executed now by Perl v5.8.0 - Thu Sep 11 20:57:21 EST 2003.

Site configuration information for perl v5.8.0:

Configured by Debian Project at Thu Sep 11 20:57:21 EST 2003.

Summary of my perl5 (revision 5.0 version 8 subversion 0) configuration:
  Platform:
    osname=linux, osvers=2.4.21-xfs+ti1211, archname=i386-linux-thread-multi
    uname='linux kosh 2.4.21-xfs+ti1211 #1 sat jul 12 10:35:04 est 2003 i686 gnulinux '
    config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN -Dcccdlflags=-fPIC -Darchname=i386-linux -Dprefix=/usr -Dprivlib=/usr/share/perl/5.8.0 -Darchlib=/usr/lib/perl/5.8.0 -Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.8.0 -Dsitearch=/usr/local/lib/perl/5.8.0 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dman1ext=1 -Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Uusesfio -Uusenm -Duseshrplib -Dlibperl=libperl.so.5.8.0 -Dd_dosuid -des'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -fno-strict-aliasing -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O3',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -fno-strict-aliasing'
    ccversion='', gccversion='3.3.2 20030908 (Debian prerelease)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lgdbm -ldb -ldl -lm -lpthread -lc -lcrypt
    perllibs=-ldl -lm -lpthread -lc -lcrypt
    libc=/lib/libc-2.3.2.so, so=so, useshrplib=true, libperl=libperl.so.5.8.0
    gnulibc_version='2.3.2'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynamic'
    cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'

Locally applied patches:

@INC for perl v5.8.0:
    /etc/perl
    /usr/local/lib/perl/5.8.0
    /usr/local/share/perl/5.8.0
    /usr/lib/perl5
    /usr/share/perl5
    /usr/lib/perl/5.8.0
    /usr/share/perl/5.8.0
    /usr/local/lib/site_perl
    .

Environment for perl v5.8.0:
    HOME=/home/bucko
    LANG=C
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/home/bucko/bin:/usr/local/jdk1.2.2/bin:/usr/local/bin:/usr/bin:/bin:/usr/bin/X11:/usr/games
    PERL_BADLANG (unset)
    SHELL=/bin/bash
```

p5pRT commented 21 years ago

From perl_dummy@bloodgate.com

Moin,

>     use threads;
>     use threads::shared;
>     our $a : shared = &share( [] );
>     print $a;
>     print $a;
>
> Gives:
>
>     ARRAY(0x813bbc4)ARRAY(0x813babc)

Interesting, does it give a different value each time you print it, or does it alternate between these two?

Best wishes,

Tels

p5pRT commented 21 years ago

From DavidBuckley@bigfoot.com

On 23 Sep 2003, Tels wrote:

> Interesting, does it give a different value each time you print it, or does it alternate between these two?

It seems to just alternate. Different threads can, however, appear to read different values, as shown here:

    [11:34:29] bucko(tank) ~$ perl -Mthreads -Mthreads::shared -e 'my $a : shared = &share({}); for(1..2){async{for(1..5) {print threads->tid()."$a\n"}}}sleep 2'
    1HASH(0x8272960)
    1HASH(0x8203778)
    1HASH(0x8272960)
    1HASH(0x8203778)
    1HASH(0x8272960)
    2HASH(0x82dabd8)
    2HASH(0x827b3c0)
    2HASH(0x82dabd8)
    2HASH(0x827b3c0)
    2HASH(0x82dabd8)

(Apologies for messy code)

My original code was using 5 processing threads, hence giving a selection of 10 values (presumably), which is what gave me the impression it was just allocating a new one each time.

Oddly enough, this kinda crushes my theory on Data::Dumper hitting an infinite loop.

Another interesting case:

    [11:40:10] bucko(tank) ~$ perl -Mthreads -Mthreads::shared -e 'my $a : shared; $a = \$a; for(1..5){print "$a\n" for (1..2); $a = $$a}'
    SCALAR(0x81f2db8)
    SCALAR(0x815054c)
    SCALAR(0x813bc0c)
    SCALAR(0x81f2db8)
    SCALAR(0x815054c)
    SCALAR(0x813bc0c)
    SCALAR(0x81f2db8)
    SCALAR(0x815054c)
    SCALAR(0x813bc0c)
    SCALAR(0x81f2db8)

This creates a third value; note how they still run cyclically. This could imply it's to do with whenever the variable gets dereferenced.

Finally, this one appears to have some /immense/ number of possible values:

    [11:43:56] bucko(tank) ~$ perl -Mthreads -Mthreads::shared -e 'my $a : shared = &share({}); $a->{a} = $a; $a->{b} = "f"; for(1..5){print "$_ $a $a->{a} $a->{b}\n" for (1..3); $a = $a->{a}}'
    1 HASH(0x81f2fa0) HASH(0x81f306c) f
    2 HASH(0x813babc) HASH(0x81f3084) f
    3 HASH(0x81f309c) HASH(0x81f303c) f
    1 HASH(0x81f2fa0) HASH(0x81f2fb8) f
    2 HASH(0x81f309c) HASH(0x814f2c4) f
    3 HASH(0x81f3030) HASH(0x81f3060) f
    1 HASH(0x81f2fa0) HASH(0x81f2ff4) f
    2 HASH(0x81f3030) HASH(0x81f30b4) f
    3 HASH(0x813bc0c) HASH(0x81f2fac) f
    1 HASH(0x81f2fa0) HASH(0x81f3054) f
    2 HASH(0x813bc0c) HASH(0x813babc) f
    3 HASH(0x81f3018) HASH(0x81f3090) f
    1 HASH(0x81f2fa0) HASH(0x81f303c) f
    2 HASH(0x81f3018) HASH(0x81f309c) f
    3 HASH(0x81f3078) HASH(0x81f3084) f

Note, however, that there /are/ patterns in the apparent mess.

A simple hash-based unique value finder tells me there's 12 values here, if you don't keep dereferencing $a into itself. Dereferencing $a gives "number of dereferences plus 2", i.e. 10002 possible values, in the following code:

    my %count;
    my $a : shared = &share({});
    $a->{a} = $a;
    for(1..10000) {
      $count{"$a"}++;
      $count{"$a->{a}"}++;
      $a = $a->{a};
    }
    print scalar keys %count;
    print "\n";

(Run through perl -Mthreads -Mthreads::shared)
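
For comparison, here is the same counting loop as a baseline sketch on an ordinary, non-shared hash (the variable names `$h` and `$unique` are mine). Without sharing, every stringification yields the same address, so the count stays at one no matter how many times the cycle is followed.

```perl
use strict;
use warnings;

my %count;
my $h = {};
$h->{a} = $h;                 # ordinary, non-shared self-referential hash

for (1 .. 10000) {
    $count{"$h"}++;           # stringified address of the current reference
    $count{"$h->{a}"}++;      # stringified address of the reference it holds
    $h = $h->{a};             # follow the cycle
}

my $unique = keys %count;
print "$unique\n";            # prints 1
```

Any count above 1 in the shared version therefore comes from the shared-variable layer, not from the loop itself.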

Bizarrely, when I reformatted the code like this:

    my %count;
    my $a : shared = &share({});
    $a->{a} = $a;
    for(1..10000) {
      my $temp = "$a";
      print "$_ $temp ";
      $count{ $temp }++;
      $temp = "$a->{a}";
      print "$temp\n";
      $count{ $temp }++;
      $a = $a->{a};
    }
    print scalar keys %count;
    print "\n";

I got 20000 values. Removing the my qualifier on $temp does nothing here, so it's presumably not to do with variable allocation. Removing the print statements also does nothing. Reformatting the body of the loop like this:

      $count{ $temp = "$a" }++;
      $count{ $temp = "$a->{a}" }++;
      $a = $a->{a};

Brings me back to 10002, however.

Perhaps scarily, the following body gives 20 unique values for the same 10000 loops:

      $count{ $temp = "$a" }++;
      $count{ $temp = "$a->{a}" }++;
      $count{ $temp = "$a->{a}->{a}" }++;
      $a = $a->{a};

Even more worrying, the following body gives /11/ unique values:

      $count{ $temp = "$a" }++;
      $count{ $temp = "$a->{a}" }++;
      $count{ $temp = "$a->{a}->{a}" }++;
      $a = $a->{a};
      $a = $a->{a};

Adding a further dereference brings it up to 14. Combining the two lines into one gives 1679 values. Consistently.

So, yes, I think I'll leave this up to someone who actually knows what the hell is going on here. :)

Apologies for the somewhat inconsistent code etc. I was basically just cut-n-paste playing around with the script, so some bits from elsewhere got left in places they shouldn't have been. (I'd test on other systems, but I only have Linux to play with.)

HTH

-- bucko

p5pRT commented 21 years ago

From perl_dummy@bloodgate.com

Moin David,

> It seems to just alternate. Different threads can, however, appear to read different values, as shown here:

Strange. But an idea particle suddenly hit me while reading this. There _was_ a memory leak with shared arrays in threads (or something along these lines, threads just scare me) and that should be fixed in 5.8.1-to-be. You could try RC5 from

  http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2003-09/msg01289.html

there or the latest bleadperl. If you could give these a spin and see if the issue persists, then that would be easier than trying to track down the mess.

Best wishes,

Te"Armchair Debugger Extraordinaire"ls

p5pRT commented 21 years ago

From DavidBuckley@bigfoot.com

On 23 Sep 2003, Tels wrote:

> Strange. But an idea particle suddenly hit me while reading this. There _was_ a memory leak with shared arrays in threads (or something along these lines, threads just scare me) and that should be fixed in 5.8.1-to-be. You could try RC5 from
>
>   http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2003-09/msg01289.html
>
> there or the latest bleadperl. If you could give these a spin and see if the issue persists, then that would be easier than trying to track down the mess.

Just did so, but the bug remains. I'm attaching configuration info below. (I used the bz2 file at the top of the message)

Inspired by your comments on the memory leak, I did some stats. Using the "hash counter" thing I gave at the end of the last message, I've brought Perl up to lots of memory usage by recursively dereferencing a shared hash, so it probably /is/ leaking memory.

Sample low-iteration run:

    [18:49:29] bucko(tank) ~/perl_5.8.1-RC5/bin$ ./perl -Mthreads -Mthreads::shared
    my %count;
    my $a : shared = &share({});
    $a->{a} = $a;
    for(1..10000) {
      $count{"$a"}++;
      $count{"$a->{a}"}++;
      $a = $a->{a};
    }
    print scalar keys %count;
    print "\n";
    10002

    Summary of my perl5 (revision 5.0 version 8 subversion 1) configuration:
      Platform:
        osname=linux, osvers=2.4.21, archname=i686-linux-thread-multi-ld
        uname='linux tank 2.4.21 #1 fri jul 25 00:22:37 bst 2003 i686 gnulinux '
        config_args=''
        hint=recommended, useposix=true, d_sigaction=define
        usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
        useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
        use64bitint=undef use64bitall=undef uselongdouble=define
        usemymalloc=n, bincompat5005=undef
      Compiler:
        cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -fno-strict-aliasing -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
        optimize='-O3',
        cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -fno-strict-aliasing -I/usr/local/include'
        ccversion='', gccversion='3.3.2 20030812 (Debian prerelease)', gccosandvers=''
        intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
        d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
        ivtype='long', ivsize=4, nvtype='long double', nvsize=12, Off_t='off_t', lseeksize=8
        alignbytes=4, prototype=define
      Linker and Libraries:
        ld='cc', ldflags =' -L/usr/local/lib'
        libpth=/usr/local/lib /lib /usr/lib
        libs=-lnsl -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
        perllibs=-lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
        libc=/lib/libc-2.3.2.so, so=so, useshrplib=false, libperl=libperl.a
        gnulibc_version='2.3.2'
      Dynamic Linking:
        dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynamic'
        cccdlflags='-fpic', lddlflags='-shared -L/usr/local/lib'

    Characteristics of this binary (from libperl):
      Compile-time options: MULTIPLICITY USE_ITHREADS USE_LONG_DOUBLE USE_LARGE_FILES PERL_IMPLICIT_CONTEXT
      Locally applied patches:
        RC5
      Built under linux
      Compiled at Sep 23 2003 18:39:06
      @INC:
        /home/bucko/perl_5.8.1-RC5/lib/5.8.1/i686-linux-thread-multi-ld
        /home/bucko/perl_5.8.1-RC5/lib/5.8.1
        /home/bucko/perl_5.8.1-RC5/lib/site_perl/5.8.1/i686-linux-thread-multi-ld
        /home/bucko/perl_5.8.1-RC5/lib/site_perl/5.8.1
        /home/bucko/perl_5.8.1-RC5/lib/site_perl
        .

-- bucko

p5pRT commented 21 years ago

From @iabyn

On Mon, Sep 22, 2003 at 05:01:19PM -0000, David Buckley wrote:

>     use threads;
>     use threads::shared;
>     use Data::Dumper;
>     our $a : shared;
>     $a = \$a;
>     print Data::Dumper::Dumper( $a );
>
> This produces a short pause, then a segfault.

The segfault is due to stack or memory exhaustion caused by infinite recursion. The problem is due to the 'proxy' per-thread variables failing to detect a loop caused by mg_get(). To explain further: each shared SV lives in a special shared area; each thread has a stub 'proxy' SV with magic. When the proxy SV is accessed, its magic is called, which locks the real SV, gets/sets its real value, and updates the proxy SV to reflect that value.

In the case of $a = \$a, the real SV is 'correct': it's an RV which points to itself. When the proxy (i.e. the $a in the main thread) is accessed, mg_get is called; it realises that the real $a is an RV, so it creates a tmp RV with shared magic and makes the proxy point to this. After the mg_get, you follow through to the thing the proxy $a is pointing to. This is the tmp SV with the shared magic. So you call mg_get on it, which repeats the process ad nauseam.
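
The behaviour described above can be modelled in plain Perl without threads. What follows is a toy sketch only, not the threads::shared internals: `%real`, `fetch_proxy` and `follow` are invented names, and the numeric id stands in for whatever identifies the real SV in the shared area. It shows why a check keyed on the proxy's address never sees a repeat, while a check keyed on the identity of the real variable does.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Toy model: one "real" shared cell that refers to itself, identified by a
# made-up id. This stands in for the real SV in the shared area.
my %real = (1 => { points_to => 1 });

# Each access builds a brand-new temporary proxy for the target cell,
# mimicking the temporary RV that mg_get() is described as creating.
sub fetch_proxy {
    my ($id) = @_;
    return { id => $real{$id}{points_to} };
}

# Follow the chain, using $key_of to decide whether a proxy was seen before.
# Returns the step at which a repeat is detected, or $limit if the loop
# ends without one.
sub follow {
    my ($key_of, $limit) = @_;
    my (%seen, @keep);              # @keep stops addresses from being reused
    my $proxy = fetch_proxy(1);
    for my $step (1 .. $limit) {
        return $step if $seen{ $key_of->($proxy) }++;
        push @keep, $proxy;
        $proxy = fetch_proxy($proxy->{id});
    }
    return $limit;
}

my $by_address = follow(sub { refaddr($_[0]) }, 1000);  # no repeat is ever seen
my $by_id      = follow(sub { $_[0]{id} },      1000);  # repeat seen at step 2
print "by address: $by_address steps, by id: $by_id steps\n";
```

In the model, only the artificial step limit stops the address-keyed walk, which matches the report that a maximum recursion depth avoids the crash.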

Note that this is a bug in the shared var implementation, not in Data::Dumper. The following code demonstrates:

    use threads;
    use threads::shared;

    my $a : shared;

    $a = \$a;

    my $x = $a;
    for (1..10) {
        print "$x\n";
        $x = $$x;
    }

outputs:

    scalar(0x81a3b08)
    scalar(0x81bbf00)
    scalar(0x8213930)
    scalar(0x81c41b8)
    scalar(0x81bbeb8)
    scalar(0x8207fd4)
    scalar(0x8211f14)
    scalar(0x8220c3c)
    scalar(0x81bacdc)
    scalar(0x8258a10)

I don't know whether the other parts of the bug report are related.

And no, I don't know how to fix it. (Well, I could see maybe how to fix it for the trivial $a=\$a case, but not for more general self-referential data structures.)

Dave.

-- "There's something wrong with our bloody ships today, Chatfield." Admiral Beatty at the Battle of Jutland, 31st May 1916.

p5pRT commented 19 years ago

From @schwern

[davem - Tue Sep 23 17:09:43 2003]:

> On Mon, Sep 22, 2003 at 05:01:19PM -0000, David Buckley wrote:
>
> >     use threads;
> >     use threads::shared;
> >     use Data::Dumper;
> >     our $a : shared;
> >     $a = \$a;
> >     print Data::Dumper::Dumper( $a );
> >
> > This produces a short pause, then a segfault.
>
> The segfault is due to stack or memory exhaustion caused by infinite recursion. The problem is due to the 'proxy' per-thread variables failing to detect a loop caused by mg_get().

This is still an issue in bleadperl@25129.

p5pRT commented 18 years ago

From @iabyn

On Thu, Jul 14, 2005 at 01:58:08AM -0700, Michael G Schwern via RT wrote:

> This is still an issue in bleadperl@25129.

Fixed in blead by change #26695

-- A walk of a thousand miles begins with a single step... then continues for another 1,999,999 or so.

p5pRT commented 18 years ago

@iabyn - Status changed from 'open' to 'resolved'