Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/
Other
1.92k stars 548 forks source link

New thread.t test; reveals Queue bugs on multiprocessor systems #913

Closed p5pRT closed 20 years ago

p5pRT commented 24 years ago

Migrated from rt.perl.org#1848 (status was 'resolved')

Searchable as RT1848$

p5pRT commented 24 years ago

From rkc@sst.ll.mit.edu

----------------------------------------------------------------- The following (almost) drop-in replacement for thread.t test program reveals bugs in Thread​:Queue that are indicative of a race condition. The bug only appears on a dual-processor system; single processor systems do not exhibit this behavior. I have more complicated versions of the included tests (not in this report) that can cause perl to dump core. Unfortunately\, the core file gets corrupted as it is formed (why--I don't know) and is not readable by gdb. If someone develops a patch to make this simple test program work I will be happy to apply it and try the more complicated example.

Beware​: I am using a version of 5.00562 that includes Brian Mancuso's thread/regexp patch. I do not believe that this is the cause of the problem\, but it should be easy enough for others to check. I do not have an unpatched 5.00562 on my system\, as it is unuseable for my development.

Sample output (from Solaris 2.6\, dual-processor system\, details below)​: % ./thread.t 1..21 ok 1 ok 2 ok 3 ok 4 ok 5 ok 6 ok 7 ok 8 ok 9 ok 10 ok 11 ok 12 ok 13 ok 14 ok 15 ok 16 ok 17 ok 18 ok 19 13 Got 30\, expected 1 13 Got 531\, expected 2 13 Got 3\, expected 528 13 Got 22\, expected 613 13 Got 31\, expected 614 14 Got 626\, expected 45 14 Got 645\, expected 52 14 Got 665\, expected 65 13 Got 67\, expected 671 13 Got 75\, expected 672 13 Got 84\, expected 673 14 Got 702\, expected 117 13 Got 119\, expected 709 13 Got 127\, expected 710 14 Got 720\, expected 141 13 Got 143\, expected 727 14 Got 729\, expected 150 13 Got 152\, expected 736 13 Got 160\, expected 737 14 Got 753\, expected 180 13 Got 182\, expected 760 13 Got 190\, expected 761 14 Got 763\, expected 196 13 Got 198\, expected 770 14 Got 833\, expected 265 14 Got 852\, expected 277 14 Got 859\, expected 278 14 Got 866\, expected 279 14 Got 873\, expected 280 14 Got 880\, expected 281 14 Got 887\, expected 282 13 Got 284\, expected 893 14 Got 946\, expected 341 14 Got 972\, expected 360 14 Got 30\, expected 988 0 (2) Result is 0 15 Got 587\, expected 582 Attempt to free unreferenced scalar at ./thread.t line 138 thread 15. 0 (3) Result is 0 not ok 20...single source/multi sink queue ok 21

thread.t source code​: #!/opt/local/bin/perlthr

BEGIN {   chdir 't' if -d 't';   unshift @​INC\, '../lib';   require Config; import Config;   if (! $Config{'usethreads'}) {   print "1..0 # Skip​: this perl is not threaded\n";   exit 0;   }

  # XXX known trouble with global destruction   $ENV{PERL_DESTRUCT_LEVEL} = 0 unless $ENV{PERL_DESTRUCT_LEVEL} > 3; } $| = 1; print "1..21\n"; use Thread; use Thread​::Queue; print "ok 1\n";

sub content { print shift; return shift; }

# create a thread passing args and immedaietly wait for it. my $t = new Thread \&content\,("ok 2\n"\,"ok 3\n"\, 1..1000); print $t->join;

# check that lock works ... {lock $foo; $t = new Thread sub { lock $foo; print "ok 5\n" }; print "ok 4\n"; } $t->join;

sub dorecurse { my $val = shift; my $ret; print $val; if (@​_)   {   $ret = Thread->new(\&dorecurse\, @​_);   $ret->join;   } }

$t = new Thread \&dorecurse\, map { "ok $_\n" } 6..10; $t->join;

# test that sleep lets other thread run $t = new Thread \&dorecurse\,"ok 11\n"; sleep 6; print "ok 12\n"; $t->join;

sub islocked : locked { my $val = shift; my $ret; print $val; if (@​_)   {   $ret = Thread->new(\&islocked\, shift);   } $ret; }

$t = Thread->new(\&islocked\, "ok 13\n"\, "ok 14\n"); $t->join->join;

{   package Loch​::Ness;   sub new { bless []\, shift }   sub monster : locked\, method {   my($s\, $m) = @​_;   print "ok $m\n";   }   sub gollum { &monster } } Loch​::Ness->monster(15); Loch​::Ness->new->monster(16); Loch​::Ness->gollum(17); Loch​::Ness->new->gollum(18);

# # Added by RKC # Check that queues work

# First one source and one sink # (Borrowed from the tutorial) my @​data = 1..1000; my $element; my $DataQueue = new Thread​::Queue; my $thr; my $result; sub q_thread{   my $status = 1;   my $i = 0;   while ($DataElement = $DataQueue->dequeue) {   if ($data[$i++] != $DataElement){   $status = 0;   }   }   return $status; };

$thr = new Thread \&q_thread; foreach $element (@​data) {   $DataQueue->enqueue($element); } yield; $DataQueue->enqueue(undef); $result = eval {$thr->join}; if ($result) { print "ok 19\n"; } else { print "not ok 19...single source/sink queue\n";}

# Now one source and several sinks # (Borrowed from the tutorial) my @​DataQueue; my @​Threads; my $sinks = 4;

sub qn_thread{   my $tid = shift;   my $status = 1;   my $i = 0;   my $dq = $DataQueue[$tid]; # Can't use self value--tid keeps incrementing   while ($DataElement = $dq->dequeue) {   yield;   if ($data[$i++] != $DataElement){   # Remove this debugging line once this bug is fixed   print Thread->self->tid\," Got $DataElement\, expected $data[$i-1]\n";   $status = 0;   }   }   return $status; };

for ($i=0;$i\<$sinks;$i++){
  $DataQueue[$i] = new Thread​::Queue;   $Threads[$i] = new Thread \&qn_thread\, $i; }

$result = 1; for ($i=0;$i\<$sinks;$i++){
  foreach $element (@​data) {   $DataQueue[$i]->enqueue($element);   }   $DataQueue[$i]->enqueue(undef); } for ($i=0;$i\<$sinks;$i++){
  # Remove this debugging line once this bug is fixed   print Thread->self->tid\," ($i) Result is $result\n" if (! $result);   $result &= eval {$Threads[$i]->join}; } if ($result) { print "ok 20\n"; } else { print "not ok 20...single source/multi sink queue\n";}

# Verify that we can detach and still finish sub sleeper{   sleep(1); } $thr = new Thread \&sleeper; $thr->detach; # I add this sleep here so that the message below is less likely # to appear when we're just waiting sleep(2);
# Waiting for threads to finish... # There really shouldn't be anything {   my @​list = Thread->list();   while (@​list > 1) {   print STDERR "Thread test​: "\,scalar(@​list)\," threads remaining\n";   sleep (5);   @​list = Thread->list();   } } print "ok 21\n";

=============================== End of thread.t

Perl Info ``` Site configuration information for perl 5.00562: Configured by mwinship at Mon Nov 29 14:17:09 EST 1999. Summary of my perl5 (revision 5.0 version 5 subversion 62) configuration: Platform: osname=solaris, osvers=2.6, archname=sun4-solaris-thread uname='sunos gemini 5.6 generic_105181-16 sun4u sparc sunw,ultra-1 ' config_args='-Dusethreads' hint=recommended, useposix=true, d_sigaction=define usethreads=define useperlio=undef d_sfio=undef use64bits=undef usemultiplicity=undef Compiler: cc='gcc', optimize='-O', gccversion=2.7.2.3 cppflags='-D_REENTRANT -I/usr/local/include -I/opt/local/include' ccflags ='-D_REENTRANT -I/usr/local/include -I/opt/local/include' stdchar='unsigned char', d_stdstdio=define, usevfork=false intsize=4, longsize=4, ptrsize=4, doublesize=8 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16 alignbytes=8, usemymalloc=y, prototype=define Linker and Libraries: ld='gcc', ldflags =' -L/usr/local/lib -L/opt/local/lib' libpth=/usr/local/lib /opt/local/lib /lib /usr/lib /usr/ccs/lib libs=-lsocket -lnsl -ldb -ldl -lm -lposix4 -lpthread -lc -lcrypt -lsec libc=/lib/libc.so, so=so, useshrplib=false, libperl=libperl.a Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=, d_dlsymun=undef, ccdlflags=' ' cccdlflags='-fPIC', lddlflags='-G -L/usr/local/lib -L/opt/local/lib' Locally applied patches: @INC for perl 5.00562: /usr/local/lib/perl5/5.00562/sun4-solaris-thread /usr/local/lib/perl5/5.00562 /usr/local/lib/site_perl/5.00562/sun4-solaris-thread /usr/local/lib/site_perl . Environment for perl 5.00562: HOME=/home/rkc LANG (unset) LANGUAGE (unset) LD_LIBRARY_PATH=/usr/openwin/lib:/usr/dt/lib:/usr/local/lib:/opt/local/gnu/lib:/opt/local/SUNWspro/SC4.2/lib:/opt/local/rvplayer5.0 LOGDIR (unset) PATH=/home/rkc/bin:/opt/local/hycd/bin/SOLARIS26:/usr/bin:/usr/sbin:/usr/openwin/bin:/usr/ccs/bin:/opt/local/gnu/bin:/opt/local/bin:/opt/local/hosts:/opt/local/esps/bin:/opt/local/matlab/bin:/opt/local/mule/bin:/opt/local/SUNprint/bin:/opt/local/bin/transcript:/opt/local/frame/bin:/opt/local/SUNWspro/bin:/opt/local/ace/prog:/opt/local/netaudio/bin:/opt/local/budtool/bin:/opt/local/SoftWindows/bin:/opt/local/rvplayer5.0:/opt/local/java/HotJava/bin:/opt/local/HTK_V2.2/bin:/u/maz/langid/bin:/usr/local/etc:/usr/local/mysql/bin:/opt/local/transcriber/bin:/opt/local/SunOS4/bin:/usr/ucb:/data/id/bin:/data/nn0/kukolich/lnknet/bin:/data/id2/rpl/bin PERL_BADLANG (unset) SHELL=/opt/local/bin/tcsh ```
p5pRT commented 24 years ago

From [Unknown Contact. See original ticket]

Update​: I just ran the test program many times on a uniprocessor system; the bug does appear on the single processor system. It just appears much more sporadically.

Rob

Rob Cunningham wrote​:

Your e-mail has been received by the Perl Bug Squashing Team.

Ticket ID '[ID 19991203.003]' has been assigned. Please include this ticket ID in the subject line of any followup messages related to this issue.

This is an automatic confirmation message. -- Perl Bug Squashing Team perlbug@​perl.org

-- Dr. Robert K. Cunningham Information System Technology Group   rkc@​ll.mit.edu MIT Lincoln Laboratory
*** My comments\, my opinions​: my responsibility. PGP key available from http​://pgpkeys.mit.edu​:11371

p5pRT commented 24 years ago

From [Unknown Contact. See original ticket]

At 02​:43 PM 12/3/99 -0500\, Rob Cunningham wrote​:

The following (almost) drop-in replacement for thread.t test program reveals bugs in Thread​:Queue that are indicative of a race condition.

I'm not entirely sure that it does. Mainly because of this​:

sub qn_thread{ my $tid = shift; my $status = 1; my $i = 0; my $dq = $DataQueue[$tid]; # Can't use self value--tid keeps incrementing while ($DataElement = $dq->dequeue) {

$DataElement's not a lexical\, so all the threads are sharing a single package variable and will eventually stomp on it in evil ways.

  Dan

----------------------------------------"it's like this"------------------- Dan Sugalski even samurai dan@​sidhe.org have teddy bears and even   teddy bears get drunk

p5pRT commented 24 years ago

From [Unknown Contact. See original ticket]

Dan\, Thanks for pointing out my error...So much for my "simple" test program. (Rob fixes..retests...)

It turns out that if you define DataQueue as a "my" variable\, a similar problem happens\, although much less frequently. The following occurred on a dual-processor Solaris system (details in first message) after about four runs. I ran the same test sixty times on a uniprocessor Solaris system\, but never saw the same behavior.

% ./thread.t 1..21 ok 1 ok 2 ok 3 ok 4 ok 5 ok 6 ok 7 ok 8 ok 9 ok 10 ok 11 ok 12 ok 13 ok 14 ok 15 ok 16 ok 17 ok 18 ok 19 15 Got 114\, expected 71 15 Got 396\, expected 114 Attempt to free unreferenced scalar at ./thread.t line 141 thread 15. Segmentation fault (core dumped) % gdb perl gdb perl GDB is free software and you are welcome to distribute copies of it under certain conditions; type "show copying" to see the conditions. There is absolutely no warranty for GDB; type "show warranty" for details. GDB 4.16 (sparc-sun-solaris2.5)\, Copyright 1996 Free Software Foundation\, Inc... (no debugging symbols found)... (gdb) core core core core "/data/id/rkc/devel/core" is not a core dump​: File format not recognized (gdb) quit quit draco​:devel% gdb perlthr gdb perlthr GDB is free software and you are welcome to distribute copies of it under certain conditions; type "show copying" to see the conditions. There is absolutely no warranty for GDB; type "show warranty" for details. GDB 4.16 (sparc-sun-solaris2.5)\, Copyright 1996 Free Software Foundation\, Inc...

(gdb) core core core core "/data/id/rkc/devel/core" is not a core dump​: File format not recognized (gdb) quit

================================================ Updated thread.t test program. #!/opt/local/bin/perlthr

BEGIN {   chdir 't' if -d 't';   unshift @​INC\, '../lib';   require Config; import Config;   if (! $Config{'usethreads'}) {   print "1..0 # Skip​: this perl is not threaded\n";   exit 0;   }

  # XXX known trouble with global destruction   $ENV{PERL_DESTRUCT_LEVEL} = 0 unless $ENV{PERL_DESTRUCT_LEVEL} > 3; } $| = 1; print "1..21\n"; use Thread; use Thread​::Queue; print "ok 1\n";

sub content { print shift; return shift; }

# create a thread passing args and immedaietly wait for it. my $t = new Thread \&content\,("ok 2\n"\,"ok 3\n"\, 1..1000); print $t->join;

# check that lock works ... {lock $foo; $t = new Thread sub { lock $foo; print "ok 5\n" }; print "ok 4\n"; } $t->join;

sub dorecurse { my $val = shift; my $ret; print $val; if (@​_)   {   $ret = Thread->new(\&dorecurse\, @​_);   $ret->join;   } }

$t = new Thread \&dorecurse\, map { "ok $_\n" } 6..10; $t->join;

# test that sleep lets other thread run $t = new Thread \&dorecurse\,"ok 11\n"; sleep 6; print "ok 12\n"; $t->join;

sub islocked : locked { my $val = shift; my $ret; print $val; if (@​_)   {   $ret = Thread->new(\&islocked\, shift);   } $ret; }

$t = Thread->new(\&islocked\, "ok 13\n"\, "ok 14\n"); $t->join->join;

{   package Loch​::Ness;   sub new { bless []\, shift }   sub monster : locked\, method {   my($s\, $m) = @​_;   print "ok $m\n";   }   sub gollum { &monster } } Loch​::Ness->monster(15); Loch​::Ness->new->monster(16); Loch​::Ness->gollum(17); Loch​::Ness->new->gollum(18);

# # Added by RKC # Check that queues work

# First one source and one sink # (Borrowed from the tutorial) my @​data = 1..1000; my $element; my $DataQueue = new Thread​::Queue; my $thr; my $result; sub q_thread{   my $status = 1;   my $i = 0;   my $DataElement;   while ($DataElement = $DataQueue->dequeue) {   if ($data[$i++] != $DataElement){   $status = 0;   }   }   return $status; };

$thr = new Thread \&q_thread; foreach $element (@​data) {   $DataQueue->enqueue($element); } yield; $DataQueue->enqueue(undef); $result = eval {$thr->join}; if ($result) { print "ok 19\n"; } else { print "not ok 19...single source/sink queue\n";}

# Now one source and several sinks # (Borrowed from the tutorial) my @​DataQueue; my @​Threads; my $sinks = 4;

sub qn_thread{   my $tid = shift;   my $status = 1;   my $i = 0;   my $DataElement;   my $dq = $DataQueue[$tid]; # Can't use self value--tid keeps incrementing   while ($DataElement = $dq->dequeue) {   yield;   if ($data[$i++] != $DataElement){   # Remove this debugging line once this bug is fixed   print Thread->self->tid\," Got $DataElement\, expected $data[$i-1]\n";   $status = 0;   }   }   return $status; };

for ($i=0;$i\<$sinks;$i++){   $DataQueue[$i] = new Thread​::Queue;   $Threads[$i] = new Thread \&qn_thread\, $i; }

$result = 1; for ($i=0;$i\<$sinks;$i++){   foreach $element (@​data) {   $DataQueue[$i]->enqueue($element);   }   $DataQueue[$i]->enqueue(undef); } for ($i=0;$i\<$sinks;$i++){   # Remove this debugging line once this bug is fixed   print Thread->self->tid\," ($i) Result is $result\n" if (! $result);   $result &= eval {$Threads[$i]->join}; } if ($result) { print "ok 20\n"; } else { print "not ok 20...single source/multi sink queue\n";}

# Verify that we can detach and still finish sub sleeper{   sleep(1); } $thr = new Thread \&sleeper; $thr->detach; # I add this sleep here so that the message below is less likely # to appear when we're just waiting sleep(2);
# Waiting for threads to finish... # There really shouldn't be anything {   my @​list = Thread->list();   while (@​list > 1) {   print STDERR "Thread test​: "\,scalar(@​list)\," threads remaining\n";   sleep (5);   @​list = Thread->list();   } } print "ok 21\n";

Dan Sugalski wrote​:

At 02​:43 PM 12/3/99 -0500\, Rob Cunningham wrote​:

The following (almost) drop-in replacement for thread.t test program reveals bugs in Thread​:Queue that are indicative of a race condition.

I'm not entirely sure that it does. Mainly because of this​:

sub qn_thread{ my $tid = shift; my $status = 1; my $i = 0; my $dq = $DataQueue[$tid]; # Can't use self value--tid keeps incrementing while ($DataElement = $dq->dequeue) {

$DataElement's not a lexical\, so all the threads are sharing a single package variable and will eventually stomp on it in evil ways.

                                    Dan

----------------------------------------"it's like this"------------------- Dan Sugalski even samurai dan@​sidhe.org have teddy bears and even teddy bears get drunk

-- Dr. Robert K. Cunningham Information System Technology Group   rkc@​ll.mit.edu MIT Lincoln Laboratory
*** My comments\, my opinions​: my responsibility. PGP key available from http​://pgpkeys.mit.edu​:11371