Open p5pRT opened 13 years ago
Consider the following test case:
---- snip ----
use strict;
use warnings;
use File::Copy qw( copy );

$SIG{INT} = sub { die };

copy( $ARGV[0], $ARGV[1] );
---- snip ----
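Then call it as "perl test.pl" with a source file and a target file as arguments and hit Ctrl-C while it runs. In 90% of the cases I get the following error message:
---- snip ----
[jschmidt@mstest tmp]$ perl test.pl yyy zzz
Died at test.pl line 5.
refcnt_dec: fd 4: 0 <= 0
Quit
---- snip ----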
and Perl hangs. I took a stack trace of the hanging Perl:
---- snip ----
[jschmidt@unix sbp]$ pstack 5127
#0 0x0022b402 in __kernel_vsyscall ()
#1 0x003a392e in __lll_mutex_lock_wait () from /lib/libpthread.so.0
#2 0x0039f79c in _L_mutex_lock_85 () from /lib/libpthread.so.0
#3 0x0039f2dd in pthread_mutex_lock () from /lib/libpthread.so.0
#4 0x0813dc7a in PerlIOUnix_refcnt_dec ()
#5 0x0813dd78 in PerlIOUnix_close ()
#6 0x0813e777 in PerlIOBase_close ()
#7 0x0813e7b1 in PerlIOBuf_close ()
#8 0x0813fa2b in Perl_PerlIO_close ()
#9 0x08120dc6 in Perl_io_close ()
#10 0x080e2eb4 in Perl_sv_clear ()
#11 0x080e31ba in Perl_sv_free2 ()
#12 0x0807abcf in Perl_gp_free ()
#13 0x080e2fcb in Perl_sv_clear ()
#14 0x080e31ba in Perl_sv_free2 ()
#15 0x080fd89b in Perl_leave_scope ()
#16 0x0807188b in S_my_exit_jump ()
#17 0x0807194f in Perl_my_failure_exit ()
#18 0x080bcaa1 in Perl_vcroak ()
#19 0x080bcb25 in Perl_croak ()
#20 0x0813dd30 in PerlIOUnix_refcnt_dec ()
#21 0x0813dd78 in PerlIOUnix_close ()
#22 0x0813e777 in PerlIOBase_close ()
#23 0x0813e7b1 in PerlIOBuf_close ()
#24 0x0813fa2b in Perl_PerlIO_close ()
#25 0x08120dc6 in Perl_io_close ()
#26 0x080e2eb4 in Perl_sv_clear ()
#27 0x080e31ba in Perl_sv_free2 ()
#28 0x0807abcf in Perl_gp_free ()
#29 0x080e2fcb in Perl_sv_clear ()
#30 0x080e31ba in Perl_sv_free2 ()
#31 0x080fd89b in Perl_leave_scope ()
#32 0x0807188b in S_my_exit_jump ()
#33 0x0807194f in Perl_my_failure_exit ()
#34 0x081096df in Perl_die_where ()
#35 0x080bf996 in Perl_vdie ()
#36 0x080bfa45 in Perl_die ()
#37 0x080c075f in Perl_sighandler ()
#38 0x080c0437 in Perl_despatch_signals ()
#39 0x0813dd9c in PerlIOUnix_close ()
#40 0x0813e777 in PerlIOBase_close ()
#41 0x0813e7b1 in PerlIOBuf_close ()
#42 0x0813fa2b in Perl_PerlIO_close ()
#43 0x08120dc6 in Perl_io_close ()
#44 0x08120f73 in Perl_do_close ()
#45 0x0811411e in Perl_pp_close ()
#46 0x080cfde9 in Perl_runops_standard ()
#47 0x08074c79 in perl_run ()
#48 0x0805fe4d in main ()
---- snip ----
Here is a listing of the file descriptors at hang time (yyy being the source file of the copy function):
---- snip ----
[jschmidt@unix sbp]$ ls -al /proc/5127/fd/
total 0
dr-x------ 2 sbpadm oraic  0 Dec 20 09:26 .
dr-xr-xr-x 5 sbpadm oraic  0 Dec 20 09:26 ..
lrwx------ 1 sbpadm oraic 64 Dec 20 09:27 0 -> /dev/pts/25
lrwx------ 1 sbpadm oraic 64 Dec 20 09:27 1 -> /dev/pts/25
lrwx------ 1 sbpadm oraic 64 Dec 20 09:26 2 -> /dev/pts/25
lr-x------ 1 sbpadm oraic 64 Dec 20 09:27 3 -> /net/sapmnt.oraicall/sbpmstest/tmp/yyy
---- snip ----
I tried to simulate what File::Copy::copy does in my test program as well, using various methods (open, sysopen) to access the files, but the error occurred in all cases.
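For illustration, a minimal sketch of what such a hand-rolled copy might look like, assuming a plain buffered read/print loop. The actual test code is not part of this report; `copy_by_hand` and the 64 KiB buffer size are hypothetical choices made for this example only.

```perl
use strict;
use warnings;

# Same handler as in the test case above: die when Ctrl-C arrives.
$SIG{INT} = sub { die };

# Hypothetical helper (not from the report): copy $src to $dst with
# plain open/read/print/close calls instead of File::Copy::copy.
sub copy_by_hand {
    my ( $src, $dst ) = @_;

    open my $in,  '<', $src or die "Cannot open $src: $!";
    open my $out, '>', $dst or die "Cannot open $dst: $!";
    binmode $in;
    binmode $out;

    my $buf;
    while (1) {
        my $n = read( $in, $buf, 64 * 1024 );    # assumed buffer size
        die "read failed: $!" unless defined $n;
        last if $n == 0;                          # end of file
        print {$out} $buf or die "write failed: $!";
    }

    # The trace above shows the signal being dispatched inside close.
    close $out or die "Cannot close $dst: $!";
    close $in  or die "Cannot close $src: $!";
    return 1;
}

# Only run when called with a source and a target file.
copy_by_hand( @ARGV[ 0, 1 ] ) if @ARGV == 2;
```

Called the same way as the test case (source file, then target file), this exercises the same open and close paths without going through File::Copy.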
Reproduces in perl 5.12.2 as well (same host as above).
jens.schmidt35@arcor.de - Status changed from 'new' to 'open'
jens.schmidt35@arcor.de - Status changed from 'open' to 'new'
On Mon, Dec 20, 2010 at 12:44:33AM -0800, jens.schmidt35@arcor.de wrote:
Consider the following test case:
---- snip ----
use strict;
use warnings;
use File::Copy qw( copy );

$SIG{INT} = sub { die };

copy( $ARGV[0], $ARGV[1] );
---- snip ----
Then call it as "perl test.pl" with a source file and a target file as arguments and hit Ctrl-C while it runs. In 90% of the cases I get the following error message:
---- snip ----
[jschmidt@mstest tmp]$ perl test.pl yyy zzz
Died at test.pl line 5.
refcnt_dec: fd 4: 0 <= 0
Quit
---- snip ----
and Perl hangs. I took a stack trace of the hanging Perl:
I suspect this bug (recursively calling into PerlIO via a signal handler) was fixed by commit abf9167d3fff002ddaed53abb44d638387bca978, which is included in perl-5.14.0-RC1.
However, I was unable to reproduce this with 5.10.1 or any other perl, so I can't confirm that it's been fixed.
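To illustrate the suspected mechanism: in the stack trace, the handler's die runs while PerlIOUnix_close is still in progress, and the exit cleanup then tries to close the same handle again. A minimal sketch of one way a script could sidestep that path, assuming it is acceptable to notice the interrupt only after copy returns, is to have the handler set a flag instead of dying. This is a possible workaround for illustration, not the fix from the commit mentioned above; `$interrupted` and `copy_checked` are hypothetical names.

```perl
use strict;
use warnings;
use File::Copy qw( copy );

# Hypothetical flag: the handler only records the signal and returns,
# so nothing unwinds the stack while PerlIO is closing a handle.
my $interrupted = 0;
$SIG{INT} = sub { $interrupted = 1 };

# Hypothetical wrapper: run the copy, then act on the flag once all
# handles opened by copy have been closed normally.
sub copy_checked {
    my ( $src, $dst ) = @_;

    my $ok = copy( $src, $dst );
    die "Interrupted while copying $src\n" if $interrupted;
    die "copy failed: $!\n" unless $ok;
    return 1;
}

# Only run when called with a source and a target file.
copy_checked( @ARGV[ 0, 1 ] ) if @ARGV == 2;
```

The trade-off is that a long copy is not cut short by Ctrl-C; the interrupt only takes effect once copy has returned.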
-- Spock (or Data) is fired from his high-ranking position for not being able to understand the most basic nuances of about one in three sentences that anyone says to him. -- Things That Never Happen in "Star Trek" #19
The RT System itself - Status changed from 'new' to 'open'
Thanks for looking into this.
Unfortunately, the problem still reproduces with perl-5.14.0-RC1.
I did some more tests (both with perl 5.10 and 5.14), and the problem reproduces most easily (80-90%) if both the source and the target file of the copy are located on an NFS-mounted volume. It reproduces less easily (10-20%) if only the target file is on an NFS-mounted volume. (NetApp Filer over a 1 GBit network.)
It does not reproduce at all if both source and target are on a local disk.
Hope that helps.
On Tue Apr 26 06:21:18 2011, davem wrote:
On Mon, Dec 20, 2010 at 12:44:33AM -0800, jens.schmidt35@arcor.de wrote:
I suspect this bug (recursively calling into PerlIO via a signal handler) was fixed by commit abf9167d3fff002ddaed53abb44d638387bca978, which is included in perl-5.14.0-RC1.
Oops, I only just noticed the "Reply" button in perlbug, sorry. I had already replied to your suspicion in the bug report, but not via that button, so I am adding this reply to ensure that my previous update doesn't get lost.
Sorry for the inconvenience.
This even seems to be NFS-server dependent. We got a new NFS server a year ago, and now the problem no longer reproduces. I'll check whether I can find a server where the problem reproduces.
Migrated from rt.perl.org#81000 (status was 'open')
Searchable as RT81000$