Perl / perl5

šŸŖ The Perl programming language
https://dev.perl.org/perl5/
Other
1.98k stars 559 forks source link

File handle doesn't close when out of scope due to \shift #15052

Open p5pRT opened 9 years ago

p5pRT commented 9 years ago

Migrated from rt.perl.org#126676 (status was 'open')

Searchable as RT126676$

p5pRT commented 9 years ago

From @toddr

Created by @toddr

While preparing to move to 5.22\, we noticed a bug which manifests due to our use of \shift. While \shift seems to be a special kind of evil\, Perl does support it so I'd call this a bug for the time being. In the below example\, Perl never decreases the refcount of $fh when it is passed to another subroutine and the passed variable (not a copy of it) is directly used.

This bisects to 7bdb4ff0943cf93297712faf504cdd425426e57f introduced in 5.21.4

If you use $_[0] directly in the subroutine instead of assging $rv to \shift and then accessing $$rv\, then the reference count is 1 after nested2 returns. Also there's a comment in the code below that if you do a print `ls -ld $file`\, then the file has the right file size. Though I'm unsure if it's because something fixes the refcount as a result of the `` action or that some IO buffer ends up getting flushed as a result of the action. I guess we could set a DESTROY on $fh if we wanted to debug that further.

NOTE​: The use of \shift is also currently breaking Devel​::Cover very badly even in 5.14​: https://github.com/pjcj/Devel--Cover/issues/125 I have not investigated if the 5.21.4 change actually fixes the bug in B/Devel​::Cover or if there is also another B bug related to this.

Example code also used to bisect​:

===== foo.pl #!perl

use B qw/svref_2object/;

my $file = "/tmp/file";

sub nested2 {   my ( $rv ) = ( \shift );   open( $$rv\, ">"\, $file) or die; }

sub nested1 {   my $fh;   nested2($fh);   print {$fh} "contents";

  # The ref count on $fh should be 1 now..?   my $obj = svref_2object($fh);   my $count = $obj->REFCNT();   return $count; }

unlink $file; my $refcount = nested1();

$refcount == 1 or die("Ref count should have been 1 after returning from nested2 but it's $refcount instead?");

# If you uncomment this line and comment the above die\, -z is false in perl 5.22 as desired. #print `ls -l $file`; -z $file and die("File has not been closed yet!");

Perl Info ``` Flags: category=core severity=medium Site configuration information for perl 5.22.1: Configured by Todd at Fri Nov 13 01:11:55 CST 2015. Summary of my perl5 (revision 5 version 22 subversion 1) configuration: Platform: osname=linux, osvers=2.6.32-431.29.2.el6.i686, archname=i386-linux-64int uname='linux rpmb-32-centos-65 2.6.32-431.29.2.el6.i686 #1 smp tue sep 9 20:14:52 utc 2014 i686 i686 i386 gnulinux ' config_args='-des -Dusedevel -Darchname=i386-linux-64int -Dcc=/usr/bin/gcc -Dcpp=/usr/bin/cpp -DDEBUGGING=-g -Duselargefiles=yes -Duseposix=true -Dhint=recommended -Duseperlio=yes -Dccflags=-I/usr/local/perl/522/include -L/usr/local/perl/522/lib -I/usr/local/include -L/usr/local/lib -Dcppflags=-I/usr/local/perl/522/include -L/usr/local/perl/522/lib -I/usr/local/include -L/usr/local/lib -Dldflags=-Wl,-rpath -Wl,/usr/local/perl/522/lib -L/usr/local/perl/522/lib -L/usr/local/lib -Dprefix=/usr/local/perl/522 -Dsiteprefix=/opt/perl5/522 -Dsitebin=/opt/perl5/522/bin -Dsitelib=/opt/perl5/522/site_lib -Dusevendorprefix=true -Dvendorbin=/usr/local/perl/522/bin -Dvendorprefix=/usr/local/perl/522/lib/perl5 -Dvendorlib=/usr/local/perl/522/lib/perl5/vendor_lib -Dprivlib=/usr/local/perl/522/lib/perl5/5.22.1 -Dman1dir=none -Dman3dir=none -Dscriptdir=/usr/local/perl/522/bin -Dscriptdirexp=/usr/local/perl/522/bin -Dsiteman1dir=none -Dsiteman3dir=none -Dinstallman1dir=none -Dversiononly=no -Dinstallusrbinperl=no -Dcf_by=Todd -Dmyhostname=localhost -Dperladmin=root@localhost -Di_dbm=/usr/local/include -Di_gdbm=/usr/local/include -Di_ndbm=/usr/local/include -Ud_dosuid -Uuserelocatableinc -Umad -Uusethreads -Uusemultiplicity -Uusesocks -Uuselongdouble -Ui_db -Aldflags=-L/usr/local/perl/522/lib -L/usr/local/lib -L/usr/lib -L/lib -lgdbm -Dlocincpth=/usr/local/perl/522/include /usr/local/include /usr/local/include -Duse64bitint -Uuse64bitall -Acflags=-fPIC -DPIC -m32 -I/usr/local/perl/522/include -I/usr/local/include -Dlibpth=/usr/local/perl/522/lib /usr/local/lib /usr/local/lib /lib /usr/lib ' hint=recommended, useposix=true, d_sigaction=define useithreads=undef, usemultiplicity=undef use64bitint=define, use64bitall=undef, uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc='/usr/bin/gcc', ccflags ='-I/usr/local/perl/522/include -L/usr/local/perl/522/lib -I/usr/local/include -L/usr/local/lib -fwrapv -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_FORTIFY_SOURCE=2', optimize='-O2 -g', cppflags='-I/usr/local/perl/522/include -L/usr/local/perl/522/lib -I/usr/local/include -L/usr/local/lib -I/usr/local/perl/522/include -L/usr/local/perl/522/lib -I/usr/local/include -L/usr/local/lib -fwrapv -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include' ccversion='', gccversion='4.4.7 20120313 (Red Hat 4.4.7-4)', gccosandvers='' intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=12345678, doublekind=3 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12, longdblkind=3 ivtype='long long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8 alignbytes=4, prototype=define Linker and Libraries: ld='/usr/bin/gcc', ldflags ='-Wl,-rpath -Wl,/usr/local/perl/522/lib -L/usr/local/perl/522/lib -L/usr/local/lib -L/usr/local/perl/522/lib -L/usr/local/lib -L/usr/lib -L/lib -lgdbm -fstack-protector -L/usr/local/lib' libpth=/usr/local/perl/522/lib /usr/local/lib /usr/local/lib /lib /usr/lib /usr/local/lib /usr/lib libs=-lpthread -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc libc=libc-2.12.so, so=so, useshrplib=false, libperl=libperl.a gnulibc_version='2.12' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E' cccdlflags='-fPIC', lddlflags='-shared -O2 -g -L/usr/local/perl/522/lib -L/usr/local/lib -L/usr/lib -L/lib -L/usr/local/lib -fstack-protector' Locally applied patches: RC1 @INC for perl 5.22.1: /usr/local/perl/522/lib/perl5/vendor_lib/i386-linux-64int /usr/local/perl/522/lib/perl5/vendor_lib /usr/local/perl/522/lib/perl5/5.22.1/i386-linux-64int /usr/local/perl/522/lib/perl5/5.22.1 /opt/perl5/522/site_lib/i386-linux-64int /opt/perl5/522/site_lib . Environment for perl 5.22.1: HOME=/root LANG=en_US.UTF-8 LANGUAGE (unset) LD_LIBRARY_PATH (unset) LOGDIR (unset) PATH=/usr/local/perl/522/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin: PERL_BADLANG (unset) SHELL=/bin/zsh ```
p5pRT commented 9 years ago

From @ap

* Todd Rinaldo \perlbug\-followup@​perl\.org [2015-11-18 19​:45]​:

# If you uncomment this line and comment the above die\, -z is false in perl 5.22 as desired. #print `ls -l $file`; -z $file and die("File has not been closed yet!");

$ perldoc -f system | perl -00ne 'print if /flush/'   Beginning with v5.6.0\, Perl will attempt to flush all files opened   for output before any operation that may do a fork\, but this may not   be supported on some platforms (see perlport). To be safe\, you may   need to set $| ($AUTOFLUSH in English) or call the "autoflush()"   method of "IO​::Handle" on any open handles.

p5pRT commented 9 years ago

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented 9 years ago

From @toddr

On Wed Nov 18 17​:02​:21 2015\, aristotle wrote​:

$ perldoc -f system | perl -00ne 'print if /flush/' Beginning with v5.6.0\, Perl will attempt to flush all files opened for output before any operation that may do a fork\, but this may not be supported on some platforms (see perlport). To be safe\, you may need to set $| ($AUTOFLUSH in English) or call the "autoflush()" method of "IO​::Handle" on any open handles.

Thanks! That totally explains the flush. This probably means the file handle is not freed until global destruction.

p5pRT commented 9 years ago

From @iabyn

On Wed\, Nov 18\, 2015 at 10​:39​:46AM -0800\, Todd Rinaldo wrote​:

While preparing to move to 5.22\, we noticed a bug which manifests due to our use of \shift. While \shift seems to be a special kind of evil\, Perl does support it so I'd call this a bug for the time being. In the below example\, Perl never decreases the refcount of $fh when it is passed to another subroutine and the passed variable (not a copy of it) is directly used.

This bisects to 7bdb4ff0943cf93297712faf504cdd425426e57f introduced in 5.21.4

If you use $_[0] directly in the subroutine instead of assging $rv to \shift and then accessing $$rv\, then the reference count is 1 after nested2 returns.

Its not so much an issue with \shift as an issue with anything (such as open()) that attempts to instantiate an anonymous value into a ref to a typeglob; e.g.​:

  use Devel​::Peek;

  my $fh;   my $ref_fh = \$fh;   open( $$ref_fh\, "\<"\, "/dev/null") or die;   Dump $fh;

  my $fh2;   open( $fh2\, "\<"\, "/dev/null") or die;   Dump $fh2;

which gives

  ...   SV = PVGV(0x28a88f8) at 0x286d120   REFCNT = 2   NAME = "_GEN_0"

  ...   SV = PVGV(0x28a8928) at 0x288cf28   REFCNT = 1   NAME = "$fh2"

Note that the first one has an extra ref count\, which is added by the commit you bisected. In fact\, that extra ref count is technically correct\, but does cause the freeing of the file handle to be delayed.

The real issue is how the new typeglob is created\, and whether it's added to the stash or not.

When it's known at compile time what the name the variable that will be instantiated is\, then the typeglob is created with a name that is descriptive of who "owns" it\, e.g. the 'NAME = "$fh2"' above. Sometimes these names can be a bit vague​: for example\, 'open $a[0]\, ...' gives 'NAME = "$a[...]"'\, while 'open $a[0][0]\, ...' gives 'NAME = "__ANONIO__"'.

The two important thing about all those names is that they are determined at compile time (and stored as the targ for the rv2gv op)\, and that the GV is *not* added to the stash. So it only needs a refcount of 1\, and doesn't get delayed in freeing.

On the other hand\, if a name isn't determined at compile time\, as is the case with $$ref\, then the rv2gv's targ is zero\, and it instead creates a a new anonymous typeglob\, with a name of the form _GEN_NNN\, which *is* added to the current stash (and so needs the extra ref).

I think the easiest fix for this is to\, for the latter case\, treat it like the former case\, using a name of '__ANONIO__' (and not add it to any stash).

This is an area I'm not too familiar with\, so I don't know whether there are any drawbacks to this idea.

NOTE​: The use of \shift is also currently breaking Devel​::Cover very badly even in 5.14​: https://github.com/pjcj/Devel--Cover/issues/125 I have not investigated if the 5.21.4 change actually fixes the bug in B/Devel​::Cover or if there is also another B bug related to this.

Well I just tried it on 5.44.4\, 5.20.0 and 5.22.0; I saw problems with the first two consistent with an SV being reused while still owned\, but not with 5.22.0. So it looks like that was fixed with FC's commit.

-- Technology is dominated by two types of people​: those who understand what they do not manage\, and those who manage what they do not understand.

p5pRT commented 9 years ago

From @cpansprout

On Nov 19\, 2015\, at 6​:15 AM\, Dave Mitchell \davem@&#8203;iabyn\.com wrote​:

On the other hand\, if a name isn't determined at compile time\, as is the case with $$ref\, then the rv2gv's targ is zero\, and it instead creates a a new anonymous typeglob\, with a name of the form _GEN_NNN\, which *is* added to the current stash (and so needs the extra ref).

I think the easiest fix for this is to\, for the latter case\, treat it like the former case\, using a name of '__ANONIO__' (and not add it to any stash).

It has been on my ever-growing to-do list for some time to consider whether _GEN_NNN should every be added to the stash. Every time that happens\, itā€™s a leak. Does anybody ever access them by name?

Whether this should be fixed by (a) switching everything to __ANONIO__ or by (b) using _GEN_NNN without adding it to the stash I cannot say. I donā€™t understand why we ended up with an inconsistency in the first place. (Digging in history to find out which one came first might shed some light on it.) Using _GEN_* does allow different handles to be distinguished in error messages\, but __ANONIO__ certainly seems cleaner. For backward-compatibility\, though\, maybe option b is the best choice.

This is an area I'm not too familiar with\, so I don't know whether there are any drawbacks to this idea.

NOTE​: The use of \shift is also currently breaking Devel​::Cover very badly even in 5.14​: https://github.com/pjcj/Devel--Cover/issues/125 I have not investigated if the 5.21.4 change actually fixes the bug in B/Devel​::Cover or if there is also another B bug related to this.

Well I just tried it on 5.44.4\,

Wow!

p5pRT commented 6 years ago

From @toddr

On Thu\, 19 Nov 2015 09​:20​:06 -0800\, sprout wrote​:

It has been on my ever-growing to-do list for some time to consider whether _GEN_NNN should every be added to the stash. Every time that happens\, itā€™s a leak. Does anybody ever access them by name?

Whether this should be fixed by (a) switching everything to __ANONIO__ or by (b) using _GEN_NNN without adding it to the stash I cannot say. I donā€™t understand why we ended up with an inconsistency in the first place. (Digging in history to find out which one came first might shed some light on it.) Using _GEN_* does allow different handles to be distinguished in error messages\, but __ANONIO__ certainly seems cleaner. For backward-compatibility\, though\, maybe option b is the best choice.

FYI\,

We're in the process of submitting a change to Perl​::Critic to discourage the use of \shift. https://github.com/Perl-Critic/Perl-Critic/pull/837

At this point we've now got 4 major versions of perl with this issue.

toddr commented 5 years ago

FYI Perl::Critic now warns people not to do this in code because of this leak.