Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/
Other
1.88k stars 531 forks source link

[PATCH _63] Slurping empty files #1002

Closed p5pRT closed 20 years ago

p5pRT commented 24 years ago

Migrated from rt.perl.org#1952 (status was 'resolved')

Searchable as RT1952$

p5pRT commented 24 years ago

From hansmu@xs4all.nl

Slurping an empty file should produce a defined empty string.

In current perls (_03 and _63)\, this only happens if $. == 0\, for example​:

  undef $/;
  open F\, "/etc/passwd";   \; # set $. = 1   open F\, "/dev/null";   print defined(\) ? "ok\n" : "not ok\n";

prints​:

  not ok

A comment in pp_hot.c suggests that the IOf_NOLINE flag should flip-flop when it is used. It doesn't do that. For one thing\, $. is incremented when a defined empty string is returned and this implies the IOf_NOLINE flag isn't looked at again until the file handle is closed.

The patch below clears the IOf_NOLINE flag when a handle is opened and sets it on each successful read. It removes the flip-flopping and the reference to $. The result is that you can slurp a defined empty string off an empty file once\, irresepective of $. .

-- HansM

--- doio.c.orig Wed Dec 8 07​:23​:10 1999 *** doio.c Wed Dec 29 02​:44​:07 1999 @​@​ -452\,6 +452\,7 @​@​   } #endif   IoIFP(io) = fp; + IoFLAGS(io) &= ~IOf_NOLINE;   if (writing) {   dTHR;   if (IoTYPE(io) == 's' --- pp_hot.c.orig Wed Dec 8 07​:23​:14 1999 *** pp_hot.c Wed Dec 29 03​:07​:35 1999 @​@​ -1284\,12 +1284\,10 @​@​   offset = 0;   }

-/* flip-flop EOF state for a snarfed empty file */ +/* delay EOF state for a snarfed empty file */ #define SNARF_EOF(gimme\,rs\,io\,sv) \ - ((gimme != G_SCALAR || SvCUR(sv) \ - || (IoFLAGS(io) & IOf_NOLINE) || IoLINES(io) || !RsSNARF(rs)) \ - ? ((IoFLAGS(io) &= ~IOf_NOLINE)\, TRUE) \ - : ((IoFLAGS(io) |= IOf_NOLINE)\, FALSE)) + (gimme != G_SCALAR || SvCUR(sv) \ + || (IoFLAGS(io) & IOf_NOLINE) || !RsSNARF(rs))

  for (;;) {   if (!sv_gets(sv\, fp\, offset) @​@​ -1322\,6 +1320\,7 @​@​   SvTAINTED_on(sv);   }   IoLINES(io)++; + IoFLAGS(io) |= IOf_NOLINE;   SvSETMAGIC(sv);   XPUSHs(sv);   if (type == OP_GLOB) {


Site configuration information for perl 5.00563​:

Configured by hansm at Mon Dec 13 02​:03​:09 MET 1999.

Summary of my perl5 (revision 5.0 version 5 subversion 63) configuration​:   Platform​:   osname=next\, osvers=4_2\, archname=OPENSTEP-Mach   uname='bombadil '   config_args='-des -Dcf_email=hansmu@​xs4all.nl -Dprefix=/usr/local -Doptimize=-g -O'   hint=recommended\, useposix=undef\, d_sigaction=undef   usethreads=undef useperlio=undef d_sfio=undef   use64bits=undef usemultiplicity=undef   Compiler​:   cc='cc'\, optimize='-g -O'\, gccversion=NeXT DevKit-based CPP 4.0   cppflags='-dynamic -fno-common -DUSE_NEXT_CTYPE -DUSE_PERL_SBRK -arch m68k -DDEBUGGING -I/usr/local/include'   ccflags ='-dynamic -fno-common -DUSE_NEXT_CTYPE -DUSE_PERL_SBRK -arch m68k -arch i386 -DDEBUGGING -I/usr/local/include'   stdchar='char'\, d_stdstdio=define\, usevfork=false   intsize=4\, longsize=4\, ptrsize=4\, doublesize=8   d_longlong=define\, longlongsize=8\, d_longdbl=define\, longdblsize=12   alignbytes=8\, usemymalloc=y\, prototype=define   Linker and Libraries​:   ld='cc'\, ldflags =' -dynamic -prebind -arch m68k -arch i386 -L/usr/local/lib'   libpth=/lib /usr/lib /usr/local/lib   libs=   libc=/NextLibrary/Frameworks/System.framework/System\, so=dylib\, useshrplib=true\, libperl=libperl.5.dylib   Dynamic Linking​:   dlsrc=dl_next.xs\, dlext=bundle\, d_dlsymun=undef\, ccdlflags=' '   cccdlflags=' '\, lddlflags=' -dynamic -bundle -undefined suppress -arch m68k -arch i386 -L/usr/local/lib'

Locally applied patches​:  


@​INC for perl 5.00563​:   lib   /Users/hansm/lib/perl   /usr/local/lib/perl5/5.00563/OPENSTEP-Mach   /usr/local/lib/perl5/5.00563   /usr/local/lib/site_perl/5.00563/OPENSTEP-Mach   /usr/local/lib/site_perl   .


Environment for perl 5.00563​:   DYLD_LIBRARY_PATH=/Users/hansm/src/perl/build/perl-5.006/perl5.005_63   HOME=/Users/hansm   LANG (unset)   LANGUAGE (unset)   LD_LIBRARY_PATH (unset)   LOGDIR (unset)   PATH=/Users/hansm/bin​:/usr/local/bin​:/usr/games​:/usr/ucb​:/bin​:/usr/bin​:/usr/etc​:/Users/hansm/bin/cookies​:/LocalApps/Opener.app​:.   PERL5LIB=/Users/hansm/lib/perl   PERL_BADLANG (unset)   SHELL=/usr/bin/zsh

p5pRT commented 24 years ago

From @gsar

On Wed\, 29 Dec 1999 03​:38​:04 +0100\, Hans Mulder wrote​:

Slurping an empty file should produce a defined empty string.

In current perls (_03 and _63)\, this only happens if $. == 0\, for example​:

undef $/;
open F\, "/etc/passwd"; \; # set $. = 1 open F\, "/dev/null"; print defined(\) ? "ok\n" : "not ok\n";

prints​:

not ok

A comment in pp_hot.c suggests that the IOf_NOLINE flag should flip-flop when it is used. It doesn't do that. For one thing\, $. is incremented when a defined empty string is returned and this implies the IOf_NOLINE flag isn't looked at again until the file handle is closed.

The comment is certainly misleading. As I recall\, the IoLINES() check was added mostly as an afterthought to not make it a flip-flop as seen from Perl (as described in perldelta)\, but it appears I overlooked the fact that IoLINES() is not reset by an implicit close. (This means the example in perldelta won't work for more than one file on the command line either.)

The patch below clears the IOf_NOLINE flag when a handle is opened and sets it on each successful read. It removes the flip-flopping and the reference to $. The result is that you can slurp a defined empty string off an empty file once\, irresepective of $. .

The reason the flag was set/cleared in SNARF_EOF() was to avoid adding yet more code to the hot read loop. I'd like to preserve that if possible.

I think your patch effectively does​:

- || (IoFLAGS(io) & IOf_NOLINE) || IoLINES(io) || !RsSNARF(rs)) \ + || (IoFLAGS(io) & IOf_NOLINE) || !RsSNARF(rs)) \

apart from making it an once-only thing rather than a flip-flop. Any reason why we can't do just the above?

Thanks.

Sarathy gsar@​ActiveState.com

p5pRT commented 24 years ago

From @gsar

On Wed\, 29 Dec 1999 11​:28​:50 PST\, I wrote​:

On Wed\, 29 Dec 1999 03​:38​:04 +0100\, Hans Mulder wrote​:

A comment in pp_hot.c suggests that the IOf_NOLINE flag should flip-flop when it is used. It doesn't do that. For one thing\, $. is incremented when a defined empty string is returned and this implies the IOf_NOLINE flag isn't looked at again until the file handle is closed.

The comment is certainly misleading. As I recall\, the IoLINES() check was added mostly as an afterthought to not make it a flip-flop as seen from Perl (as described in perldelta)\, but it appears I overlooked the fact that IoLINES() is not reset by an implicit close. (This means the example in perldelta won't work for more than one file on the command line either.)

The patch below clears the IOf_NOLINE flag when a handle is opened and sets it on each successful read. It removes the flip-flopping and the reference to $. The result is that you can slurp a defined empty string off an empty file once\, irresepective of $. .

The reason the flag was set/cleared in SNARF_EOF() was to avoid adding yet more code to the hot read loop. I'd like to preserve that if possible.

I think your patch effectively does​:

- || (IoFLAGS(io) & IOf_NOLINE) || IoLINES(io) || !RsSNARF(rs)) \ + || (IoFLAGS(io) & IOf_NOLINE) || !RsSNARF(rs)) \

apart from making it an once-only thing rather than a flip-flop. Any reason why we can't do just the above?

This here is what I'm talking about.

Sarathy gsar@​ActiveState.com

Inline Patch ```diff -----------------------------------8<----------------------------------- Change 4736 by gsar@auger on 1999/12/29 21:04:59 slurp mode fix in change#2910 wasn't quite right (spotted by Hans Mulder) Affected files ... ... //depot/perl/doio.c#89 edit ... //depot/perl/pp_hot.c#151 edit ... //depot/perl/t/io/argv.t#10 edit Differences ... ==== //depot/perl/doio.c#89 (text) ==== Index: perl/doio.c --- perl/doio.c.~1~ Wed Dec 29 13:05:03 1999 +++ perl/doio.c Wed Dec 29 13:05:03 1999 @@ -452,6 +452,7 @@ } #endif IoIFP(io) = fp; + IoFLAGS(io) &= ~IOf_NOLINE; if (writing) { dTHR; if (IoTYPE(io) == 's' ==== //depot/perl/pp_hot.c#151 (text) ==== Index: perl/pp_hot.c --- perl/pp_hot.c.~1~ Wed Dec 29 13:05:03 1999 +++ perl/pp_hot.c Wed Dec 29 13:05:03 1999 @@ -1283,12 +1283,11 @@ offset = 0; } -/* flip-flop EOF state for a snarfed empty file */ +/* delay EOF state for a snarfed empty file */ #define SNARF_EOF(gimme,rs,io,sv) \ - ((gimme != G_SCALAR || SvCUR(sv) \ - || (IoFLAGS(io) & IOf_NOLINE) || IoLINES(io) || !RsSNARF(rs)) \ - ? ((IoFLAGS(io) &= ~IOf_NOLINE), TRUE) \ - : ((IoFLAGS(io) |= IOf_NOLINE), FALSE)) + (gimme != G_SCALAR || SvCUR(sv) \ + || !RsSNARF(rs) || (IoFLAGS(io) & IOf_NOLINE) \ + || ((IoFLAGS(io) |= IOf_NOLINE), FALSE)) for (;;) { if (!sv_gets(sv, fp, offset) ==== //depot/perl/t/io/argv.t#10 (xtext) ==== Index: perl/t/io/argv.t --- perl/t/io/argv.t.~1~ Wed Dec 29 13:05:03 1999 +++ perl/t/io/argv.t Wed Dec 29 13:05:03 1999 @@ -5,29 +5,29 @@ unshift @INC, '../lib'; } -print "1..14\n"; +print "1..20\n"; use File::Spec; my $devnull = File::Spec->devnull; -open(try, '>Io.argv.tmp') || (die "Can't open temp file: $!"); +open(try, '>Io_argv1.tmp') || (die "Can't open temp file: $!"); print try "a line\n"; close try; if ($^O eq 'MSWin32') { - $x = `.\\perl -e "while (<>) {print \$.,\$_;}" Io.argv.tmp Io.argv.tmp`; + $x = `.\\perl -e "while (<>) {print \$.,\$_;}" Io_argv1.tmp Io_argv1.tmp`; } else { - $x = `./perl -e 'while (<>) {print \$.,\$_;}' Io.argv.tmp Io.argv.tmp`; + $x = `./perl -e 'while (<>) {print \$.,\$_;}' Io_argv1.tmp Io_argv1.tmp`; } if ($x eq "1a line\n2a line\n") {print "ok 1\n";} else {print "not ok 1\n";} if ($^O eq 'MSWin32') { - $x = `.\\perl -le "print 'foo'" | .\\perl -e "while (<>) {print \$_;}" Io.argv.tmp -`; + $x = `.\\perl -le "print 'foo'" | .\\perl -e "while (<>) {print \$_;}" Io_argv1.tmp -`; } else { - $x = `echo foo|./perl -e 'while (<>) {print $_;}' Io.argv.tmp -`; + $x = `echo foo|./perl -e 'while (<>) {print $_;}' Io_argv1.tmp -`; } if ($x eq "a line\nfoo\n") {print "ok 2\n";} else {print "not ok 2\n";} @@ -39,7 +39,7 @@ } if ($x eq "foo\n") {print "ok 3\n";} else {print "not ok 3 :$x:\n";} -@ARGV = ('Io.argv.tmp', 'Io.argv.tmp', $devnull, 'Io.argv.tmp'); +@ARGV = ('Io_argv1.tmp', 'Io_argv1.tmp', $devnull, 'Io_argv1.tmp'); while (<>) { $y .= $. . $_; if (eof()) { @@ -52,49 +52,74 @@ else {print "not ok 5\n";} -open(try, '>Io.argv.tmp') or die "Can't open temp file: $!"; +open(try, '>Io_argv1.tmp') or die "Can't open temp file: $!"; +close try; +open(try, '>Io_argv2.tmp') or die "Can't open temp file: $!"; close try; -@ARGV = 'Io.argv.tmp'; +@ARGV = ('Io_argv1.tmp', 'Io_argv2.tmp'); $^I = '.bak'; $/ = undef; +my $i = 6; while (<>) { - s/^/ok 6\n/; + s/^/ok $i\n/; + ++$i; print; } -open(try, '; +open(try, '; close try; undef $^I; eof try or print 'not '; -print "ok 7\n"; +print "ok 8\n"; eof NEVEROPENED or print 'not '; -print "ok 8\n"; +print "ok 9\n"; -open STDIN, 'Io.argv.tmp' or die $!; +open STDIN, 'Io_argv1.tmp' or die $!; @ARGV = (); !eof() or print 'not '; -print "ok 9\n"; +print "ok 10\n"; <> eq "ok 6\n" or print 'not '; -print "ok 10\n"; +print "ok 11\n"; open STDIN, $devnull or die $!; @ARGV = (); eof() or print 'not '; -print "ok 11\n"; +print "ok 12\n"; -@ARGV = ('Io.argv.tmp'); +@ARGV = ('Io_argv1.tmp'); !eof() or print 'not '; -print "ok 12\n"; +print "ok 13\n"; @ARGV = ($devnull, $devnull); !eof() or print 'not '; -print "ok 13\n"; +print "ok 14\n"; close ARGV or die $!; eof() or print 'not '; -print "ok 14\n"; +print "ok 15\n"; + +{ + local $/; + open F, 'Io_argv1.tmp' or die; + ; # set $. = 1 + open F, $devnull or die; + print "not " unless defined(); + print "ok 16\n"; + print "not " if defined(); + print "ok 17\n"; + print "not " if defined(); + print "ok 18\n"; + open F, $devnull or die; # restart cycle again + print "not " unless defined(); + print "ok 19\n"; + print "not " if defined(); + print "ok 20\n"; + close F; +} -END { unlink 'Io.argv.tmp', 'Io.argv.tmp.bak' } +END { unlink 'Io_argv1.tmp', 'Io_argv1.tmp.bak', 'Io_argv2.tmp', 'Io_argv2.tmp.bak' } End of Patch. ```
p5pRT commented 24 years ago

From [Unknown Contact. See original ticket]

Gurusamy Sarathy wrote​:

I think your patch effectively does​:

- || (IoFLAGS(io) & IOf_NOLINE) || IoLINES(io) || !RsSNARF(rs)) \ + || (IoFLAGS(io) & IOf_NOLINE) || !RsSNARF(rs)) \

Any reason why we can't do just the above?

That would change behaviour​: in _03\, if you read a number of lines with a defined $/ and then slurp the rest of the file\, you get undef if the rest happens to be 0 bytes long. With your change\, you'd get a defined empty string.

I think that this would be an improvement as it would be more consistent with the behaviour for empty files. But we may have to explain the issue to the Backward Compatibility Police.

And I think we also want to do

+ IoFLAGS(io) &= ~IOf_NOLINE;

somewhere near the end of Perl_do_open9().

-- HansM