Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/

fork during parsing exhausts parsing file #9811

Open p5pRT opened 15 years ago

p5pRT commented 15 years ago

Migrated from rt.perl.org#68118 (status was 'open')

Searchable as RT68118$

p5pRT commented 15 years ago

From perlbug@plan9.de

Created by perlbug@plan9.de

This program prints "here" only once, when one would naively expect it to print it twice:

    BEGIN { fork }
    warn "here\n";

I guess this is because the parser exhausts the file, so the next run will hit EOF immediately, as this is in the middle of the parse.
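The shared-offset effect guessed at here can be reproduced outside the parser. The following is a minimal sketch, assuming a POSIX system: a forked child inherits the parent's open file descriptor, and both processes share one file offset, so whichever process reads to the end leaves the other at EOF. The scratch file and variable names are illustrative only, and the sketch makes no claim about what the parser does internally.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use POSIX ();

# Scratch file standing in for a script being read (illustrative only).
my ($out, $path) = tempfile(UNLINK => 1);
print {$out} "line 1\nline 2\nline 3\n";
close $out or die "close: $!";

open my $in, '<', $path or die "open: $!";

my $pid = fork;
die "fork: $!" unless defined $pid;

if ($pid == 0) {
    # Child: drain the file with unbuffered reads, then leave without
    # running END blocks or destructors.
    my $chunk;
    1 while sysread $in, $chunk, 4096;
    POSIX::_exit(0);
}

waitpid $pid, 0;

# Parent: the descriptor's offset is shared with the child, so this
# read starts where the child stopped, which is end of file.
my $got = sysread $in, my $rest, 4096;
print "parent read $got bytes after the child drained the file\n";
```

The parent never read anything itself, yet it finds the file already consumed.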

I think this should either be fixed, or the parse file handle be made accessible (perl programs can't really be responsible for implementation details they have no access to - if all the open parser fh's would be accessible it could be made the responsibility of the perl program to get it right).

Perl Info

```
Flags:
    category=core
    severity=low

Site configuration information for perl 5.10.0:

Configured by Marc Lehmann at Sat Feb 21 02:30:27 CET 2009.

Summary of my perl5 (revision 5 version 10 subversion 0) configuration:
  Platform:
    osname=linux, osvers=2.6.24-etchnhalf.1-amd64, archname=amd64-linux
    uname='linux cerebro 2.6.24-etchnhalf.1-amd64 #1 smp mon jul 21 10:36:02 utc 2008 x86_64 gnulinux '
    config_args='-Duselargefiles -Dxxxxuse64bitint -Uuse64bitall -Dusemymalloc=n -Dcc=gcc -Dccflags=-ggdb -gdwarf-2 -g3 -Dcppflags=-DPERL_ARENA_SIZE=16368 -D_GNU_SOURCE -I/opt/include -Doptimize=-O6 -msse2 -funroll-loops -fno-strict-aliasing -Dcccdlflags=-fPIC -Dldflags=-L/opt/perl/lib -L/opt/lib -Dlibs=-ldl -lm -lcrypt -Darchname=amd64-linux -Dprefix=/opt/perl -Dprivlib=/opt/perl/lib/perl5 -Darchlib=/opt/perl/lib/perl5 -Dvendorprefix=/opt/perl -Dvendorlib=/opt/perl/lib/perl5 -Dvendorarch=/opt/perl/lib/perl5 -Dsiteprefix=/opt/perl -Dsitelib=/opt/perl/lib/perl5 -Dsitearch=/opt/perl/lib/perl5 -Dsitebin=/opt/perl/bin -Dman1dir=/opt/perl/man/man1 -Dman3dir=/opt/perl/man/man3 -Dsiteman1dir=/opt/perl/man/man1 -Dsiteman3dir=/opt/perl/man/man3 -Dman1ext=1 -Dman3ext=3 -Dpager=/usr/bin/less -Uafs -Uusesfio -Uusenm -Uuseshrplib -Dd_dosuid -Dusethreads=undef -Duse5005threads=undef -Duseithreads=undef -Dusemultiplicity=undef -Demail=perl-binary@plan9.de -Dcf_email=perl-binary@plan9.de -Dcf_by=Marc Lehmann -Dlocincpth=/opt/perl/include /opt/include -Dmyhostname=localhost -Dmultiarch=undef -Dbin=/opt/perl/bin -Dxxxusedevel -DxxxDEBUGGING -Dxxxuse_debugging_perl -Dxxxuse_debugmalloc -des'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=define, use64bitall=undef, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='gcc', ccflags ='-ggdb -gdwarf-2 -g3 -fno-strict-aliasing -pipe -I/opt/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O6 -msse2 -funroll-loops -fno-strict-aliasing',
    cppflags='-DPERL_ARENA_SIZE=16368 -D_GNU_SOURCE -I/opt/include -ggdb -gdwarf-2 -g3 -fno-strict-aliasing -pipe -I/opt/include'
    ccversion='', gccversion='4.3.2', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags ='-L/opt/perl/lib -L/opt/lib -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib /lib64 /usr/lib64
    libs=-ldl -lm -lcrypt
    perllibs=-ldl -lm -lcrypt
    libc=/lib/libc-2.7.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.7'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -O6 -msse2 -funroll-loops -fno-strict-aliasing -L/opt/perl/lib -L/opt/lib -L/usr/local/lib'

Locally applied patches:
    http://public.activestate.com/cgi-bin/perlbrowse/p/34209
    http://public.activestate.com/cgi-bin/perlbrowse/p/34507
    http://www.gossamer-threads.com/lists/perl/porters/232549
    embed.fnc:Perl_vcroak NULLOK

@INC for perl 5.10.0:
    /root/src/sex
    /opt/perl/lib/perl5
    /opt/perl/lib/perl5
    /opt/perl/lib/perl5
    /opt/perl/lib/perl5
    /opt/perl/lib/perl5
    .

Environment for perl 5.10.0:
    HOME=/root
    LANG (unset)
    LANGUAGE (unset)
    LC_CTYPE=en_US.UTF-8
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/root/s2:/root/s:/opt/bin:/opt/sbin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/X11/bin:/usr/games:/usr/local/bin:/usr/local/sbin:/root/pserv:.
    PERL5LIB=/root/src/sex
    PERL5_CPANPLUS_CONFIG=/root/.cpanplus/config
    PERLDB_OPTS=ornaments=0
    PERL_ANYEVENT_DBI_TESTS=1
    PERL_ANYEVENT_EDNS0=1
    PERL_ANYEVENT_NET_TESTS=1
    PERL_ANYEVENT_PROTOCOLS=ipv4,ipv6
    PERL_ANYEVENT_STRICT=1
    PERL_BADLANG (unset)
    PERL_UNICODE=0
    SHELL=/bin/bash
```

p5pRT commented 15 years ago

From @rgs

2009/8/2 perlbug@plan9.de (via RT) <perlbug-followup@perl.org>:

> This program prints "here" only once, when one would naively expect it to print it twice:
>
>     BEGIN { fork }
>     warn "here\n";
>
> I guess this is because the parser exhausts the file, so the next run will hit EOF immediately, as this is in the middle of the parse.
>
> I think this should either be fixed, or the parse file handle be made accessible (perl programs can't really be responsible for implementation details they have no access to - if all the open parser fh's would be accessible it could be made the responsibility of the perl program to get it right).

Note that a simple workaround to this behaviour is to use __DATA__ and its filehandle, and rewind it, as in:

    seek DATA,0,0;
    print '-'x20,"\n";
    print for <DATA>;
    print '-'x20,"\n";
    __DATA__

p5pRT commented 15 years ago

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented 15 years ago

From schmorp@schmorp.de

On Thu, Aug 06, 2009 at 04:57:51PM +0200, Rafael Garcia-Suarez <rgarciasuarez@gmail.com> wrote:

> > I think this should either be fixed, or the parse file handle be made accessible (perl programs can't really be responsible for implementation details they have no access to - if all the open parser fh's would be accessible it could be made the responsibility of the perl program to get it right).
>
> Note that a simple workaround to this behaviour is to use __DATA__ and its filehandle, and rewind it, as in:

Just stumbled over your reply by accident (you didn't send it to me, of course).

Please note that your example does not work, because DATA is not available in a BEGIN block, nor does it work when the code is used in a module.

The workaround I use in AnyEvent::Watchdog is this, which is of course rather painful, but works fine:

Before fork:

    our %SEEKPOS;
    # due to bugs in perl, try to remember file offsets for all fds, and restore them later
    # (the parser otherwise exhausts the input files)

    # this causes perlio to flush its handles internally, so
    # seek offsets become correct.
    exec "."; # toi toi toi

    # now records all fd positions
    for (0 .. 1023) {
       open my $fh, "<&$_" or next;
       $SEEKPOS{$_} = (sysseek $fh, 0, 1 or next);
    }

After each fork:

    # restore seek offsets
    while (my ($k, $v) = each %SEEKPOS) {
       open my $fh, "<&$k" or next;
       sysseek $fh, $v, 0;
    }

The code is so ugly because there is no way to access the file handles in any other way (the parser doesn't expose them), and because I need an exec to reify the file offsets so I can query them (a dummy fork would work, too, and is probably cleaner).
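The save-and-restore idea is easier to see on a handle the program does own. The following is a minimal sketch, assuming a POSIX system and an ordinary file rather than the parser's input; the scratch file and variable names are illustrative, and this is not the AnyEvent::Watchdog code. It records the descriptor's kernel-level offset with `sysseek` before forking and puts it back after the child has moved it.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET SEEK_CUR);
use File::Temp qw(tempfile);
use POSIX ();

# Scratch file to read from (illustrative only).
my ($out, $path) = tempfile(UNLINK => 1);
print {$out} "line 1\nline 2\n";
close $out or die "close: $!";

open my $in, '<', $path or die "open: $!";

# Remember the descriptor's kernel-level offset before forking.
# sysseek returns "0 but true" at offset 0, so test for definedness.
my $saved = sysseek($in, 0, SEEK_CUR);
defined $saved or die "sysseek: $!";

my $pid = fork;
die "fork: $!" unless defined $pid;

if ($pid == 0) {
    # Child: reading here moves the offset shared with the parent.
    my $chunk;
    1 while sysread $in, $chunk, 4096;
    POSIX::_exit(0);
}

waitpid $pid, 0;

# Parent: put the shared offset back where it was, then read normally.
defined sysseek($in, $saved, SEEK_SET) or die "sysseek: $!";
my $n = sysread $in, my $data, 4096;
print "parent read $n bytes after restoring the offset\n";
```

Unbuffered `sysread` and `sysseek` are used throughout so that PerlIO buffering does not blur which offset is being observed.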

Since it seems to work, I am fine with that as long as I do not have to look at the code. It would be nice if the perl parser would support forking, however.

    --
                    The choice of a       Deliantra, the free code+content MORPG
          -----==-     _GNU_              http://www.deliantra.net
          ----==-- _       generation
          ---==---(_)__  __ ____  __      Marc Lehmann
          --==---/ / _ \/ // /\ \/ /      pcg@goof.com
          -=====/_/_//_/\_,_/ /_/\_\

p5pRT commented 15 years ago

From @rgs

2009/8/26 Marc Lehmann <schmorp@schmorp.de>:

> On Thu, Aug 06, 2009 at 04:57:51PM +0200, Rafael Garcia-Suarez <rgarciasuarez@gmail.com> wrote:
>
> > > I think this should either be fixed, or the parse file handle be made accessible (perl programs can't really be responsible for implementation details they have no access to - if all the open parser fh's would be accessible it could be made the responsibility of the perl program to get it right).
> >
> > Note that a simple workaround to this behaviour is to use __DATA__ and its filehandle, and rewind it, as in:
>
> Just stumbled over your reply by accident (you didn't send it to me, of course).

No. Your mail address wasn't in the From or in the Reply-To headers. RT should have forwarded my email to you, but apparently that did not happen. Am I understanding correctly what RT should do here?

> Please note that your example does not work, because DATA is not available in a BEGIN block, nor does it work when the code is used in a module.

Yes. That was a specific workaround for simple cases.

> The workaround I use in AnyEvent::Watchdog is this, which is of course rather painful, but works fine:
>
> Before fork:
>
>     our %SEEKPOS;
>     # due to bugs in perl, try to remember file offsets for all fds, and restore them later
>     # (the parser otherwise exhausts the input files)
>
>     # this causes perlio to flush its handles internally, so
>     # seek offsets become correct.
>     exec "."; # toi toi toi
>
>     # now records all fd positions
>     for (0 .. 1023) {
>        open my $fh, "<&$_" or next;
>        $SEEKPOS{$_} = (sysseek $fh, 0, 1 or next);
>     }
>
> After each fork:
>
>     # restore seek offsets
>     while (my ($k, $v) = each %SEEKPOS) {
>        open my $fh, "<&$k" or next;
>        sysseek $fh, $v, 0;
>     }
>
> The code is so ugly because there is no way to access the file handles in any other way (the parser doesn't expose them), and because I need an exec to reify the file offsets so I can query them (a dummy fork would work, too, and is probably cleaner).
>
> Since it seems to work, I am fine with that as long as I do not have to look at the code. It would be nice if the perl parser would support forking, however.

I agree. I think that IlyaZ encountered the same problem some years ago and even proposed the start of the solution.

p5pRT commented 15 years ago

From schmorp@schmorp.de

On Wed, Aug 26, 2009 at 11:06:55AM +0200, Rafael Garcia-Suarez <rgarciasuarez@gmail.com> wrote:

> > Just stumbled over your reply by accident (you didn't send it to me, of course).
>
> No. Your mail address wasn't in the From or in the Reply-To headers.

Oh right, rt.cpan.org has this annoying habit of removing e-mail addresses.

> RT should have forwarded my email to you, but apparently that did not happen. Am I understanding correctly what RT should do here?

I don't know - normally I receive replies to perlbug-reports. Not a big deal in any case.

> > Since it seems to work, I am fine with that as long as I do not have to look at the code. It would be nice if the perl parser would support forking, however.
>
> I agree. I think that IlyaZ encountered the same problem some years ago and even proposed the start of the solution.

Well, I can live with having to do some extra work - the problem isn't exactly high-priority. It's just that my workaround is beyond evil (especially the dummy exec, relying on even more internals).

But as it seems to work, I can live with it for the time being.

(Maybe it should just be documented - that fork doesn't work in BEGIN under Windows is actually mentioned somewhere).
