Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/

File::Find find() with "follow" option fails to invoke "preprocess" #16888

Open p5pRT opened 5 years ago

p5pRT commented 5 years ago

Migrated from rt.perl.org#133931 (status was 'open')

Searchable as RT133931$

p5pRT commented 5 years ago

From mark.maimone@jpl.nasa.gov

Created by mwm@jpl.nasa.gov

I noticed a disconnect between File::Find documentation and implementation. It claims that "follow" and "preprocess" options can be used independently, but if I invoke File::Find's "find()" using options that include "follow"ing symlinks, it never invokes the corresponding "preprocess" coderef option. Looking at the Find.pm source, it's clear that only _find_dir() ever invokes "preprocess" (via local $pre_process); _find_dir_symlink() (which is called instead of _find_dir() when "follow" is set) never invokes it at all.

I was able to make things work (obeying "preprocess" hooks regardless of whether "follow" is also set) by duplicating the _find_dir() code in _find_dir_symlink(), adding the third line here:

    @filenames = readdir DIR;
    closedir(DIR);
    @filenames = $pre_process->(@filenames) if $pre_process;

I would like to suggest including this or a similar change in new releases of File::Find.pm
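The behaviour described above can be checked with a small script. This is a minimal sketch, not part of the original report: the temporary directory layout and the helper name `count_preprocess_calls` are made up for illustration, and it assumes a platform with symlink support (File::Find needs that for "follow").

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);
use File::Path qw(make_path);

# Build a small throwaway tree: $root/a/b with one file per directory.
my $root = tempdir(CLEANUP => 1);
make_path("$root/a/b");
for my $file ("$root/top.txt", "$root/a/mid.txt", "$root/a/b/leaf.txt") {
    open my $fh, '>', $file or die "Cannot create $file: $!";
    close $fh;
}

# Count how often the preprocess hook runs for a given set of extra options.
# (Hypothetical helper, only for this demonstration.)
sub count_preprocess_calls {
    my (%extra) = @_;
    my $calls = 0;
    find({
        wanted     => sub { },                       # nothing to do per entry
        preprocess => sub { $calls++; return @_ },   # pass names through unchanged
        %extra,
    }, $root);
    return $calls;
}

my $plain  = count_preprocess_calls();
my $follow = count_preprocess_calls(follow => 1);

print "preprocess calls without follow: $plain\n";
print "preprocess calls with follow:    $follow\n";
```

If the installed File::Find behaves as reported, the first count should be non-zero and the second should stay at zero.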

Perl Info

```
Flags:
    category=library
    severity=medium
    module=File::Find

Site configuration information for perl 5.16.3:

Configured by Red Hat, Inc. at Thu Mar 2 08:49:52 EST 2017.

Summary of my perl5 (revision 5 version 16 subversion 3) configuration:

  Platform:
    osname=linux, osvers=2.6.32-642.13.1.el6.x86_64, archname=x86_64-linux-thread-multi
    uname='linux x86-017.build.eng.bos.redhat.com 2.6.32-642.13.1.el6.x86_64 #1 smp wed nov 23 16:03:01 est 2016 x86_64 x86_64 x86_64 gnulinux '
    config_args='-des -Doptimize=-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -Dccdlflags=-Wl,--enable-new-dtags -Dlddlflags=-shared -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -Wl,-z,relro -DDEBUGGING=-g -Dversion=5.16.3 -Dmyhostname=localhost -Dperladmin=root@localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dprefix=/usr -Dvendorprefix=/usr -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl5 -Dsitearch=/usr/local/lib64/perl5 -Dprivlib=/usr/share/perl5 -Dvendorlib=/usr/share/perl5/vendor_perl -Darchlib=/usr/lib64/perl5 -Dvendorarch=/usr/lib64/perl5/vendor_perl -Darchname=x86_64-linux-thread-multi -Dlibpth=/usr/local/lib64 /lib64 /usr/lib64 -Duseshrplib -Dusethreads -Duseithreads -Dusedtrace=/usr/bin/dtrace -Duselargefiles -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl=n -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dd_gethostent_r_proto -Ud_endhostent_r_proto -Ud_sethostent_r_proto -Ud_endprotoent_r_proto -Ud_setprotoent_r_proto -Ud_endservent_r_proto -Ud_setservent_r_proto -Dscriptdir=/usr/bin -Dusesitecustomize'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=define, usemultiplicity=define
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='4.8.5 20150623 (Red Hat 4.8.5-11)', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags =' -fstack-protector'
    libpth=/usr/local/lib64 /lib64 /usr/lib64
    libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc -lgdbm_compat
    perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
    libc=, so=so, useshrplib=true, libperl=libperl.so
    gnulibc_version='2.17'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,--enable-new-dtags -Wl,-rpath,/usr/lib64/perl5/CORE'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -Wl,-z,relro '

Locally applied patches:
    Fedora Patch1: Removes date check, Fedora/RHEL specific
    Fedora Patch3: support for libdir64
    Fedora Patch4: use libresolv instead of libbind
    Fedora Patch5: USE_MM_LD_RUN_PATH
    Fedora Patch6: Skip hostname tests, due to builders not being network capable
    Fedora Patch7: Dont run one io test due to random builder failures
    Fedora Patch9: Fix find2perl to translate ? glob properly (RT#113054)
    Fedora Patch10: Fix broken atof (RT#109318)
    Fedora Patch13: Clear $@ before \"do\" I/O error (RT#113730)
    Fedora Patch14: Do not truncate syscall() return value to 32 bits (RT#113980)
    Fedora Patch15: Override the Pod::Simple::parse_file (CPANRT#77530)
    Fedora Patch16: Do not leak with attribute on my variable (RT#114764)
    Fedora Patch17: Allow operator after numeric keyword argument (RT#105924)
    Fedora Patch18: Extend stack in File::Glob::glob, (RT#114984)
    Fedora Patch19: Do not crash when vivifying $|
    Fedora Patch20: Fix misparsing of maketext strings (CVE-2012-6329)
    Fedora Patch21: Add NAME headings to CPAN modules (CPANRT#73396)
    Fedora Patch22: Fix leaking tied hashes (RT#107000) [1]
    Fedora Patch23: Fix leaking tied hashes (RT#107000) [2]
    Fedora Patch24: Fix leaking tied hashes (RT#107000) [3]
    Fedora Patch25: Fix dead lock in PerlIO after fork from thread (RT#106212)
    Fedora Patch26: Make regexp safe in a signal handler (RT#114878)
    Fedora Patch27: Update h2ph(1) documentation (RT#117647)
    Fedora Patch28: Update pod2html(1) documentation (RT#117623)
    Fedora Patch29: Document Math::BigInt::CalcEmu requires Math::BigInt (CPAN RT#85015)
    RHEL Patch30: Use stronger algorithm needed for FIPS in t/op/crypt.t (RT#121591)
    RHEL Patch31: Make *DBM_File desctructors thread-safe (RT#61912)
    RHEL Patch32: Use stronger algorithm needed for FIPS in t/op/taint.t (RT#123338)
    RHEL Patch33: Remove CPU-speed-sensitive test in Benchmark test
    RHEL Patch34: Make File::Glob work with threads again
    RHEL Patch35: Fix CRLF conversion in ASCII FTP upload (CPAN RT#41642)
    RHEL Patch36: Do not leak the temp utf8 copy of namepv (CPAN RT#123786)
    RHEL Patch37: Fix duplicating PerlIO::encoding when spawning threads (RT#31923)

@INC for perl 5.16.3:
    ....REDACTED....
    /usr/local/lib64/perl5
    /usr/local/share/perl5
    /usr/lib64/perl5/vendor_perl
    /usr/share/perl5/vendor_perl
    /usr/lib64/perl5
    /usr/share/perl5
    .

Environment for perl 5.16.3:
    HOME=/home/mwm
    LANG=en_US.utf8
    LANGUAGE (unset)
    LD_LIBRARY_PATH=:....REDACTED....:/tps/lib:/usr/lib:/usr/local/lib:/tps/lib:/usr/lib:/usr/X11R6/lib:/sfoc/lib:/sfoc/lib/tcs:/usr/lib:/usr/local/lib:....REDACTED....
    LOGDIR (unset)
    PATH=....REDACTED....:/tps/bin:/tps/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin:....REDACTED....
    PERL5LIB=....REDACTED....
    PERL_BADLANG (unset)
    SHELL=/bin/tcsh

--
------------------------------------------------------------------------------
Mark Maimone                      cell : +1 (818) 642 - 7334
NASA Jet Propulsion Lab, Caltech  fax: +1 (818) 393 - 2346
http://www-robotics.jpl.nasa.gov/people/Mark_Maimone
```

p5pRT commented 5 years ago

From @tonycoz

On Wed, 13 Mar 2019 15:21:38 -0700, mark.maimone@jpl.nasa.gov wrote:

> I noticed a disconnect between File::Find documentation and implementation. It claims that "follow" and "preprocess" options can be used independently, but if I invoke File::Find's "find()" using options that include "follow"ing symlinks, it never invokes the corresponding "preprocess" coderef option. Looking at the Find.pm source, it's clear that only _find_dir() ever invokes "preprocess" (via local $pre_process); _find_dir_symlink() (which is called instead of _find_dir() when "follow" is set) never invokes it at all.

The documentation for File::Find states:

    =item C<preprocess>

    ... When I<follow> or I<follow_fast> are in effect, C<preprocess> is a no-op.

This text was added in perl-5.6.0-4056-g7e47e6ffb6 (in 2001)

Could you please quote the text that indicates they are independent?

> I was able to make things work (obeying "preprocess" hooks regardless of whether "follow" is also set) by duplicating the _find_dir() code in _find_dir_symlink(), adding the third line here:
>
>     @filenames = readdir DIR;
>     closedir(DIR);
>     @filenames = $pre_process->(@filenames) if $pre_process;
>
> I would like to suggest including this or a similar change in new releases of File::Find.pm

That said, we could possibly allow preprocess to work with follow, but it's an enhancement, not a bug fix.

Tony

p5pRT commented 5 years ago

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented 5 years ago

From mark.w.maimone@jpl.nasa.gov

Hi,

Tony Cook via RT wrote:

> On Wed, 13 Mar 2019 15:21:38 -0700, mark.maimone@jpl.nasa.gov wrote:
>
> > I noticed a disconnect between File::Find documentation and implementation. It claims that "follow" and "preprocess" options can be used independently, but if I invoke File::Find's "find()" using options that include "follow"ing symlinks, it never invokes the corresponding "preprocess" coderef option. Looking at the Find.pm source, it's clear that only _find_dir() ever invokes "preprocess" (via local $pre_process); _find_dir_symlink() (which is called instead of _find_dir() when "follow" is set) never invokes it at all.
>
> The documentation for File::Find states:
>
>     =item C<preprocess>
>
>     ... When I<follow> or I<follow_fast> are in effect, C<preprocess> is a no-op.
>
> This text was added in perl-5.6.0-4056-g7e47e6ffb6 (in 2001)
>
> Could you please quote the text that indicates they are independent?

Mea culpa, I see that now. I must've just read the "follow" text closely, which doesn't mention it.

Maybe if the words "I<follow>" and "I<follow_fast>" in that paragraph had the same emphasis as "C<preprocess>" it'd be even more obvious? Your quote shows them as italic, which renders as plain text in my terminal, whereas C<> shows up in double quotes.

> > I was able to make things work (obeying "preprocess" hooks regardless of whether "follow" is also set) by duplicating the _find_dir() code in _find_dir_symlink(), adding the third line here:
> >
> >     @filenames = readdir DIR;
> >     closedir(DIR);
> >     @filenames = $pre_process->(@filenames) if $pre_process;
> >
> > I would like to suggest including this or a similar change in new releases of File::Find.pm
>
> That said, we could possibly allow preprocess to work with follow, but it's an enhancement, not a bug fix.

Understood. It does seem like a helpful and simple fix, and it'd be a nice enhancement. I have a filesystem with lots of symlinks, and I found I wasn't able to get the search to terminate in a reasonable amount of time unless I took advantage of the "preprocess" hook to prune away unneeded search branches.

  Thanks much for your reply!

Mark M.

--
Mark Maimone                      cell : +1 (818) 642 - 7334
NASA Jet Propulsion Lab, Caltech  fax: +1 (818) 393 - 2346
http://www-robotics.jpl.nasa.gov/people/Mark_Maimone
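
For readers unfamiliar with the kind of pruning mentioned in the last message, here is a minimal sketch of a "preprocess" hook that filters out directories by name before File::Find descends into them. It is an illustration only, not code from the thread: the directory names "keep" and "skip", the `%skip_name` table, and the temporary tree are all assumptions. It runs without "follow", since the documentation quoted above says "preprocess" is a no-op when "follow" is in effect.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);
use File::Path qw(make_path);

# Hypothetical layout: "keep" should be searched, "skip" should be pruned.
my $root = tempdir(CLEANUP => 1);
make_path("$root/keep", "$root/skip/deep");
for my $file ("$root/keep/a.txt", "$root/skip/b.txt", "$root/skip/deep/c.txt") {
    open my $fh, '>', $file or die "Cannot create $file: $!";
    close $fh;
}

my %skip_name = (skip => 1);    # entry names to drop (illustrative)
my @seen;

find({
    # Called once per directory with the readdir() result; whatever it
    # returns is what File::Find goes on to process.
    preprocess => sub { grep { !$skip_name{$_} } @_ },
    # Record plain files only; $_ holds the basename here.
    wanted     => sub { push @seen, $_ if -f },
}, $root);

print join(",", sort @seen), "\n";
```

Because the filter runs on names alone, it also drops any plain file called "skip". File::Find separately documents `$File::Find::prune`, set from inside "wanted", which may be worth trying when "follow" is required.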