Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/

F:S->catdir goes wrong with double slash #12021

Open p5pRT opened 12 years ago

p5pRT commented 12 years ago

Migrated from rt.perl.org#112054 (status was 'open')

Searchable as RT112054$

p5pRT commented 12 years ago

From zefram@fysh.org

Created by zefram@fysh.org

The logic in File::Spec::Unix->catdir fails to take account of the double-slashes-special case in ->canonpath. As a result it produces incorrect answers in cases like this:

$ perl -MFile::Spec -lwe '$^O="qnx"; print File::Spec->catdir("/", "foo", "bar")'
//foo/bar

The correct result would be "/foo/bar", the same as on a non-double-slash platform. You could get the right answer by using "///" instead of "/" as the separator in the join() expression.
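To see where the extra slash comes from, here is a minimal pure-Perl model of the logic described above. It is only a sketch under stated assumptions, not the File::Spec source: `model_canonpath` and `model_catdir` are hypothetical names, the `$special` flag stands in for the `$^O` check, and only slash handling is modelled.

```perl
use strict;
use warnings;

# Hypothetical, simplified stand-in for canonpath: slash handling only.
# $special plays the role of the qnx/nto double-slash rule.
sub model_canonpath {
    my ($path, $special) = @_;
    my $node = '';
    # Keep a leading "//name" node intact when double slashes are special.
    if ($special
        && ($path =~ s{^(//[^/]+)/?\z}{}s || $path =~ s{^(//[^/]+)/}{/}s)) {
        $node = $1;
    }
    $path =~ s{//+}{/}g;                      # collapse runs of slashes
    $path =~ s{/\z}{} unless $path eq '/';    # drop a trailing slash
    return "$node$path";
}

# Hypothetical stand-in for catdir: join with a separator, then canonicalise.
sub model_catdir {
    my ($special, $sep, @dirs) = @_;
    return model_canonpath(join($sep, @dirs, ''), $special);
}

print model_catdir(1, '/',   '/', 'foo', 'bar'), "\n";   # //foo/bar
print model_catdir(0, '/',   '/', 'foo', 'bar'), "\n";   # /foo/bar
print model_catdir(1, '///', '/', 'foo', 'bar'), "\n";   # /foo/bar
```

In this model the "/" argument plus the "/" separator form exactly two leading slashes, which the special case then reads as a node name. A "///" separator yields four leading slashes, which fall outside the special case and collapse to one.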

Perl Info

```
Flags:
    category=library
    severity=low
    module=File::Spec

Site configuration information for perl 5.14.2:

Configured by root at Fri Sep 30 09:50:04 UTC 2011.

Summary of my perl5 (revision 5 version 14 subversion 2) configuration:

  Platform:
    osname=linux, osvers=2.6.32-5-686, archname=i686-linux-64int-ld
    uname='linux ukmcwzefram.photobox.priv 2.6.32-5-686 #1 smp fri sep 9 20:51:05 utc 2011 i686 gnulinux '
    config_args='-des -Duseshrplib -Duse64bitint -Duselongdouble -Uusethreads -Uusemultiplicity -Dprefix=/opt/perl-5.14.2 -Dsiteprefix=/opt/perl-5.14.2 -Dvendorprefix=/opt/perl-5.14.2/vendor -Dcccdlflags=-fPIC -O2 -pipe'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=define, use64bitall=undef, uselongdouble=define
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2',
    cppflags='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='4.4.5', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long long', ivsize=8, nvtype='long double', nvsize=12, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib /lib/../lib /usr/lib/../lib /lib /usr/lib /lib64 /usr/lib64
    libs=-lnsl -ldb -ldl -lm -lcrypt -lutil -lc
    perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc
    libc=/lib/libc-2.11.2.so, so=so, useshrplib=true, libperl=libperl.so
    gnulibc_version='2.11.2'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/opt/perl-5.14.2/lib/5.14.2/i686-linux-64int-ld/CORE'
    cccdlflags='-fPIC -O2 -pipe', lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector'

Locally applied patches:

@INC for perl 5.14.2:
    /opt/perl-5.14.2/lib/site_perl/5.14.2/i686-linux-64int-ld
    /opt/perl-5.14.2/lib/site_perl/5.14.2
    /opt/perl-5.14.2/vendor/lib/vendor_perl/5.14.2/i686-linux-64int-ld
    /opt/perl-5.14.2/vendor/lib/vendor_perl/5.14.2
    /opt/perl-5.14.2/lib/5.14.2/i686-linux-64int-ld
    /opt/perl-5.14.2/lib/5.14.2
    .

Environment for perl 5.14.2:
    HOME=/home/zefram
    LANG (unset)
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/home/zefram/pub/i686-pc-linux-gnu/bin:/home/zefram/pub/common/bin:/usr/bin:/bin:/usr/local/bin:/usr/games:/opt/perl/bin
    PERL_BADLANG (unset)
    SHELL=/usr/bin/zsh
```

p5pRT commented 12 years ago

From @ikegami

On Tue, Mar 27, 2012 at 10:25 AM, Zefram <perlbug-followup@perl.org> wrote:

> $ perl -MFile::Spec -lwe '$^O="qnx"; print File::Spec->catdir("/", "foo", "bar")'
> //foo/bar
>
> The correct result would be "/foo/bar"

A note of caution: double leading slashes are meaningful on Windows, so don't "fix" it for Windows too.

p5pRT commented 12 years ago

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented 12 years ago

From @demerphq

On 27 March 2012 16:25, Zefram <perlbug-followup@perl.org> wrote:

> # New Ticket Created by  Zefram
> # Please include the string:  [perl #112054]
> # in the subject line of all future correspondence about this issue.
> # <URL: https://rt-archive.perl.org/perl5/Ticket/Display.html?id=112054 >
>
> This is a bug report for perl from zefram@fysh.org, generated with the help of perlbug 1.39 running under perl 5.14.2.
>
> -----------------------------------------------------------------
> [Please describe your issue here]
>
> The logic in File::Spec::Unix->catdir fails to take account of the double-slashes-special case in ->canonpath.  As a result it produces incorrect answers in cases like this:
>
> $ perl -MFile::Spec -lwe '$^O="qnx"; print File::Spec->catdir("/", "foo", "bar")'
> //foo/bar
>
> The correct result would be "/foo/bar", the same as on a non-double-slash platform.  You could get the right answer by using "///" instead of "/" as the separator in the join() expression.

Hi Zefram, I think this ticket should be closed as not-a-bug. catdir is documented as the opposite of splitdir. For the path you have entered splitdir outputs the following:

yorton@spud:~$ perl -MFile::Spec -MData::Dumper -e'my @dirs= File::Spec->splitdir("/foo/bar/"); print Data::Dumper::Dumper(\@dirs)'
$VAR1 = [
          '',
          'foo',
          'bar',
          ''
        ];

Therefore the input is incorrect, and the output is the expected result given the incorrect input. Changing this behavior has deep consequences on non-unix boxen (I have been bitten by this stuff before).

Specifically your expectation that "/" is a valid directory name is incorrect. It is not a valid directory name, it is a separator following the directory with the empty string as a name. (Any other definition falls on its face)
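The round trip being described can be sketched as follows. This is a minimal illustration that calls File::Spec::Unix directly, so the result does not depend on the host OS; the path is just an example.

```perl
use strict;
use warnings;
use File::Spec::Unix;

# splitdir marks the root with a leading empty string, not with "/".
my @dirs = File::Spec::Unix->splitdir('/foo/bar/');

# Feeding the same list back gives a path with a single leading slash.
my $joined = File::Spec::Unix->catdir(@dirs);

printf "%d elements, first is '%s'\n", scalar @dirs, $dirs[0];
print "$joined\n";   # /foo/bar
```

Under this reading, the empty string is the element that stands for the root, and passing "/" as an element is outside what splitdir would ever hand back.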

cheers, Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 12 years ago

From zefram@fysh.org

demerphq wrote:

> Hi Zefram, I think this ticket should be closed as not-a-bug. catdir is documented as the opposite of splitdir.

That's a good point, but one could also argue that splitdir needs to recognise "//". However, splitdir's treatment of interior double slashes is not only of this broken-looking form but also documented. Tricky.

> Specifically your expectation that "/" is a valid directory name is incorrect. It is not a valid directory name,

It's the output of canonpath for anything referring to the root directory. It patently is not only a valid directory name, but the *preferred* name for the directory it refers to.

-zefram

p5pRT commented 12 years ago

From @xdg

On Tue, Mar 27, 2012 at 12:38 PM, demerphq <demerphq@gmail.com> wrote:

> Hi Zefram, I think this ticket should be closed as not-a-bug.  catdir is documented as the opposite of splitdir. For the path you have entered splitdir outputs the following:

The *general* behavior is not a bug (though the docs could be more clear about what happens when you give a directory separator as one of the arguments).

However, the behavior is "qnx" specific. From File::Spec::Unix::canonpath:

  # Handle POSIX-style node names beginning with double slash (qnx, nto)
  # (POSIX says: "a pathname that begins with two successive slashes
  # may be interpreted in an implementation-defined manner, although
  # more than two leading slashes shall be treated as a single slash.")
  my $node = '';
  my $double_slashes_special = $^O eq 'qnx' || $^O eq 'nto';

On my linux system, I don't get double slashes:

  $ perl -MFile::Spec::Functions=catdir,canonpath -wE '$^O="qnx"; my $d = catdir("/", "foo", "bar"); say $d; say canonpath($d)'
  //foo/bar
  //foo/bar

  $ perl -MFile::Spec::Functions=catdir,canonpath -wE 'my $d = catdir("/", "foo", "bar"); say $d; say canonpath($d)'
  /foo/bar
  /foo/bar

p5pRT commented 12 years ago

From zefram@fysh.org

Eric Brine wrote:

> A note of caution: Double leading slashes is meaningful in Windows, so don't "fix" it for Windows too.

This raises a general issue with File::Spec: is the option of externally subclassing it part of the API? If it is, it becomes rather difficult to change any of the existing behaviour, especially of F:S:Unix. If external subclassing behaviour doesn't have to be maintained, we're free to fix things in F:S:Unix, as long as we modify the core F:S:* subclasses to maintain their own behaviour.

And a related question: is File::Spec intended to implement system X's filename semantics on system Y, or only the runtime host's filename semantics? It's certainly not doing the former 100%, for example with that double-slashes-special condition looking at $^O. And that impairs the subclassability of F:S:Unix. So this issue influences the preceding.

I'm actually working on making File::Spec perform better, because it's very hot in $ork application, via Cache::CacheUtils. Optimisation would be quite a bit easier if I didn't have to maintain external subclassability.

-zefram

p5pRT commented 12 years ago

From @demerphq

On 27 March 2012 18:54, Zefram <zefram@fysh.org> wrote:

> demerphq wrote:
>
> > Hi Zefram, I think this ticket should be closed as not-a-bug.  catdir is documented as the opposite of splitdir.
>
> That's a good point, but one could also argue that splitdir needs to recognise "//".  However, splitdir's treatment of interior double slashes is not only of this broken-looking form but also documented.  Tricky.
>
> > Specifically your expectation that "/" is a valid directory name is incorrect. It is not a valid directory name,
>
> It's the output of canonpath

On what OS? It won't be the output of canonpath on Windows, nor VMS.

On a side note, what I remember is that *nixans tend to not use splitpath() like they should before they use splitdir.

> for anything referring to the root directory. It patently is not only a valid directory name, but the *preferred* name for the directory it refers to.

I'm not going to argue that what you say here is not the conventional model, but it is IMO a broken one. Slashes are separators, not names.

cheers, Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 12 years ago

From @jandubois

On Tue, 27 Mar 2012, Zefram wrote:

> And a related question: is File::Spec intended to implement system X's filename semantics on system Y, or only the runtime host's filename semantics?

I believe it once was supposed to do that. E.g. abs2rel explicitly doesn't access the filesystem to make the conversion. Everything was supposed to be about the *specification* of volume, directory and filenames, not about actual file system characteristics.
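As a small illustration of that point, the sketch below calls abs2rel with two absolute paths that are assumed not to exist on the machine running it. The directory names are made up for the example.

```perl
use strict;
use warnings;
use File::Spec::Unix;

# Neither directory needs to exist: with two absolute paths the answer
# is computed from the strings alone.
my $rel = File::Spec::Unix->abs2rel('/no/such/dir/a/b', '/no/such/dir');

print "$rel\n";   # a/b
```

When the base is omitted or either argument is relative, the current working directory is consulted, so the purely string-based behaviour only holds for absolute inputs like these.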

But exceptions had to be made for VMS.

And tmpdir has to inspect environment variables.

But the biggest offender in my mind is case_tolerant() which has been pushed into File::Spec but really doesn't seem to belong at all. It is also overly simplistic, as you could potentially have both case sensitive and case insensitive segments in the path of a single file, depending on mount options and locations of mount points. But File::Spec::Unix just assumes that Unix systems are always case sensitive, which on OS X is most commonly not true.

Sorry for the rambling; but I think it shows that there is no overriding design objective to the module anymore.

Cheers, -Jan

p5pRT commented 12 years ago

From @ikegami

On Tue, Mar 27, 2012 at 12:54 PM, Zefram <zefram@fysh.org> wrote:

> demerphq wrote:
>
> > Specifically your expectation that "/" is a valid directory name is incorrect. It is not a valid directory name,
>
> It's the output of canonpath for anything referring to the root directory. It patently is not only a valid directory name, but the *preferred* name for the directory it refers to.

It's the path to the (unix) root directory. The root directory doesn't have a name. catpath takes a series of (system-independent) directory names, where the empty string is used to indicate the (otherwise nameless) root directory.

- Eric

p5pRT commented 12 years ago

From @ikegami

On Tue, Mar 27, 2012 at 1:06 PM, Zefram <zefram@fysh.org> wrote:

> And a related question: is File::Spec intended to implement system X's filename semantics on system Y, or only the runtime host's filename semantics?

I think so. I've never seen anything forbidding it, and I've seen people using that feature. The most notable is surely Path::Class.

p5pRT commented 12 years ago

From @xdg

On Tue, Mar 27, 2012 at 1:06 PM, Zefram <zefram@fysh.org> wrote:

> This raises a general issue with File::Spec: is the option of externally subclassing it part of the API?  If it is, it becomes rather difficult to change any of the existing behaviour, especially of F:S:Unix.  If external subclassing behaviour doesn't have to be maintained, we're free to fix things in F:S:Unix, as long as we modify the core F:S:* subclasses to maintain their own behaviour.

It's tricky because File::Spec follows Ken Williams's pattern where "use File::Spec" dynamically makes File::Spec a subclass of File::Spec::$OS. Otherwise, File::Spec itself has no behaviors.

So you can subclass, say, File::Spec::Unix and do whatever you need. Or you can unshift something onto @File::Spec::ISA for a global effect.
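A minimal sketch of the first option, subclassing File::Spec::Unix, might look like this. The package name `My::Spec::RootAware` and the policy it implements (treating a literal "/" first argument as the root marker) are assumptions made for illustration, not something proposed in this ticket.

```perl
use strict;
use warnings;

package My::Spec::RootAware;          # hypothetical subclass name
use parent 'File::Spec::Unix';

# Treat a literal "/" first argument as the root marker '' so that the
# joined string starts with one slash rather than two.
sub catdir {
    my ($self, @dirs) = @_;
    $dirs[0] = '' if @dirs && $dirs[0] eq '/';
    return $self->SUPER::catdir(@dirs);
}

package main;

print My::Spec::RootAware->catdir('/', 'foo', 'bar'), "\n";   # /foo/bar
print My::Spec::RootAware->catdir('foo', 'bar'), "\n";        # foo/bar
```

Because the override lives in its own class, code that calls File::Spec or File::Spec::Unix directly is unaffected, which is the trade-off against the global `@File::Spec::ISA` approach.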

> And a related question: is File::Spec intended to implement system X's filename semantics on system Y, or only the runtime host's filename semantics?

The typical pattern is to use the subclass directly: File::Spec::Unix->catdir(...)
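For example, a script can ask for another system's path syntax regardless of the host it runs on. This is a minimal sketch with made-up directory names.

```perl
use strict;
use warnings;
use File::Spec::Unix;
use File::Spec::Win32;

# Each class applies its own separator, whatever $^O happens to be.
my $unix = File::Spec::Unix->catdir('a', 'b');
my $win  = File::Spec::Win32->catdir('a', 'b');

print "$unix\n";   # a/b
print "$win\n";    # a\b
```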

> It's certainly not doing the former 100%, for example with that double-slashes-special condition looking at $^O.  And that impairs the subclassability of F:S:Unix.  So this issue influences the preceding.

The use of $^O in File::Spec::Unix is a design flaw, IMO. Those OSes should have been broken out into separate subclasses. (And probably still should be.)

> I'm actually working on making File::Spec perform better, because it's very hot in $ork application, via Cache::CacheUtils.  Optimisation would be quite a bit easier if I didn't have to maintain external subclassability.

I'd love to see some sort of caching optimization built into File::Spec itself, since I bet it's hot code in a lot of cases. I've heard that canonpath() would be good to optimize, as just about every other method uses it and most of the time it should be an identity operation.

-- David

p5pRT commented 12 years ago

From @craigberry

On Tue, Mar 27, 2012 at 12:55 PM, Eric Brine <ikegami@adaelis.com> wrote:

> On Tue, Mar 27, 2012 at 1:06 PM, Zefram <zefram@fysh.org> wrote:
>
> > And a related question: is File::Spec intended to implement system X's filename semantics on system Y, or only the runtime host's filename semantics?
>
> I think so. I've never seen anything forbidden it, and I've seen people using that feature. The most notable is surely Path::Class.

File::Spec's tests run all the variants on all platforms except the File::Spec::VMS tests, which can only be run on VMS. That's because File::Spec::VMS depends on VMS::Filespec, which is implemented in vms/vms.c and calls a lot of native services.

The real problem with the module is that whatever design integrity it once had has had holes shot through it by people patching various parts of it to solve various platform-specific problems. For example, canonpath was originally intended to eliminate redundant path elements (foo/../bar.dat --> bar.dat). But then someone decided it shouldn't do that on Unix because it might give the wrong answer if one of the eliminated elements was a symlink. But no analogous change was made on Win32 or VMS, so now the routine does entirely different and dissimilar things on different platforms and it's no longer clear what its purpose is.
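The divergence can be checked by giving both classes the same input. This is a minimal sketch; the Unix result is shown as a comment, while the Win32 result is printed rather than asserted, since per the description above it is expected to have the "foo/.." pair eliminated.

```perl
use strict;
use warnings;
use File::Spec::Unix;
use File::Spec::Win32;

my $input = 'foo/../bar.dat';

# Unix leaves ".." alone because "foo" might be a symlink.
my $unix = File::Spec::Unix->canonpath($input);

# Win32 performs its own logical cleanup of the same string.
my $win = File::Spec::Win32->canonpath($input);

print "Unix:  $unix\n";   # Unix:  foo/../bar.dat
print "Win32: $win\n";
```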

p5pRT commented 12 years ago

From @rurban

On Tue, Mar 27, 2012 at 12:31 PM, Jan Dubois <jand@activestate.com> wrote:

> On Tue, 27 Mar 2012, Zefram wrote:
>
> > And a related question: is File::Spec intended to implement system X's filename semantics on system Y, or only the runtime host's filename semantics?
>
> I believe it once was supposed to do that. E.g. abs2rel explicitly doesn't access the filesystem to make the conversion. Everything was supposed to be about the *specification* of volume, directory and filenames, not about actual file system characteristics.
>
> But exceptions had to be made for VMS.
>
> And tmpdir has to inspect environment variables.
>
> But the biggest offender in my mind is case_tolerant() which has been pushed into File::Spec but really doesn't seem to belong at all. It is also overly simplistic, as you could potentially have both case sensitive and case insensitive segments in the path of a single file, depending on mount options and locations of mount points. But File::Spec::Unix just assumes that Unix systems are always case sensitive, which on OS X is most commonly not true.
>
> Sorry for the rambling; but I think is shows that there is no overriding design objective to the module anymore.

Fully agree. case_tolerant is not OS but mount-point specific, and our File/Path libs do not care about mount options.

And Module::Build is only so slow because it abuses case_tolerant to its maximum - and because I made the call on Win32 so expensive.

-- Reini

p5pRT commented 12 years ago

From @wb8tyw

On 3/27/2012 12:06 PM, Zefram wrote:

> I'm actually working on making File::Spec perform better, because it's very hot in $ork application, via Cache::CacheUtils. Optimisation would be quite a bit easier if I didn't have to maintain external subclassability.

No good deed will go unpunished.

The problem with fixing "bugs" in File::Spec is that there is a body of code that expects the current behavior as a "Feature".

When working on updating Perl on VMS to handle the newer extended character set in both VMS and Unix syntax, much of my work outside of vms.c involved code that called File::Spec methods, and I noticed that quite a few of those calls are wrapped by tests for $^O, which indicates that they are not really portable.

It has been a few years since I directly looked at that section, so from memory:

I tried to introduce a fix to one of the tests in perl (I forget which one) that used File::Spec->splitpath() and catpath(), but it got reverted because the resulting code would not work correctly on Win32, unless Win32 called the same File::Spec methods differently.

Some of the problems seem to come from the fact that Win32 and VMS can have a relative directory path together with a volume specification, while a null directory path as the first directory to be concatenated was being used to indicate an absolute directory to catdir() and catpath().

VMS has things like VOL:[.DIR]FILE and vol:file, where portions of the file specification can be defaulted or covered by a logical name.

Win32 has x:dir\bar, with a different default directory for each volume letter.
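The Win32 case can be poked at from any host by calling File::Spec::Win32 directly. This is only a sketch: it prints what splitpath and file_name_is_absolute report for the drive-relative path mentioned above, without claiming what the directory component looks like.

```perl
use strict;
use warnings;
use File::Spec::Win32;

# A volume plus a path that is relative to that volume's current directory.
my $path = 'x:dir\bar';

my ($vol, $dir, $file) = File::Spec::Win32->splitpath($path);
print "volume=$vol directory=$dir file=$file\n";

# The volume alone does not make the path absolute.
print File::Spec::Win32->file_name_is_absolute($path)
    ? "absolute\n"
    : "relative\n";
```

Nothing in a plain list of directory names records which of these situations applies, which may be why a leading empty element ended up doing double duty as the "absolute" marker.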

Regards,
-John
wb8tyw@qsl.network
Personal Opinion Only

p5pRT commented 12 years ago

From @nwc10

On Tue, Mar 27, 2012 at 03:26:04PM -0400, David Golden wrote:

> On Tue, Mar 27, 2012 at 1:06 PM, Zefram <zefram@fysh.org> wrote:
>
> > It's certainly not doing the former 100%, for example with that double-slashes-special condition looking at $^O.  And that impairs the subclassability of F:S:Unix.  So this issue influences the preceding.
>
> The use of $^O in File::Spec::Unix is a design flaw, IMO. Those OSes should have been broken out into separate subclasses. (And probably still should be.)

Yes, they should. I remember them bugging me.

Do we know anyone using QNX or Neutrino?

> > I'm actually working on making File::Spec perform better, because it's very hot in $ork application, via Cache::CacheUtils.  Optimisation would be quite a bit easier if I didn't have to maintain external subclassability.
>
> I'd love to see some sort of caching optimization built into File::Spec itself, since I bet it's hot code in a lot of cases. I've heard that canonpath() would be good to optimize, as just about every other method uses it and most of the time it should be an identity operation.

Can't one just use Memoize on it?
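One cautious way to try that idea is to memoize a local wrapper rather than File::Spec's own method, which leaves the module and any subclasses untouched. The wrapper name `slow_canon` and the call counter are hypothetical, added only to make the caching visible.

```perl
use strict;
use warnings;
use File::Spec::Unix;
use Memoize;

my $calls = 0;   # counts how often the real canonpath is reached

# Hypothetical local wrapper; File::Spec itself is not modified.
sub slow_canon {
    my ($path) = @_;
    $calls++;
    return File::Spec::Unix->canonpath($path);
}

memoize('slow_canon');

# Repeated scalar-context calls with the same argument hit the cache.
my $clean;
$clean = slow_canon('/a//b/./c/') for 1 .. 3;

print "$clean\n";                    # /a/b/c
print "underlying calls: $calls\n";  # underlying calls: 1
```

Memoize keeps separate caches for scalar and list context, so a real integration would need to settle on one calling convention, and would also need a policy for `$^O`-dependent results.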

Nicholas Clark

p5pRT commented 12 years ago

From @nwc10

On Tue, Mar 27, 2012 at 02:49:02PM -0500, Craig A. Berry wrote:

> The real problem with the module is that whatever design integrity it once had has had holes shot through it by people patching various parts of it to solve various platform-specific problems. For example, canonpath was originally intended to eliminate redundant path elements (foo/../bar.dat --> bar.dat). But then someone decided it shouldn't do that on Unix because it might give the wrong answer if one of the eliminated elements was a symlink. But no analogous change was made

That change seems to predate the addition of File::Spec to the core in 1998

> on Win32 or VMS, so now the routine does entirely different and dissimilar things on different platforms and it's no longer clear what its purpose is.

The reasoning would have been that Win32 and VMS don't have symlinks, so it's safe to tidy foo/../ on them?

Nicholas Clark

p5pRT commented 12 years ago

From @demerphq

On 28 March 2012 12:25, Nicholas Clark <nick@ccl4.org> wrote:

> On Tue, Mar 27, 2012 at 02:49:02PM -0500, Craig A. Berry wrote:
>
> > The real problem with the module is that whatever design integrity it once had has had holes shot through it by people patching various parts of it to solve various platform-specific problems.  For example, canonpath was originally intended to eliminate redundant path elements (foo/../bar.dat --> bar.dat).  But then someone decided it shouldn't do that on Unix because it might give the wrong answer if one of the eliminated elements was a symlink.  But no analogous change was made
>
> That change seems to predate the addition of File::Spec to the core in 1998
>
> > on Win32 or VMS, so now the routine does entirely different and dissimilar things on different platforms and it's no longer clear what its purpose is.
>
> The reasoning would have been that Win32 and VMS don't have symlinks, so it's safe to tidy foo/../ on them?

They have junctions tho, which are pretty close to the same as a symlink, so I don't think that could be it.

Win32 actually has OS level equivalents for most of File::Spec which are available through Win32.pm afair.

cheers Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 12 years ago

From @wb8tyw

On 3/28/2012 6:31 AM, demerphq wrote:

> On 28 March 2012 12:25, Nicholas Clark <nick@ccl4.org> wrote:
>
> > On Tue, Mar 27, 2012 at 02:49:02PM -0500, Craig A. Berry wrote:
> >
> > > The real problem with the module is that whatever design integrity it once had has had holes shot through it by people patching various parts of it to solve various platform-specific problems. For example, canonpath was originally intended to eliminate redundant path elements (foo/../bar.dat --> bar.dat). But then someone decided it shouldn't do that on Unix because it might give the wrong answer if one of the eliminated elements was a symlink. But no analogous change was made
> >
> > That change seems to predate the addition of File::Spec to the core in 1998
> >
> > > on Win32 or VMS, so now the routine does entirely different and dissimilar things on different platforms and it's no longer clear what its purpose is.
> >
> > The reasoning would have been that Win32 and VMS don't have symlinks, so it's safe to tidy foo/../ on them?
>
> They have junctions tho, which are pretty close to the same as a symlink, so I dont think that could be it.
>
> Win32 actually has OS level equivalents for most of File::Spec which are available through Win32.pm afair.

VMS has logical names, which are used quite a bit like symlinks are used on Unix.

Calling unixify() on VMS can remove the logical name (a bug) so that calling vmsify() no longer returns the same path, which is pretty much the same issue with canonpath on Unix removing "foo/../" to cause a symbolic link to be skipped.

And since File::Spec::VMS originally called both of those methods routinely, the resulting specification from using split path and cat path on a directory could result in significantly different results.

Because I did not want to risk breaking existing programs that depended on the bugs, File::Spec::VMS now checks an environment variable to see if the VMS Extended character set is in use, and when it is, it no longer calls unixify() or vmsify(). Also when that same environment variable is set, unixify no longer incorrectly translates logical names.

With VMS 8.2, VMS does have symlinks.

Regards,
-John
wb8tyw@qsl.network
Personal Opinion Only

p5pRT commented 12 years ago

From @xdg

On Wed, Mar 28, 2012 at 7:31 AM, demerphq <demerphq@gmail.com> wrote:

> They have junctions tho, which are pretty close to the same as a symlink, so I dont think that could be it.

I suspect File::Spec's behavior was written before junctions were an "official" part of the OS. Ditto VMS' support for them.

That seems like bitrot in File::Spec unrelated to the bug itself, i.e. it should be a separate ticket.

The issue at hand for perl #112054 is why the heck is there special double slash handling for qnx and nto? And if they do need it, should there be separate modules for those OSes so people can more cleanly subclass File::Spec?

-- David

p5pRT commented 12 years ago

From @jandubois

On Wed, 28 Mar 2012, demerphq wrote:

> On 28 March 2012 12:25, Nicholas Clark <nick@ccl4.org> wrote:
>
> > On Tue, Mar 27, 2012 at 02:49:02PM -0500, Craig A. Berry wrote:
> >
> > > The real problem with the module is that whatever design integrity it once had has had holes shot through it by people patching various parts of it to solve various platform-specific problems.  For example, canonpath was originally intended to eliminate redundant path elements (foo/../bar.dat --> bar.dat).  But then someone decided it shouldn't do that on Unix because it might give the wrong answer if one of the eliminated elements was a symlink.  But no analogous change was made
> >
> > That change seems to predate the addition of File::Spec to the core in 1998
> >
> > > on Win32 or VMS, so now the routine does entirely different and dissimilar things on different platforms and it's no longer clear what its purpose is.
> >
> > The reasoning would have been that Win32 and VMS don't have symlinks, so it's safe to tidy foo/../ on them?
>
> They have junctions tho, which are pretty close to the same as a symlink, so I dont think that could be it.

Junctions have been added to NTFS in Windows 2000, but they aren't that close to symlinks: they can only point to directories on the local volumes. Junctions are implemented via NTFS reparse points, which have mainly been added to support mount points for hierarchical storage systems.

NTFS on Windows Vista and later implements another kind of linking (again via reparse points) that are actually called symlinks and are supposed to be "just like on Unix to support migration to Windows". They do allow links to files, and even to non-existing remote targets. But you have to specify the type (file/directory) when you create them, and by default only processes running with elevated administrator privileges can create them. So I think they still miss the mark by quite a bit...

Cheers, -Jan

p5pRT commented 12 years ago

From zefram@fysh.org

David Golden wrote:

> The issue at hand for perl #112054 is why the heck is there special double slash handling for qnx and nto?

It's a scheme to allow pathnames to refer to filesystems of other machines, in a way that's distinct from the entire local filesystem. So "//brillig/etc/passwd" refers to the /etc/passwd file on the machine known as "brillig". The other scheme for this that used to have some currency is to make "/.." refer to a "superroot" directory, notionally located above the local root. So "/../brillig/etc/passwd" could serve the same function. POSIX endorsed the "//" scheme, by including specific wording to permit a sequence of exactly two leading slashes to be special. It requires "///" and "/.." to refer to the same thing as "/".
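The leading-slash rule can be written down as a tiny classifier. This is a minimal sketch of the rule as summarised above, with a hypothetical function name; it does not model the "/.." part or what any particular implementation does with the two-slash case.

```perl
use strict;
use warnings;

# Classify a pathname by its number of leading slashes:
# none = relative, exactly two = implementation-defined, otherwise root.
sub leading_slash_class {
    my ($path) = @_;
    my ($slashes) = $path =~ m{\A(/*)};
    my $n = length $slashes;
    return $n == 0 ? 'relative'
         : $n == 2 ? 'implementation-defined'
         :           'root';
}

for my $p ('etc/passwd', '/etc/passwd', '//brillig/etc/passwd', '///etc/passwd') {
    printf "%-24s %s\n", $p, leading_slash_class($p);
}
```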

Of course, the modern way is to mount a virtual filesystem *under* the local root directory. So "/net/brillig/etc/passwd" is a common arrangement. This is syntactically perfectly well behaved, requiring no special rules at all.

-zefram

p5pRT commented 12 years ago

From @xdg

On Wed, Mar 28, 2012 at 4:38 PM, Zefram <zefram@fysh.org> wrote:

> It's a scheme to allow pathnames to refer to filesystems of other machines, in a way that's distinct from the entire local filesystem. So "//brillig/etc/passwd" refers to the /etc/passwd file on the machine known as "brillig".

Ah. It's like "\\brillig\path\somewhere" on Windows.

So... would splitting the qnx/nto behavior out to subclasses "resolve" this bug?

-- David

p5pRT commented 12 years ago

From zefram@fysh.org

David Golden wrote:

> So... would splitting the qnx/nto behavior out to subclasses "resolve" this bug?

No. That subclass would exhibit the bug. It's the behaviour for qnx/nto that matters, not how it's organised.

-zefram

p5pRT commented 12 years ago

From @nwc10

On Wed, Mar 28, 2012 at 09:38:41PM +0100, Zefram wrote:

> Of course, the modern way is to mount a virtual filesystem *under* the local root directory. So "/net/brillig/etc/passwd" is a common arrangement.
> This is syntactically perfectly well behaved, requiring
>         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> no special rules at all.

Your wording being (as usual) very carefully chosen, because of course this permits one to mount a case insensitive file system within a case sensitive file system. Which in turn means that the question "is this path case sensitive?" is not a good question to ask expecting "yes" or "no", as the answer of "this bit is, that bit is not" is not usually something programmers expect.

But this is a digression from the subject.

[But not a digression from File::Spec, as it provides case_tolerant() and pretends that it can give a meaningful answer]

Nicholas Clark

p5pRT commented 12 years ago

From @demerphq

On 29 March 2012 10:29, Nicholas Clark <nick@ccl4.org> wrote:

> On Wed, Mar 28, 2012 at 09:38:41PM +0100, Zefram wrote:
>
> > Of course, the modern way is to mount a virtual filesystem *under* the local root directory.  So "/net/brillig/etc/passwd" is a common arrangement.
> > This is syntactically perfectly well behaved, requiring
> >         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > no special rules at all.
>
> Your wording being (as usual) very carefully chosen, because of course this permits one to mount a case insensitive file system within a case sensitive file system. Which in turn means that the question "is this path case sensitive?" is not a good question to ask expecting "yes" or "no", as the answer of "this bit is, that bit is not" is not usually something programmers expect.
>
> But this is a digression from the subject.
>
> [But not a digression from File::Spec, as it provides case_tolerant() and pretends that it can give a meaningful answer]

Let's be careful we don't fall into the "perfection is the enemy of good enough" pitfall.

In other words, yes it might be true that a function like case_tolerant() fails to handle various edge cases, and yes it might be the case that one simply *can't* shoe-horn reality into the abstraction model that case_tolerant() exists within. But that does not rule out case_tolerant() performing a useful service most of the time for most of its users.

Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 12 years ago

From @ap

* demerphq <demerphq@gmail.com> [2012-03-29 10:55]:

> Lets be careful we dont fall into the "perfection is the enemy of good enough" pitfall.
>
> In other words, yes it might be true that a function like case_tolerant() fails to handle various edge cases, and yes it might be the case that one simply *cant* shoe-horn reality into the abstraction model that case_tolerant() exists within. But that does not rule out case_tolerant() performing a useful service most of the time for most of its users.

I guess the simplest route to making the API allow for perfection is to redefine case_tolerant as asking “are there *any* case tolerant parts in the path” and adding a function case_sensitive to ask “are there any case *in*tolerant parts in the path” – which can then both be true at the same time, though one of them can be false (but at least one of them has to be true). With that change, existing code can continue to “work” no worse than before, while it awaits fixing to ask both questions. And the API for simple cases remains easy (two predicates to check in a conditional, instead of receiving and testing some kind of compound data structure) – only code that wants to 100% handle all cases in all cases has to use a more complex interface.
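As a rough sketch of how the two predicates could relate, the code below models a path as a list of segments that each carry a case-tolerance flag. Everything here is hypothetical: File::Spec has no such per-segment data, the function names are invented, and how the flags would be obtained (for instance from mount information) is left out.

```perl
use strict;
use warnings;

# Hypothetical model: each segment is [name, tolerant?].
# "Are there any case tolerant parts in the path?"
sub any_case_tolerant {
    my @segments = @_;
    return scalar(grep { $_->[1] } @segments) ? 1 : 0;
}

# "Are there any case intolerant parts in the path?"
sub any_case_sensitive {
    my @segments = @_;
    return scalar(grep { !$_->[1] } @segments) ? 1 : 0;
}

# A case-insensitive volume mounted inside a case-sensitive tree.
my @mixed = (['mnt', 0], ['usb', 0], ['Photos', 1], ['img.jpg', 1]);

printf "tolerant=%d sensitive=%d\n",
    any_case_tolerant(@mixed), any_case_sensitive(@mixed);   # tolerant=1 sensitive=1
```

For a non-empty path at least one of the two answers is true, and both are true exactly in the mixed case that a single yes/no predicate cannot express.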

Plus, an apparently-oxymoronic API just fits Perl’s character. :-)

Regards,
-- Aristotle Pagaltzis // <http://plasmasturm.org/>