Closed p5pRT closed 16 years ago
Under Windows (using File::Spec::Win32)\, the call File::Spec->canonpath('a\\..\\..\\b\\c') returns the wrong value: 'b\\c'. It should return '..\\b\\c'.
This bug first appeared in the Perl 5.8.1 release; Perl 5.8.0 does not have this problem.
In Perl 5.8.0\, the behavior of canonpath() for Win32.pm was very similar to that for Unix.pm: no ".." directory entries were removed from the path. For UNIX systems this is appropriate\, as symbolic and hard links may make "a/b/../c" different than "a/c".
In Perl 5.8.1\, a number of multiple-dot transformations were added to the Win32 support: changing "a\\...\\b" into "a\\..\\..\\b"\, removing redundant double-dots\, etc. If all paths handed to canonpath() are absolute paths\, then it works correctly\, but if the path is a relative path\, it may make a mistake when trying to eliminate the ".." path elements -- apparently it sometimes treats a relative path as absolute\, and since '\\..\\a' is the same as 'a' on Windows\, it was changing '..\\a' to 'a' as well in some situations.
Compounding the problem is that canonpath() is used by other functions as well\, so catfile() and catpath() also suffer errors.
Attached is acontext-diff of the changes I have made and tested for the canonpath() function in lib/File/Spec/Win32.pm\, and additional tests for the problems (and fix) in lib/File/Spec/t/Spec.t.
###################################################################
[tomdinger - Tue Feb 24 10:15:24 2004]:
Under Windows (using File::Spec::Win32)\, the call File::Spec->canonpath('a\\..\\..\\b\\c') returns the wrong value: 'b\\c'. It should return '..\\b\\c'.
Hey Ken\, want to have a look at this? There's a patch and everything!
The RT System itself - Status changed from 'new' to 'open'
On Jul 6\, 2005\, at 11:29 AM\, Michael G Schwern via RT wrote:
[tomdinger - Tue Feb 24 10:15:24 2004]:
Under Windows (using File::Spec::Win32)\, the call File::Spec->canonpath('a\\..\\..\\b\\c') returns the wrong value: 'b\\c'. It should return '..\\b\\c'.
Hey Ken\, want to have a look at this? There's a patch and everything!
If Win32 is going to be handled like Unix\, then it should return the input verbatim (".." is not cleaned up). Is there a reason it should be different on Win32?
-Ken
On Wed\, Jul 06\, 2005 at 02:06:19PM -0500\, Ken Williams wrote:
On Jul 6\, 2005\, at 11:29 AM\, Michael G Schwern via RT wrote:
[tomdinger - Tue Feb 24 10:15:24 2004]:
Under Windows (using File::Spec::Win32)\, the call File::Spec->canonpath('a\\..\\..\\b\\c') returns the wrong value: 'b\\c'. It should return '..\\b\\c'.
Hey Ken\, want to have a look at this? There's a patch and everything!
If Win32 is going to be handled like Unix\, then it should return the input verbatim (".." is not cleaned up). Is there a reason it should be different on Win32?
That's not the issue. b\c is right out wrong. Examine the path closer.
a\..\..\b\c
"a\.." cancels out leaving "..\b\c".
Or\, if you don't want to clean up .. it can be a\..\..\b\c. But b\c is clearly wrong.
PS Offhand the reason I can think for cleaning up .. on Win32 is because it tends to not have symlinks to worry about so a\..\a\ should equal a\
-- Michael G Schwern schwern@pobox.com http://www.pobox.com/~schwern Just call me 'Moron Sugar'. http://www.somethingpositive.net/sp05182002.shtml
On Jul 6\, 2005\, at 2:50 PM\, Michael G Schwern wrote:
On Wed\, Jul 06\, 2005 at 02:06:19PM -0500\, Ken Williams wrote:
On Jul 6\, 2005\, at 11:29 AM\, Michael G Schwern via RT wrote:
[tomdinger - Tue Feb 24 10:15:24 2004]:
Under Windows (using File::Spec::Win32)\, the call File::Spec->canonpath('a\\..\\..\\b\\c') returns the wrong value: 'b\\c'. It should return '..\\b\\c'.
Hey Ken\, want to have a look at this? There's a patch and everything!
If Win32 is going to be handled like Unix\, then it should return the input verbatim (".." is not cleaned up). Is there a reason it should be different on Win32?
That's not the issue. b\c is right out wrong. Examine the path closer.
a\\\.\.\\\.\.\\b\\c
"a\.." cancels out leaving "..\b\c".
Right\, I understand the bug\, I'm just trying to get the right fix.
Note that I suggested that it return that particular input string
verbatim.
PS Offhand the reason I can think for cleaning up .. on Win32 is because it tends to not have symlinks to worry about so a\..\a\ should equal a\
"tends not to" isn't a good enough reason in this case. If it treats them differently\, though\, that might be a good enough reason.
-Ken
On Jul 6\, 2005\, at 4:00 PM\, Dinger\, Tom wrote:
To be perfectly honest\, I don't care which way it is "fixed"\, as long as the result still points to the right file.
Of course. That's what I'm asking: is "bar" guaranteed on Windows to be "the right file" when the input is "foo\\..\\bar"? On Unix\, it's not.
I've never disputed the bug. The current behavior is obviously wrong.
-Ken
On approximately 7/6/2005 1:10 PM\, came the following characters from the keyboard of Ken Williams:
On Jul 6\, 2005\, at 2:50 PM\, Michael G Schwern wrote:
That's not the issue. b\c is right out wrong. Examine the path closer.
a\\\.\.\\\.\.\\b\\c
"a\.." cancels out leaving "..\b\c".
Right\, I understand the bug\, I'm just trying to get the right fix. Note that I suggested that it return that particular input string verbatim.
PS Offhand the reason I can think for cleaning up .. on Win32 is because it tends to not have symlinks to worry about so a\..\a\ should equal a\
"tends not to" isn't a good enough reason in this case. If it treats them differently\, though\, that might be a good enough reason.
So how do Windows' "shortcuts" fit into this? They act a lot like softlinks interactively in Windows Explorer (but not\, AFAICT\, from the file system)?
And then the newer versions of NTFS support "reparse points" and "junction points". How do they fit into this? I have no experience with these\, but it sounds like they act a lot more like Unix links.
Hence potentially making a\.. elimination problematical. Which may bea good reason for name canonicalization on Windows to not attempt to remove .. sequences... there is lots of Windows code out there that does do that\, of course\, I've written some of it myself... but as Windows changes\, compatibility to itself is not guaranteed.
Having identified a vast realm of ignorance\, Wolfram is saying that much of this realm lies forever outside the light cone of human knowledge. -- Michael Swaine\, Dr Dobbs Journal\, Sept 2002
To be perfectly honest\, I don't care which way it is "fixed"\, as long as the result still points to the right file.
And now some history: the version of File::Spec::Win32 in Perl 5.8.0 did no ".." processing in canonpath()\, and that was fine. As of Perl 5.8.1\, the canonpath() started doing the broken ".." processing. So\, this patch assumed that the ".." simplification was desired\, and fixed that.
Either roll back that method to the 5.8.0 version\, or apply the patch. The current behavior is the worst of the three choices\, IMHO ...
TD
-----Original Message----- From: Ken Williams To: Michael G Schwern Cc: Tom.Dinger@Scansoft.com; perlbug-followup@perl.org Sent: 7/6/2005 4:10 PM Subject: Re: [perl #27052] File::Spec->canonpath("a\\..\\..\\b") returns wrong value for Win 32
On Jul 6\, 2005\, at 2:50 PM\, Michael G Schwern wrote:
On Wed\, Jul 06\, 2005 at 02:06:19PM -0500\, Ken Williams wrote:
On Jul 6\, 2005\, at 11:29 AM\, Michael G Schwern via RT wrote:
[tomdinger - Tue Feb 24 10:15:24 2004]:
Under Windows (using File::Spec::Win32)\, the call File::Spec->canonpath('a\\..\\..\\b\\c') returns the wrong value: 'b\\c'. It should return '..\\b\\c'.
Hey Ken\, want to have a look at this? There's a patch and everything!
If Win32 is going to be handled like Unix\, then it should return the input verbatim (".." is not cleaned up). Is there a reason it should be different on Win32?
That's not the issue. b\c is right out wrong. Examine the path closer.
a\\\.\.\\\.\.\\b\\c
"a\.." cancels out leaving "..\b\c".
Right\, I understand the bug\, I'm just trying to get the right fix.
Note that I suggested that it return that particular input string
verbatim.
PS Offhand the reason I can think for cleaning up .. on Win32 is because it tends to not have symlinks to worry about so a\..\a\ should equal a\
"tends not to" isn't a good enough reason in this case. If it treats them differently\, though\, that might be a good enough reason.
-Ken
Glenn Linderman wrote:
So how do Windows' "shortcuts" fit into this? They act a lot like softlinks interactively in Windows Explorer (but not\, AFAICT\, from the file system)?
Shortcuts are an abomination; don't try and support them (AFAIK only Explorer and newer versions of Office can deal with them).
And then the newer versions of NTFS support "reparse points" and "junction points". How do they fit into this? I have no experience with these\, but it sounds like they act a lot more like Unix links.
NTFS has had hard links for files for some time\, but only with a developer tool. Junctions are more like soft links\, but only for directories (including remote folders on different machines). Reparse points are filesystem objects that give additional behaviors to files and folders (it is how they implemented junctions actually). None of which has any bearing on software running on top of NTFS\, which has no reason to know that there is anything special about these files (in that sense they are more like hardlinks under *nix).
/These aren't the droids we're looking for. Move along\, move along!/
John
-- John Peacock Director of Information Research and Technology Rowman & Littlefield Publishing Group 4720 Boston Way Lanham\, MD 20706 301-459-3366 x.5010 fax 301-429-5747
On 7/6/05\, Ken Williams \ken@​mathforum\.org wrote:
On Jul 6\, 2005\, at 4:00 PM\, Dinger\, Tom wrote:
To be perfectly honest\, I don't care which way it is "fixed"\, as long as the result still points to the right file.
Of course. That's what I'm asking: is "bar" guaranteed on Windows to be "the right file" when the input is "foo\\..\\bar"? On Unix\, it's not.
I've never disputed the bug. The current behavior is obviously wrong.
Im not sure if this is useful\, but many of the things that File::Spec tries to do on win32 are actually supported directly by the Win32 API. IMO at least some of File::Spec's behaviour could take advantage of this API.
Win32::GetFullPathName() is the one i have in mind when I say this.
perl -e "use Win32; print Win32::GetFullPathName(qq[foo\\..\\bar]);"
outputs "CWD\bar".
So if you strip off the CWD from the result of Win32::GetFullPathName() you get the OS'es solution of this problem\, which should bypass all of these issues. Ie\, crudely:
use Win32; use Cwd qw(cwd);
sub canonpath {
my $path=shift;
$ret=Win32::GetFullPathName($path);
if ($path!~/^\w:/) {
(my $cwd=cwd()."/")=~s!/!\\!g;
while (substr($cwd\,0\,1) eq substr($ret\,0\,1)) {
substr($cwd\,0\,1\,"");
substr($ret\,0\,1\,"");
}
}
return $ret;
}
print canonpath("foo/../../../bar/baz");
BTW\, i say crudely\, because I dont think that canonpath is very well defined. Is it a relative path or not? Should it function differently?
Likewise\, File::Spec::rel2abs() should be rewritten to be a passthrough to Win32::GetFullPathName().
Anyway\, the point is that using Win32::GetFullPathName() is available to resolve a bunch of these issues (at least as far as Win32 goes).
Yves
-- perl -Mre=debug -e "/just|another|perl|hacker/"
On Jul 8\, 2005\, at 8:51 AM\, yves orton via RT wrote:
Im not sure if this is useful\, but many of the things that File::Spec tries to do on win32 are actually supported directly by the Win32 API. IMO at least some of File::Spec's behaviour could take advantage of this API.
Yeah\, very true. However\, the Win32 docs say that GetFullPathName()
first appeared in perl 5.6\, so before that we'd still need to emulate.
Alternatively\, we could add an XS implementation of it for 5.005 on
Windows\, since the underlying C API should be available.
BTW\, i say crudely\, because I dont think that canonpath is very well defined. Is it a relative path or not?
If the input is relative\, the output is relative. If the input is absolute\, the output is absolute.
Likewise\, File::Spec::rel2abs() should be rewritten to be a passthrough to Win32::GetFullPathName().
True. The only problem is that it'll make it harder for me to test\, because I don't test on Windows.
Your basic point is well-taken\, though - we should indeed use Win32::* stuff whenever appropriate.
-Ken
Ken Williams \ken@​mathforum\.org writes:
On Jul 6\, 2005\, at 11:29 AM\, Michael G Schwern via RT wrote:
[tomdinger - Tue Feb 24 10:15:24 2004]:
Under Windows (using File::Spec::Win32)\, the call File::Spec->canonpath('a\\..\\..\\b\\c') returns the wrong value: 'b\\c'. It should return '..\\b\\c'.
Hey Ken\, want to have a look at this? There's a patch and everything!
If Win32 is going to be handled like Unix\, then it should return the input verbatim (".." is not cleaned up). Is there a reason it should be different on Win32?
Without (sym)links Win32 has simpler .. semantics than Unix. But there are still pitfalls crossing "mount" points.
-Ken
On Fri\, Jul 08\, 2005 at 03:50:49PM +0200\, demerphq wrote:
Im not sure if this is useful\, but many of the things that File::Spec tries to do on win32 are actually supported directly by the Win32 API. IMO at least some of File::Spec's behaviour could take advantage of this API.
Win32::GetFullPathName() is the one i have in mind when I say this.
perl -e "use Win32; print Win32::GetFullPathName(qq[foo\\..\\bar]);"
outputs "CWD\bar".
So if you strip off the CWD from the result of Win32::GetFullPathName() you get the OS'es solution of this problem\, which should bypass all of these issues.
Does it? It still leaves us asking the question: can we assume foo\..\bar == bar on Windows? Just because a system call does it that way doesn't mean its right.
Anyhow\, should this discussion drag on any longer without resolution there's a simple yardstick to use: Which retains the most information? Not collapsing .. does. So given that the current implementation is clearly wrong\, and if we can't decide between the two right implementations\, pick the one that's safest to at least give a correct answer. Then we can discuss some more and maybe have switch to the other one.
-- Michael G Schwern schwern@pobox.com http://www.pobox.com/~schwern 'All anyone gets in a mirror is themselves\,' she said. 'But what you gets in a good gumbo is everything.' -- "Witches Abroad" by Terry Prachett
On 7/8/05\, Michael G Schwern \schwern@​pobox\.com wrote:
On Fri\, Jul 08\, 2005 at 03:50:49PM +0200\, demerphq wrote:
Im not sure if this is useful\, but many of the things that File::Spec tries to do on win32 are actually supported directly by the Win32 API. IMO at least some of File::Spec's behaviour could take advantage of this API.
Win32::GetFullPathName() is the one i have in mind when I say this.
perl -e "use Win32; print Win32::GetFullPathName(qq[foo\\..\\bar]);"
outputs "CWD\bar".
So if you strip off the CWD from the result of Win32::GetFullPathName() you get the OS'es solution of this problem\, which should bypass all of these issues.
Does it? It still leaves us asking the question: can we assume foo\..\bar == bar on Windows? Just because a system call does it that way doesn't mean its right.
Well\, i suppose you are correct. Im not entirely sure what scenario I should be testing here\, but i beleive the problem you are thinking of is due to symlinks to a directory? If so then the win32 equivelent would be a junction I think and in that case yes\, foo\..\bar == bar.
D:\dev\junct>junction foo
Junction v1.03 - Win2K junction creator and reparse point viewer Copyright (C) 2000-2002 Mark Russinovich Systems Internals - http://www.sysinternals.com
D:\dev\junct\foo: JUNCTION Substitute Name: d:\dev\test
D:\dev\junct>echo Test1 > foo\..\test.echo
D:\dev\junct>type test.echo Test1
D:\dev\junct>cd ..
D:\dev>echo test2 > test\..\test.echo
D:\dev>type test.echo test2
Anyhow\, should this discussion drag on any longer without resolution there's a simple yardstick to use: Which retains the most information? Not collapsing .. does. So given that the current implementation is clearly wrong\, and if we can't decide between the two right implementations\, pick the one that's safest to at least give a correct answer. Then we can discuss some more and maybe have switch to the other one.
Sure. My comment was mostly that by using the Win32 API's one can do this type of stuff more reliably than with File::Spec. With the emphasis intended to be more on "this type of stuff" than on the particular problem that lead to this thread.
To be honest i would really like to see the expected behaviour of canonpath when called on a relative path explicitly defined. IMO canonpath should act more like rel2abs\, insofar as it should support an optional $base argument to use instead of CWD when trying to clean up a relative path.
-- perl -Mre=debug -e "/just|another|perl|hacker/"
On Fri\, Jul 08\, 2005 at 11:07:22PM +0200\, demerphq wrote:
should be testing here\, but i beleive the problem you are thinking of is due to symlinks to a directory? If so then the win32 equivelent would be a junction I think and in that case yes\, foo\..\bar == bar.
I have the creeping feeling that there's an argument to be made here\, but I never fully understood the symlinks vs .. cannonpath argument so I'll hope someone else jumps in and makes it.
To be honest i would really like to see the expected behaviour of canonpath when called on a relative path explicitly defined. IMO canonpath should act more like rel2abs\, insofar as it should support an optional $base argument to use instead of CWD when trying to clean up a relative path.
canonpath() should never be inserting the CWD when cleaning up. "./bar" is the same as "bar" but "$CWD/bar" is not! Its important that cannonical relative paths remain relative.
-- Michael G Schwern schwern@pobox.com http://www.pobox.com/~schwern 'All anyone gets in a mirror is themselves\,' she said. 'But what you gets in a good gumbo is everything.' -- "Witches Abroad" by Terry Prachett
On 7/8/05\, Michael G Schwern \schwern@​pobox\.com wrote:
On Fri\, Jul 08\, 2005 at 11:07:22PM +0200\, demerphq wrote:
should be testing here\, but i beleive the problem you are thinking of is due to symlinks to a directory? If so then the win32 equivelent would be a junction I think and in that case yes\, foo\..\bar == bar.
I have the creeping feeling that there's an argument to be made here\, but I never fully understood the symlinks vs .. cannonpath argument so I'll hope someone else jumps in and makes it.
Me too. Not being all that familiar with *nix file systems I made my best stab at an answer as it seems to pertain to Win32. I think John Peacock also said something to this effect.
To be honest i would really like to see the expected behaviour of canonpath when called on a relative path explicitly defined. IMO canonpath should act more like rel2abs\, insofar as it should support an optional $base argument to use instead of CWD when trying to clean up a relative path.
canonpath() should never be inserting the CWD when cleaning up. "./bar" is the same as "bar" but "$CWD/bar" is not! Its important that cannonical relative paths remain relative.
Sorry\, i guess I didnt express myself properly. You cant clean up a relative path properly without knowing where it is relative to. Consider the following path:
..\..\foo
If we are in \bar then ..\..\foo is the same as ..\foo and \foo but if we are in \bar\baz\bat then its not the same as either as it maps to \bar\foo.
yves
-- perl -Mre=debug -e "/just|another|perl|hacker/"
On approximately 7/8/2005 1:53 PM\, came the following characters from the keyboard of Michael G Schwern:
On Fri\, Jul 08\, 2005 at 03:50:49PM +0200\, demerphq wrote:
Im not sure if this is useful\, but many of the things that File::Spec tries to do on win32 are actually supported directly by the Win32 API. IMO at least some of File::Spec's behaviour could take advantage of this API.
Win32::GetFullPathName() is the one i have in mind when I say this.
perl -e "use Win32; print Win32::GetFullPathName(qq[foo\\..\\bar]);"
outputs "CWD\bar".
So if you strip off the CWD from the result of Win32::GetFullPathName() you get the OS'es solution of this problem\, which should bypass all of these issues.
Does it? It still leaves us asking the question: can we assume foo\..\bar == bar on Windows? Just because a system call does it that way doesn't mean its right.
In the presence of junction points\, I think not. The system call would have the opportunity to understand and check for the existance of junction points\, but whether it does or not\, is anyone's guess\, without trying it.
Anyhow\, should this discussion drag on any longer without resolution there's a simple yardstick to use: Which retains the most information? Not collapsing .. does. So given that the current implementation is clearly wrong\, and if we can't decide between the two right implementations\, pick the one that's safest to at least give a correct answer. Then we can discuss some more and maybe have switch to the other one.
Having identified a vast realm of ignorance\, Wolfram is saying that much of this realm lies forever outside the light cone of human knowledge. -- Michael Swaine\, Dr Dobbs Journal\, Sept 2002
On 7/8/05\, Glenn Linderman \perl@​nevcal\.com wrote:
On approximately 7/8/2005 1:53 PM\, came the following characters from the keyboard of Michael G Schwern:
On Fri\, Jul 08\, 2005 at 03:50:49PM +0200\, demerphq wrote:
Im not sure if this is useful\, but many of the things that File::Spec tries to do on win32 are actually supported directly by the Win32 API. IMO at least some of File::Spec's behaviour could take advantage of this API.
Win32::GetFullPathName() is the one i have in mind when I say this.
perl -e "use Win32; print Win32::GetFullPathName(qq[foo\\..\\bar]);"
outputs "CWD\bar".
So if you strip off the CWD from the result of Win32::GetFullPathName() you get the OS'es solution of this problem\, which should bypass all of these issues.
Does it? It still leaves us asking the question: can we assume foo\..\bar == bar on Windows? Just because a system call does it that way doesn't mean its right.
In the presence of junction points\, I think not. The system call would have the opportunity to understand and check for the existance of junction points\, but whether it does or not\, is anyone's guess\, without trying it.
Doesn't look like there is a problem with this. I guess ".." is resolved by the OS relative to "."\, and not a hard link to a specific directory. (But that is pure speculation.)
D:\dev\junct\foo>perl -MWin32 -e"print Win32::GetFullPathName('.');" D:\dev\junct\foo D:\dev\junct\foo>cd ..\..\test
D:\dev\test>perl -MWin32 -e"print Win32::GetFullPathName('.');" D:\dev\test D:\dev\test>cd ..
D:\dev>junction junct\foo
Junction v1.03 - Win2K junction creator and reparse point viewer Copyright (C) 2000-2002 Mark Russinovich Systems Internals - http://www.sysinternals.com
D:\dev\junct\foo: JUNCTION Substitute Name: d:\dev\test
-- perl -Mre=debug -e "/just|another|perl|hacker/"
On approximately 7/8/2005 2:07 PM\, came the following characters from the keyboard of demerphq:
On 7/8/05\, Michael G Schwern \schwern@​pobox\.com wrote:
On Fri\, Jul 08\, 2005 at 03:50:49PM +0200\, demerphq wrote:
Im not sure if this is useful\, but many of the things that File::Spec tries to do on win32 are actually supported directly by the Win32 API. IMO at least some of File::Spec's behaviour could take advantage of this API.
Win32::GetFullPathName() is the one i have in mind when I say this.
perl -e "use Win32; print Win32::GetFullPathName(qq[foo\\..\\bar]);"
outputs "CWD\bar".
So if you strip off the CWD from the result of Win32::GetFullPathName() you get the OS'es solution of this problem\, which should bypass all of these issues.
Does it? It still leaves us asking the question: can we assume foo\..\bar == bar on Windows? Just because a system call does it that way doesn't mean its right.
Well\, i suppose you are correct. Im not entirely sure what scenario I should be testing here\, but i beleive the problem you are thinking of is due to symlinks to a directory? If so then the win32 equivelent would be a junction I think and in that case yes\, foo\..\bar == bar.
D:\dev\junct>junction foo
Junction v1.03 - Win2K junction creator and reparse point viewer Copyright (C) 2000-2002 Mark Russinovich Systems Internals - http://www.sysinternals.com
D:\dev\junct\foo: JUNCTION Substitute Name: d:\dev\test
D:\dev\junct>echo Test1 > foo\..\test.echo
D:\dev\junct>type test.echo Test1
D:\dev\junct>cd ..
D:\dev>echo test2 > test\..\test.echo
D:\dev>type test.echo test2
OK\, you have just proven that the file is the same whether accessed by one name or the other. Now that you have the junction set up\, how about reporting on the results of:
d: md \dev\junct\bar cd \dev\junct cd foo\..\bar cd .. cd foo cd .. cd \dev\test cd foo\..\bar cd .. cd foo cd ..
showing all the prompts that display the current path name in the sequence?
And even then\, one would have to test the function of GetFullPathName separately\, but if a discrepancy shows up in the above test\, then foo\..\bar != bar all the time.
Anyhow\, should this discussion drag on any longer without resolution there's a simple yardstick to use: Which retains the most information? Not collapsing .. does.
Actually\, if you knew that GetFullPathName was sensitive to and properly handled junction points\, and it collapsed the name\, you would get more information that way.
So given that the current implementation is clearly wrong\, and if we can't decide between the two right implementations\, pick the one that's safest to at least give a correct answer. Then we can discuss some more and maybe have switch to the other one.
Sure. My comment was mostly that by using the Win32 API's one can do this type of stuff more reliably than with File::Spec. With the emphasis intended to be more on "this type of stuff" than on the particular problem that lead to this thread.
To be honest i would really like to see the expected behaviour of canonpath when called on a relative path explicitly defined. IMO canonpath should act more like rel2abs\, insofar as it should support an optional $base argument to use instead of CWD when trying to clean up a relative path.
Having identified a vast realm of ignorance\, Wolfram is saying that much of this realm lies forever outside the light cone of human knowledge. -- Michael Swaine\, Dr Dobbs Journal\, Sept 2002
On Jul 8\, 2005\, at 4:26 PM\, Michael G Schwern wrote:
On Fri\, Jul 08\, 2005 at 11:07:22PM +0200\, demerphq wrote:
should be testing here\, but i beleive the problem you are thinking of is due to symlinks to a directory? If so then the win32 equivelent would be a junction I think and in that case yes\, foo\..\bar == bar.
I have the creeping feeling that there's an argument to be made here\, but I never fully understood the symlinks vs .. cannonpath argument so I'll hope someone else jumps in and makes it.
It's been in the File::Spec docs for a couple versions now\, because it's widely misunderstood:
http://search.cpan.org/~kwilliams/PathTools-3.09/lib/File/Spec/Unix.pm
-Ken
On Fri\, Jul 08\, 2005 at 11:49:59PM +0200\, demerphq wrote:
canonpath() should never be inserting the CWD when cleaning up. "./bar" is the same as "bar" but "$CWD/bar" is not! Its important that cannonical relative paths remain relative.
Sorry\, i guess I didnt express myself properly. You cant clean up a relative path properly without knowing where it is relative to. Consider the following path:
..\..\foo
If we are in \bar then ..\..\foo is the same as ..\foo and \foo but if we are in \bar\baz\bat then its not the same as either as it maps to \bar\foo.
Reason #982983 to leave .. alone.
Your proposed "cleanup" is actually changing the meaning of the path. ..\..\foo means "go up two levels and then down into foo". ..\foo means "go up one level and then down into foo". They represent different things. canonpath() is about producing filepaths with the same meaning and NOT about determining if they might point to the same file given the current CWD. For that you can just run it through rel2abs().
Would would be nice is some sort of collapse() method which does collapse .. so that canonpath("foo/../bar") == 'foo/../bar' but collapse(canonpath("foo/../bar")) == 'bar'.
I like this solution. canonpath() can remain strict and work the same
across platforms. The user can decide if they want .. collapsed or not.
We stop having to second guess the user's needs.
-- Michael G Schwern schwern@pobox.com http://www.pobox.com/~schwern You are wicked and wrong to have broken inside and peeked at the implementation and then relied upon it. -- tchrist in \31832\.969261130@​chthon
On Jul 8\, 2005\, at 4:50 PM\, yves orton via RT wrote:
Sorry\, i guess I didnt express myself properly. You cant clean up a relative path properly without knowing where it is relative to. Consider the following path:
..\..\foo
If we are in \bar then ..\..\foo is the same as ..\foo and \foo but if we are in \bar\baz\bat then its not the same as either as it maps to \bar\foo.
The current working directory is not considered in canonpath()\, period. canonpath("../../foo") is "../../foo".
-Ken
Ok\, enough dithering. Let's squash this.
Attached is a patch which fixes this bug so canonpath("a\\..\\..\\b\\c") returns "..\\b\\c". The question of whether or not canonpath() should collapse .. is out of the scope of this bug as it currently contains quite explicit code and comments to do so.
The patch replaces the collapse code in Win32->canonpath() with Unix->_collapse() (I put it in Unix because its platform independent) that does a better job. It also neatly encapsulates the tricky collapsing code from the tricky canonpath code.
On 7/9/05\, Ken Williams \ken@​mathforum\.org wrote:
On Jul 8\, 2005\, at 4:50 PM\, yves orton via RT wrote:
Sorry\, i guess I didnt express myself properly. You cant clean up a relative path properly without knowing where it is relative to. Consider the following path:
..\..\foo
If we are in \bar then ..\..\foo is the same as ..\foo and \foo but if we are in \bar\baz\bat then its not the same as either as it maps to \bar\foo.
The current working directory is not considered in canonpath()\, period. canonpath("../../foo") is "../../foo".
As I said before\, the docs DONT specify what canonpath() is for very well. All of this stuff is inferred or implied. Personally i dont feel bad in coming up with a interpretation of what canonpath() is for that differs from your own\, or Schwerns\, when the docs do not actually explicitly say what it does.
One can infer almost any behaviour from "cleans up the path". And IMO\, whether looking at cwd or not is included in "does not look at the filesystem" is unclear.
Im happy with the behaviour you outline\, and i think i like Schwern's idea of a collapse path or whatever. But id like to see this stuff spelled out in the File::Spec docs properly.
OTOH\, i still think canonpath for absolute paths on Win32 should use GetFullPathName().
yves -- perl -Mre=debug -e "/just|another|perl|hacker/"
On 7/9/05\, Glenn Linderman \perl@​nevcal\.com wrote:
On approximately 7/8/2005 2:07 PM\, came the following characters from the keyboard of demerphq:
On 7/8/05\, Michael G Schwern \schwern@​pobox\.com wrote:
On Fri\, Jul 08\, 2005 at 03:50:49PM +0200\, demerphq wrote:
Im not sure if this is useful\, but many of the things that File::Spec tries to do on win32 are actually supported directly by the Win32 API. IMO at least some of File::Spec's behaviour could take advantage of this API.
Win32::GetFullPathName() is the one i have in mind when I say this.
perl -e "use Win32; print Win32::GetFullPathName(qq[foo\\..\\bar]);"
outputs "CWD\bar".
So if you strip off the CWD from the result of Win32::GetFullPathName() you get the OS'es solution of this problem\, which should bypass all of these issues.
Does it? It still leaves us asking the question: can we assume foo\..\bar == bar on Windows? Just because a system call does it that way doesn't mean its right.
Well\, i suppose you are correct. Im not entirely sure what scenario I should be testing here\, but i beleive the problem you are thinking of is due to symlinks to a directory? If so then the win32 equivelent would be a junction I think and in that case yes\, foo\..\bar == bar.
D:\dev\junct>junction foo
Junction v1.03 - Win2K junction creator and reparse point viewer Copyright (C) 2000-2002 Mark Russinovich Systems Internals - http://www.sysinternals.com
D:\dev\junct\foo: JUNCTION Substitute Name: d:\dev\test
D:\dev\junct>echo Test1 > foo\..\test.echo
D:\dev\junct>type test.echo Test1
D:\dev\junct>cd ..
D:\dev>echo test2 > test\..\test.echo
D:\dev>type test.echo test2
OK\, you have just proven that the file is the same whether accessed by one name or the other. Now that you have the junction set up\, how about reporting on the results of:
d: md \dev\junct\bar cd \dev\junct cd foo\..\bar cd .. cd foo cd .. cd \dev\test cd foo\..\bar cd .. cd foo cd ..
showing all the prompts that display the current path name in the sequence?
im not sure if your script exploited the structure i set up properly or not.
I had d:\dev\junct\foo -> d:\dev\test
so creating a directory in junct doesnt create it in test. Heres the full output with me adding directories to resolve the "cant find directory" errors. I ran the script you requested three times as "junct.bat".
D:\dev>junction junct\foo
Junction v1.03 - Win2K junction creator and reparse point viewer Copyright (C) 2000-2002 Mark Russinovich Systems Internals - http://www.sysinternals.com
D:\dev\junct\foo: JUNCTION Substitute Name: d:\dev\test
D:\dev>junct.bat
D:\dev>d:
D:\dev>md \dev\junct\bar
D:\dev>cd \dev\junct
D:\dev\junct>cd foo\..\bar
D:\dev\junct\bar>cd ..
D:\dev\junct>cd foo
D:\dev\junct\foo>cd ..
D:\dev\junct>cd \dev\test
D:\dev\test>cd foo\..\bar The system cannot find the path specified.
D:\dev\test>cd ..
D:\dev>cd foo The system cannot find the path specified.
D:\dev>cd ..
D:\>cd dev\test
D:\dev\test>dir Volume in drive D is Data Volume Serial Number is C836-51DB
Directory of D:\dev\test
2005-07-08 16:24 \
D:\dev\test>md foo
D:\dev\test>cd ..
D:\dev>junct.bat
D:\dev>d:
D:\dev>md \dev\junct\bar A subdirectory or file \dev\junct\bar already exists.
D:\dev>cd \dev\junct
D:\dev\junct>cd foo\..\bar
D:\dev\junct\bar>cd ..
D:\dev\junct>cd foo
D:\dev\junct\foo>cd ..
D:\dev\junct>cd \dev\test
D:\dev\test>cd foo\..\bar The system cannot find the path specified.
D:\dev\test>cd ..
D:\dev>cd foo The system cannot find the path specified.
D:\dev>cd ..
D:\>cd dev\test
D:\dev\test>dir Volume in drive D is Data Volume Serial Number is C836-51DB
Directory of D:\dev\test
2005-07-09 08:35 \
D:\dev\test>cd foo
D:\dev\test\foo>cd ..
D:\dev\test>md bar
D:\dev\test>cd ..
D:\dev>junct.bat
D:\dev>d:
D:\dev>md \dev\junct\bar A subdirectory or file \dev\junct\bar already exists.
D:\dev>cd \dev\junct
D:\dev\junct>cd foo\..\bar
D:\dev\junct\bar>cd ..
D:\dev\junct>cd foo
D:\dev\junct\foo>cd ..
D:\dev\junct>cd \dev\test
D:\dev\test>cd foo\..\bar
D:\dev\test\bar>cd ..
D:\dev\test>cd foo
D:\dev\test\foo>cd ..
D:\dev\test>
-- perl -Mre=debug -e "/just|another|perl|hacker/"
On approximately 7/8/2005 11:40 PM\, came the following characters from the keyboard of demerphq:
On 7/9/05\, Glenn Linderman \perl@​nevcal\.com wrote:
On approximately 7/8/2005 2:07 PM\, came the following characters from the keyboard of demerphq:
On 7/8/05\, Michael G Schwern \schwern@​pobox\.com wrote:
On Fri\, Jul 08\, 2005 at 03:50:49PM +0200\, demerphq wrote:
Im not sure if this is useful\, but many of the things that File::Spec tries to do on win32 are actually supported directly by the Win32 API. IMO at least some of File::Spec's behaviour could take advantage of this API.
Win32::GetFullPathName() is the one i have in mind when I say this.
perl -e "use Win32; print Win32::GetFullPathName(qq[foo\\..\\bar]);"
outputs "CWD\bar".
So if you strip off the CWD from the result of Win32::GetFullPathName() you get the OS'es solution of this problem\, which should bypass all of these issues.
Does it? It still leaves us asking the question: can we assume foo\..\bar == bar on Windows? Just because a system call does it that way doesn't mean its right.
Well\, i suppose you are correct. Im not entirely sure what scenario I should be testing here\, but i beleive the problem you are thinking of is due to symlinks to a directory? If so then the win32 equivelent would be a junction I think and in that case yes\, foo\..\bar == bar.
D:\dev\junct>junction foo
Junction v1.03 - Win2K junction creator and reparse point viewer Copyright (C) 2000-2002 Mark Russinovich Systems Internals - http://www.sysinternals.com
D:\dev\junct\foo: JUNCTION Substitute Name: d:\dev\test
D:\dev\junct>echo Test1 > foo\..\test.echo
D:\dev\junct>type test.echo Test1
D:\dev\junct>cd ..
D:\dev>echo test2 > test\..\test.echo
D:\dev>type test.echo test2
OK\, you have just proven that the file is the same whether accessed by one name or the other. Now that you have the junction set up\, how about reporting on the results of:
d: md \dev\junct\bar cd \dev\junct cd foo\..\bar cd .. cd foo cd .. cd \dev\test cd foo\..\bar cd .. cd foo cd ..
showing all the prompts that display the current path name in the sequence?
im not sure if your script exploited the structure i set up properly or not.
I had d:\dev\junct\foo -> d:\dev\test
Oops. No it probably doesn't. You used foo differently than in my mental image\, and then I confused myself by having two mental images of foo.
Although I think that in spite of my confusion\, and the resulting failure of some of the commands (which you tried to fix\, thanks)\, that the information I desired to obtain is demonstrated by the fragment of your testing that I quote below...
so creating a directory in junct doesnt create it in test. Heres the full output with me adding directories to resolve the "cant find directory" errors. I ran the script you requested three times as "junct.bat".
D:\dev>junction junct\foo
Junction v1.03 - Win2K junction creator and reparse point viewer Copyright (C) 2000-2002 Mark Russinovich Systems Internals - http://www.sysinternals.com
D:\dev\junct\foo: JUNCTION Substitute Name: d:\dev\test
D:\dev>junct.bat
D:\dev>d:
D:\dev>md \dev\junct\bar
D:\dev>cd \dev\junct
D:\dev\junct>cd foo\..\bar
D:\dev\junct\bar>cd ..
D:\dev\junct>cd foo
D:\dev\junct\foo>cd ..
D:\dev\junct>cd \dev\test
At this point the prompt being D:\dev\junct indicates that the previous "cd .." doesn't take one over to D:\dev\test. But\, it appears that CMD's CD command is likely looking at D:\dev\junct\foo and lexically removing one path to return to .. rather than following the file system .. entry to get there...
I guess there would be more interesting tests\, perhaps a hybrid of what you did\, and what I was trying to do...
d: cd \dev echo a1 > junct\test.txt echo b2 > test.txt rem verify they are different files type junct\test.txt type test.txt rem see which one we get type junct\foo\..\test.txt
Having identified a vast realm of ignorance\, Wolfram is saying that much of this realm lies forever outside the light cone of human knowledge. -- Michael Swaine\, Dr Dobbs Journal\, Sept 2002
On Sat\, Jul 09\, 2005 at 08:14:47AM +0200\, demerphq wrote:
The current working directory is not considered in canonpath()\, period. canonpath("../../foo") is "../../foo".
As I said before\, the docs DONT specify what canonpath() is for very well. All of this stuff is inferred or implied. Personally i dont feel bad in coming up with a interpretation of what canonpath() is for that differs from your own\, or Schwerns\, when the docs do not actually explicitly say what it does.
One can infer almost any behaviour from "cleans up the path". And IMO\, whether looking at cwd or not is included in "does not look at the filesystem" is unclear.
Yes\, the docs stink. Patches welcome.
OTOH\, i still think canonpath for absolute paths on Win32 should use GetFullPathName().
A) Why change what works? B) The docs for GetFullPathName() read more like rel2abs() than canonpath().
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/fileio/fs/getfullpathname.asp GetFullPathName merges the name of the current drive and directory with a specified file name to determine the full path and file name of a specified file.
C) Its not specified in the GetFullPathName() that it does the sort of canonicalizing we expect. It might *happen* to do it now but who knows later.
D) File::Spec::Win32 will no longer work on non-Windows platforms making cross-platform development and testing just that more frustrating.
-- Michael G Schwern schwern@pobox.com http://www.pobox.com/~schwern Don't try the paranormal until you know what's normal. -- "Lords and Ladies" by Terry Prachett
On Jul 8\, 2005\, at 4:07 PM\, demerphq wrote:
Im not entirely sure what scenario I should be testing here\, but i beleive the problem you are thinking of is due to symlinks to a directory? If so then the win32 equivelent would be a junction I think and in that case yes\, foo\..\bar == bar.
Could you try a scenario like the one outlined in the canonpath() docs here?
http://search.cpan.org/~kwilliams/PathTools-3.09/lib/File/Spec/Unix.pm
-Ken
Ok\, enough dithering. Let's kill this bug.
File::Spec::Win32->canonpath() currently contains code to collapse .. so whether or not it should continue to do so in the future is outside the scope of this bug. That code is also busted and is the source of this bug.
Attached is a patch to fix this bug. It replaces the collapsing code in canonpath() with a more reliable method. It also moves the code into its own method to simplify canonpath(). _collapse() goes into File::Spec::Unix because it is not platform specific. Tests have been added for this bug in both Win32 and Unix.
On 7/12/05\, Michael G Schwern via RT \perlbug\-followup@​perl\.org wrote:
Ok\, enough dithering. Let's kill this bug.
File::Spec::Win32->canonpath() currently contains code to collapse .. so whether or not it should continue to do so in the future is outside the scope of this bug. That code is also busted and is the source of this bug.
Attached is a patch to fix this bug. It replaces the collapsing code in canonpath() with a more reliable method. It also moves the code into its own method to simplify canonpath(). _collapse() goes into File::Spec::Unix because it is not platform specific. Tests have been added for this bug in both Win32 and Unix.
What's the status of this patch ? applied to File::Spec ? rejected ? updated ? (catching up over there...)
On Jul 19\, 2005\, at 3:06 AM\, Rafael Garcia-Suarez wrote:
On 7/12/05\, Michael G Schwern via RT \perlbug\-followup@​perl\.org wrote:
Ok\, enough dithering. Let's kill this bug.
File::Spec::Win32->canonpath() currently contains code to collapse .. so whether or not it should continue to do so in the future is outside the scope of this bug. That code is also busted and is the source of this bug.
Attached is a patch to fix this bug. It replaces the collapsing code in canonpath() with a more reliable method. It also moves the code into its own method to simplify canonpath(). _collapse() goes into File::Spec::Unix because it is not platform specific. Tests have been added for this bug in both Win32 and Unix.
What's the status of this patch ? applied to File::Spec ? rejected ? updated ? (catching up over there...)
Finally applied.
-Ken
p5p@spam.wizbit.be - Status changed from 'open' to 'resolved'
Migrated from rt.perl.org#27052 (status was 'resolved')
Searchable as RT27052$