Open p5pRT opened 19 years ago
If the last byte of a directory name is <0x5c>, ActivePerl cannot determine whether the directory exists when using "-d" (see the following sample code, "test.pl").
#!perl
if (-d $ARGV[0]) {
    print "$ARGV[0] is directory.\n";
} else {
    print "$ARGV[0] is not directory.\n";
}
For example, suppose there is a directory whose name consists of multi-byte characters:
C:\<0x94><0x5c>
In Japanese Windows environments, these two bytes (<0x94> and <0x5c>) compose a Japanese character.
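A minimal sketch of why this byte sequence is troublesome, assuming the path is held as the raw bytes given in the example (the variable names are illustrative only): the second byte of the character has the same value as an ASCII backslash.

```perl
#!perl
use strict;
use warnings;

# Raw bytes of the path from the example: "C:\" followed by <0x94><0x5c>.
my $path = "C:\\" . "\x94\x5c";

# Compare the value of the last byte with an ASCII backslash.
my $last_byte = ord(substr($path, -1));
my $ends_in_backslash_byte = ($last_byte == ord('\\')) ? 1 : 0;

printf "last byte: 0x%02x, same as backslash: %s\n",
    $last_byte, $ends_in_backslash_byte ? "yes" : "no";
```

Any code that inspects the path one byte at a time cannot tell this trailing byte apart from a real directory separator.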
When I ran the sample code "test.pl" (see above):
C:\> test.pl C:\<0x94><0x5c>
C:\<0x94><0x5c> is not directory
it reports "is not directory" even though the directory C:\<0x94><0x5c> exists on the system.
However, when I appended one or more backslashes as follows:
C:\> test.pl C:\<0x94><0x5c>\
C:\<0x94><0x5c>\ is directory
C:\> test.pl C:\<0x94><0x5c>\\
C:\<0x94><0x5c>\\ is directory
C:\> test.pl C:\<0x94><0x5c>\\\
C:\<0x94><0x5c>\\\ is directory
the sample program finds the C:\<0x94><0x5c> directory and reports "is directory".
It seems that one or more <0x5c> bytes are truncated, which causes a fatal problem for directories with Japanese names, where <0x5c> may be the second byte of certain characters.
I previously submitted this problem to the ActiveState bug tracker and received the following comment:
Perl doesn't internally use the Unicode API on Windows, so the low level IO routines assume all filenames are using single byte characters only. The 0x5c character is a backslash '\' and Perl assumes that forward and backward slashes are semantically equivalent.
To work around a bug in the stat() function of the C runtime library Perl modifies a trailing backslash to a forward slash (in the win32_stat() function in win32.c).
I doubt this problem can be fixed until Perl internals move to Unicode even for the low level file access layer, and I don't see any activity in this area by the core Perl5 Porters.
Fortunately you have discovered a workaround: You can always append an additional backslash to the filename if you are using the -d test. You could use a forward slash as well because Perl will translate it internally anyways.
BTW, this problem affects all filetest operators, not just -d. So while you have a workaround for -d, I see that it will be impossible to apply other file operators like the file size -s to a filename ending with a 0x5c byte. :(
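The effect described in this comment can be illustrated with a small byte-level simulation. This is only a sketch: `simulate_trailing_rewrite` and `show_bytes` are hypothetical helpers, not the code in win32.c, and the sketch assumes the rewrite simply turns one trailing 0x5c byte into a forward slash with no knowledge of multi-byte characters.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the rewrite described above: if the last byte
# is 0x5c, replace it with '/'. It works on bytes and has no notion of
# multi-byte characters.
sub simulate_trailing_rewrite {
    my ($path) = @_;
    $path =~ s/\x5c\z/\//;
    return $path;
}

# Render bytes outside printable ASCII as <0xNN>. Note that 0x5c itself
# is printable, so it is shown as a plain backslash.
sub show_bytes {
    my ($bytes) = @_;
    $bytes =~ s/([^\x20-\x7e])/sprintf("<0x%02x>", ord($1))/ge;
    return $bytes;
}

my $dir = "C:\\\x94\x5c";    # C:\<0x94><0x5c>

# Compare the bare name with the name plus one extra backslash.
for my $arg ($dir, $dir . "\\") {
    printf "%s -> %s\n", show_bytes($arg),
        show_bytes(simulate_trailing_rewrite($arg));
}
```

Under these assumptions, the bare name loses the second byte of its last character, so it no longer names the directory. With one extra backslash appended, only the added byte is rewritten and the character stays intact.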
It seems that "an additional backslash" is needed because a trailing backslash is removed: "dir\subdir\" appears to be converted to "dir\subdir" in the library, and this causes the problem. On the other hand, truncating two or more backslashes seems to work well: "dir\subdir\\" and "dir\subdir\\\" are treated as "dir\subdir\".
I think "dir\subdir\" should remain "dir\subdir\" instead of being changed to "dir\subdir".
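The workaround mentioned in the ActiveState comment could be wrapped up roughly as follows. This is a sketch under assumptions: `dir_test_path` is a hypothetical helper, it is only meaningful for the -d test on Windows, and per the quoted comment there is no equivalent trick for other filetest operators such as -s.

```perl
use strict;
use warnings;

# Hypothetical helper: build the argument for -d by appending one extra
# backslash, as suggested in the quoted comment.
sub dir_test_path {
    my ($path) = @_;
    return $path . "\\";
}

# Fall back to the current directory when no argument is given.
my $candidate = defined $ARGV[0] ? $ARGV[0] : ".";
if (-d dir_test_path($candidate)) {
    print "$candidate is directory.\n";
} else {
    print "$candidate is not directory.\n";
}
```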
Can anyone running ActivePerl, Strawberry Perl, etc., confirm that this is still a problem?
Would it be a problem on non-Windows Perls?
Thank you very much.
-- James E Keenan (jkeenan@cpan.org)
The RT System itself - Status changed from 'new' to 'open'
James E Keenan via RT wrote:
Would it be a problem on non-Windows Perls?
It can't happen anywhere that the directory separator manifests as a distinctive code unit that can't otherwise appear in a pathname, as is the case on Unix.
-zefram
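One way to see this point for the common case of UTF-8 filenames (an assumption; the reply above does not name an encoding) is to check that no multi-byte UTF-8 sequence ever contains the byte for '/' or '\'. The sketch below uses the core Encode module and scans code points up to U+D7FF only.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Count non-ASCII code points whose UTF-8 encoding contains the byte for
# '/' (0x2f) or '\' (0x5c). UTF-8 encodes such code points using only
# bytes >= 0x80, so the count is expected to stay at zero.
my $collisions = 0;
for my $cp (0x80 .. 0xD7FF) {
    my $bytes = encode('UTF-8', chr($cp));
    $collisions++ if $bytes =~ /[\x2f\x5c]/;
}
print "collisions: $collisions\n";
```

This contrasts with the Japanese Windows case above, where <0x5c> is a legal second byte of a double-byte character.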
Migrated from rt.perl.org#32394 (status was 'open')
Searchable as RT32394$