Open Bodigrim opened 1 year ago
Related: https://github.com/haskell/filepath/issues/92
isValid is a hot mess on Windows.
I'm not sure how much improvement we can drive here with ad-hoc bugfixes.
The underlying problem is that we're not parsing windows filepaths, although there are pieces that allow us to put together a proper grammar:
With that we could implement a more meaningful version of isValid.
> I'm not sure how much improvement we can drive here with ad-hoc bugfixes.
I agree. My bigger concern is that while at least in theory isValid could be made correct, makeValid is fundamentally broken on Windows. It's not like you can meaningfully repair any Windows path at all. Even current behaviour makeValid "test*" == "test_" is a bit of WAAAAT? Maybe mark it as deprecated?..
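To look at the behaviour being discussed, here is a minimal sketch that runs isValid and makeValid from System.FilePath.Windows over a few inputs. The helper names (checkPath, samples, report) and the sample list are assumptions made up for illustration; only the "test*" case is quoted in this thread.

```haskell
import qualified System.FilePath.Windows as W

-- Hypothetical helper (not part of filepath): pair a path with the
-- verdict of isValid and with whatever makeValid turns it into.
checkPath :: FilePath -> (FilePath, Bool, FilePath)
checkPath p = (p, W.isValid p, W.makeValid p)

-- Hypothetical sample inputs; "test*" is the case quoted in the thread.
samples :: [FilePath]
samples = ["test*", "C:\\foo\\bar", "\\\\?\\UNC\\"]

-- Print one (input, isValid, makeValid) triple per line.
report :: IO ()
report = mapM_ (print . checkPath) samples
```

Loading this in GHCi and running report shows each verdict next to the "repaired" path, which makes it easier to judge whether a given repair is meaningful.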
Ok, so things are a little more complicated on Windows wrt "\\\\?\\UNC\\".
These are not statically assigned special names afaiu. Instead those are some form of object symlinks that are maintained inside of windows (and can be viewed in the WinObj browser tool). Also see: https://learn.microsoft.com/en-us/windows/win32/fileio/naming-a-file#nt-namespaces
There are many more, e.g. look at:
\\?\UNC\localhost\c$\foo\bar -> \\localhost\c$\foo\bar
\\?\GLOBALROOT\GLOBAL??\UNC\localhost\c$\foo\bar -> \\localhost\c$\foo\bar
\\?\HarddiskVolume2\foo\bar -> C:\foo\bar (if HarddiskVolume2 is C:)
\\?\GLOBALROOT\GLOBAL??\HarddiskVolume2\foo\bar -> C:\foo\bar (if HarddiskVolume2 is C:)
\\?\GLOBALROOT\Device\Harddisk0\Partition2\foo\bar -> C:\foo\bar (if Harddisk0\Partition2 is C:)
(all the above are somewhat equal)
The fact that filepath as a library treats \\\\?\\UNC\\ special is in my opinion more of a wart than a feature. I don't consider \\\\?\\UNC\\ a special case in my grammar. The meaning of those object links can only fully be understood when performing IO. Some of them may be somewhat conventional, but still...
Maybe @Mistuke has another opinion.
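A rough sketch of what "not a special case in the grammar" could look like: only the \\?\ and \\.\ prefixes are recognised, and whatever follows (UNC, GLOBALROOT, HarddiskVolume2, ...) is kept as plain components whose meaning is left to IO. All type and function names here are hypothetical and not part of filepath.

```haskell
import Data.List (stripPrefix)

-- Hypothetical types for illustration only.
data Namespace
  = Win32File    -- the \\?\ prefix
  | Win32Device  -- the \\.\ prefix
  deriving (Eq, Show)

data WinPath
  = NamespacedPath Namespace [String]  -- components after the prefix
  | OrdinaryPath String                -- anything else, left unparsed here
  deriving (Eq, Show)

-- Split on backslashes only; a real grammar would need more care.
splitOnBackslash :: String -> [String]
splitOnBackslash s = case break (== '\\') s of
  (c, [])       -> [c]
  (c, _ : rest) -> c : splitOnBackslash rest

-- "UNC" gets no dedicated constructor: it is just the first component.
parseWinPath :: String -> WinPath
parseWinPath s
  | Just rest <- stripPrefix "\\\\?\\" s =
      NamespacedPath Win32File (splitOnBackslash rest)
  | Just rest <- stripPrefix "\\\\.\\" s =
      NamespacedPath Win32Device (splitOnBackslash rest)
  | otherwise = OrdinaryPath s
```

With this shape, \\?\UNC\localhost\c$\foo\bar and \\?\GLOBALROOT\GLOBAL??\UNC\localhost\c$\foo\bar parse the same way, differing only in their component lists.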
AFAIU from https://learn.microsoft.com/en-us/dotnet/standard/io/file-path-formats#dos-device-paths, \\?\UNC\ is a special case. Namely, Windows filenames can be:
a traditional DOS path, such as C:\foo\bar
a UNC path, such as \\server\share\file
a device path: \\.\ followed by a resource name
Now there is a bit of confusion. If you want to format a traditional DOS path as a device path, you can just prepend \\.\ to C:\foo\bar, obtaining \\.\C:\foo\bar. The same does not apply to UNC paths to shared drives, because you end up with \\.\\server\share\file and device paths are not supposed to contain \\ anywhere except the beginning. To overcome this restriction Windows introduces a workaround: instead of \\.\\server\share\file you are supposed to write \\.\UNC\server\share\file. So this is a special syntax.
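The rewriting rule described above can be sketched as follows. toDevicePath is a hypothetical helper, not a filepath function, and it assumes its input is either an absolute DOS path or a UNC path; relative paths are not handled.

```haskell
import Data.List (stripPrefix)

-- Hypothetical helper: rewrite a DOS or UNC path as a \\.\ device path.
-- Assumes an absolute DOS path or a UNC path as input.
toDevicePath :: String -> String
toDevicePath p
  -- Already a device path: leave it alone.
  | Just _ <- stripPrefix "\\\\.\\" p = p
  | Just _ <- stripPrefix "\\\\?\\" p = p
  -- UNC path: drop the leading \\ and go through the UNC name instead,
  -- so that no \\ appears after the prefix.
  | Just rest <- stripPrefix "\\\\" p = "\\\\.\\UNC\\" ++ rest
  -- Traditional DOS path: just prepend the prefix.
  | otherwise = "\\\\.\\" ++ p
```

For example, toDevicePath "\\\\server\\share\\file" yields the Haskell string for \\.\UNC\server\share\file, while a DOS path such as C:\foo\bar only gains the \\.\ prefix.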
> So this is a special syntax.
It's not syntax, those are simply symbolic links. Again, there's also \\?\GLOBALROOT\GLOBAL??\UNC ...why don't we support that form? We can even do \\?\GLOBALROOT\Device\Mup\localhost\c$\foo\bar.
> The fact that filepath as a library treats \\?\UNC\ special is in my opinion more of a wart than a feature. I don't consider \\?\UNC\ a special case in my grammar. The meaning of those object links can only fully be understood when performing IO. Some of them may be somewhat conventional, but still...
FWIW I agree. Inside GHC's handling we only really treat \\?\ and \\.\ as special.
I think this is wrong: \\?\UNC\ is incomplete, it is neither a file nor a folder name. https://github.com/haskell/filepath/blob/98f8bba9eac8c7183143d290d319be7df76c258b/System/FilePath/Internal.hs#L1065-L1067
If we are in agreement that isValid should return False on this input, there is a harder question ahead. What should be the output of makeValid? Something like \\?\UNC\_\_?
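One possible shape for the stricter check, as a sketch only: treat a \\?\UNC\ path as complete when a non-empty server and a non-empty share follow the prefix. completeUncDevicePath is a hypothetical name, the prefix match is case-sensitive for simplicity, and nothing here reflects how filepath actually implements isValid.

```haskell
import Data.List (stripPrefix)

-- Split on backslashes only (simplification for this sketch).
splitOnBackslash :: String -> [String]
splitOnBackslash s = case break (== '\\') s of
  (c, [])       -> [c]
  (c, _ : rest) -> c : splitOnBackslash rest

-- Hypothetical check: True only for paths that start with \\?\UNC\ and
-- name both a server and a share after it.
completeUncDevicePath :: String -> Bool
completeUncDevicePath p = case stripPrefix "\\\\?\\UNC\\" p of
  Nothing   -> False
  Just rest -> case splitOnBackslash rest of
    (server : share : _) -> not (null server) && not (null share)
    _                    -> False
```

Under this check the bare prefix \\?\UNC\ is rejected, while \\?\UNC\localhost\c$\foo\bar is accepted. It leaves open what makeValid should produce for the rejected input.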