rust-lang / cargo

The Rust package manager
https://doc.rust-lang.org/cargo
Apache License 2.0

Windows device path and long-path meta issue. #9770

Open ehuss opened 3 years ago

ehuss commented 3 years ago

This is a meta issue to coordinate the different issues related to handling device paths and long paths on Windows (such as \\?\ or \\.\). There are several places where Cargo does not handle these well, but it is not clear exactly how they should all be approached. Changes for these require careful consideration, and it's not clear what a good general approach would look like. Some rough thoughts to consider:

Linking issues and PRs:

ChrisDenton commented 3 years ago

Since my PR was linked here, I would add that I'd really love to fix these issues in the standard library so everyone can benefit by default. My thinking at the moment is that Rust should auto-convert to \\?\ style paths whenever a filesystem function is called. Also, Display should always use the more natural C:\ style paths (where possible) for showing paths to users.

GetFullPathNameW will be very useful here but it'd also be good to be able to work directly with WTF-8 paths so there's not a need to convert to UTF-16 and back when that's not necessary.

Btw, in case it helps someone, I've started writing about Windows paths, in a document that attempts to go into some detail. It's still a work in progress, so sorry if there are any mistakes or anything is unclear.

ehuss commented 3 years ago

Thanks for posting your writeup! I think it would be great to have a resource like that. Microsoft's own documentation is a little scattered and lacking, and having one place clearly describing things would be great. Let me know if you ever want feedback on it.

ChrisDenton commented 3 years ago

Sure, I'd very much welcome feedback! I admit I mostly wrote it for myself, which is why it's currently a "secret" gist, so I'd appreciate any help in making it more useful for others.

dylni commented 3 years ago

@ehuss Is this just waiting on a decision from the Cargo team? I originally created normpath to fix these issues, but the team appears to be less certain now about how these issues should be addressed.

ghost commented 3 years ago

@ChrisDenton Instead of fiddling around with NT paths and unnecessarily making life harder, why not use Embedded Manifests per executable? Embedded Manifests were designed for a reason.

ChrisDenton commented 3 years ago

@mshaikhcool Enabling the manifest option for long paths would be great! However, it has several limitations which means it doesn't solve all problems. It only works in Windows 10 version 1607 and newer. It requires the user to have admin rights and to change a registry entry. It doesn't fix the issue with broken drivers if you actually do want to resolve symlinks or get an absolute path.

ghost commented 3 years ago

It requires the user to have admin rights and to change a registry entry.

The person who will be installing Rust in the first place will most likely be A Programmer, so this point is moot.
There's a good reason why longPathAware is not enabled by default and why explorer.exe doesn't embed longPathAware in its manifest. Don't expect it to be enabled by default in the near future.

It only works in Windows 10 version 1607 and newer.

Ah, classic. Symlinks were introduced in Vista. By that logic, we should also not use symlinks, because XP didn't support them.

The point here is: why are you even bothering to support an OS that is out of extended support (like Windows 7) in the first place? Add a check for the Windows version and for whether the registry key is enabled. If it is off, e.g. on Windows 7, long paths should simply not work.

It doesn't fix the issue with broken drivers if you actually do want to resolve symlinks or get an absolute path.

NT UNC(\\?\) is designed to be used by Subsystems themselves and Drivers, not user mode programs. longPathAware in both Embedded and Side by Side type Manifests are designed to be used by win32 subsystem's user mode programs, not Drivers.

Both have different purposes, and Rust should implement both as a systems programming language. Rust should not call undocumented APIs and should adhere to strict, clean programming principles.

ChrisDenton commented 3 years ago

As I said, using the longPathAware manifest option is great. It helps with a lot of problems and should be done when possible.

However, it alone does not fix all the issues here nor in all circumstances. So other solutions need to be explored as well. Nobody is talking about using undocumented APIs. The use of \\?\ style paths is documented for every function that accepts them (e.g. CreateFileW).

ghost commented 3 years ago

I think there's a fundamental misunderstanding here. Use NT UNC (\\?\) for drivers, and longPathAware for the Win32 subsystem's user mode programs.

Cargo and rustc are themselves user mode programs, so they should implement longPathAware for themselves in either an embedded or a side-by-side manifest, and should support NT UNC (\\?\) for building drivers.

I hope it's clear now.

ChrisDenton commented 3 years ago

I would suggest reading Win32 File Namespaces because it feels like we're talking about different things.

ghost commented 3 years ago

Can you point me to the exact Win32 APIs/scenarios where a manifest file doesn't work but other approaches do?

ChrisDenton commented 3 years ago

The manifest option is not sufficient to solve all issues listed here. For example:

ghost commented 3 years ago

It does not enable long paths if the user cannot or will not enable long path awareness in the registry (e.g. OS too old, IT policies, etc).

This point is already moot because of Rust's potential use cases. Not sure why it's being thrown around each time. Note that Windows will always be a backward-compatible OS by default. Users must make changes themselves to make it forward-compatible.

To use long paths, users must use Windows 10 and must enable them via group policy or the registry. Note: this is the Microsoft-recommended way. Going against MS's own recommendation does indeed sound like a desperate excuse to say "Won't Do".

Even with the manifest option, fs::canonicalize returns \\?\ prefixed paths

This is an absolutely horrible implementation. The amount of existing software that breaks because of UNC, including Microsoft's own, is a big enough reason to abandon UNC and "the excuse" presented above.

There are already proposals for abandoning that in favor of returning Win32 absolute paths.

It doesn't fix the issue with broken drivers if you actually do want to resolve symlinks or get an absolute path.

Ah, now I realize where the confusion is. The claim that the manifest doesn't fix this issue is based on the horrible UNC return implementation above, which is itself wrong to begin with.

Ditto for user supplied paths which can be in any form

The manifest just removes the hard-coded static buffer size from the *W functions. The rest of the behavior is unchanged.

(even explorer accepts \\?\ style paths).

Explorer doesn't accept \\?\ paths, it Simply ignores supplied \\?\ prefix.
explorer simply converts this \\?\C:\VeryLong255CharPath\VeryLong255Foo\ to 8.3 Path format c:\VERYLO~1\VERYLO~2 for it to access.

fs::canonicalize can also completely fail with certain RAM drives.

I guess you meant RAM disk by that. Creating a RAM disk requires a KMDF device driver. Device drivers use Win32 device paths (\\.\) and then make a symlink to a Win32 file path (\\?\) for user mode applications to access. Manifest files of both types, embedded/Fusion and side-by-side (Foo.exe.manifest file), should work as expected for RAM disks too.

"Certain RAM disk" sounds like a bug in the KMDF driver of the software in question. We shouldn't have to cripple Rust for that.

The manifest is no help here because that's a different problem.

Manifest files have to do with Win32 file paths (\\?\) and have nothing to do with Win32 device paths (\\.\). Most Windows APIs don't take \\.\ device paths as parameters, as they can already access devices through symlinked \\?\ paths.

Now, with manifest support, one doesn't even need to attach the \\?\ prefix or deal with UNC path handling complexities anymore.

jessesna commented 2 years ago

NT UNC(\\?\) is designed to be used by Subsystems themselves and Drivers, not user mode programs.

Hi. Can someone point me to the origin of this statement? Are there any MS docs which can be linked?

Explorer doesn't accept \\?\ paths, it Simply ignores supplied \\?\ prefix. explorer simply converts this \\?\C:\VeryLong255CharPath\VeryLong255Foo\ to 8.3 Path format c:\VERYLO~1\VERYLO~2 for it to access.

Maybe I'm doing it wrong, or has this changed in newer Windows versions?

[image]

ChrisDenton commented 2 years ago

Here's a brief guide to Windows paths, some of the issues involved and what the standard library has done and is doing to address them. None of this is Cargo-specific but I hope it helps nonetheless. I'll try to keep this short but I fear I might fail.

Terminology cheat sheet

| Path | Term |
| --- | --- |
| C:\path\to\file | Drive path |
| \\server\share | UNC path |
| \\.\PIPE\name | Device path (used for pipes, printers, etc) |
| \\?\C:\path\to\file | Verbatim path |
| \\?\UNC\server\share | Verbatim path |
| \\?\PIPE\name | Verbatim path |
| \??\C:\path\to\file | NT kernel path (not used in Win32 APIs) |
| \Device\HarddiskVolume2\path\to\file | NT kernel path (not used in Win32 APIs) |

NT kernel paths are what both verbatim and the non-verbatim paths end up as but aren't otherwise usable in most user space APIs. So when I say "non-verbatim paths" I mean the first three paths in the table and not including kernel paths.

Verbatim paths

Verbatim paths are passed almost directly to the kernel (except \\?\ is changed to \??\). These are always absolute and don't contain . or .. components because those will simply be treated as normal components (e.g. . is a perfectly valid file or directory name according to the kernel, though most filesystem drivers will probably reject it). Also, / is not a path separator; in fact, nothing except \ is special in any way.
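To make the rules above concrete, here is a minimal string-level sketch. The helper names (`is_verbatim`, `to_nt_kernel_path`, `verbatim_components`) are hypothetical and are not part of the standard library or of Windows. They only imitate the behaviour described above on plain strings, so the example runs on any platform.

```rust
/// Prefix that marks a Win32 verbatim path.
const VERBATIM_PREFIX: &str = r"\\?\";
/// Prefix the kernel sees in its place.
const NT_PREFIX: &str = r"\??\";

/// Returns true if `path` starts with the verbatim prefix.
fn is_verbatim(path: &str) -> bool {
    path.starts_with(VERBATIM_PREFIX)
}

/// Hypothetical helper: imitates the single rewrite applied to a verbatim
/// path by swapping the prefix. Everything after the prefix is untouched.
fn to_nt_kernel_path(path: &str) -> Option<String> {
    path.strip_prefix(VERBATIM_PREFIX)
        .map(|rest| format!("{}{}", NT_PREFIX, rest))
}

/// Hypothetical helper: splits the part after the prefix on `\` only.
/// `.`, `..` and `/` get no special treatment, mirroring the text above.
fn verbatim_components(path: &str) -> Option<Vec<&str>> {
    path.strip_prefix(VERBATIM_PREFIX)
        .map(|rest| rest.split('\\').collect())
}

fn main() {
    let path = r"\\?\C:\path\..\to/file";
    assert!(is_verbatim(path));
    println!("{}", to_nt_kernel_path(path).unwrap());
    // `..` stays a literal component and `to/file` is one component.
    println!("{:?}", verbatim_components(path).unwrap());
}
```

Note that `..` survives as an ordinary component and `to/file` is not split, which is exactly why a verbatim path has to be fully normalized before the prefix is added.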

The term "verbatim" is not official terminology but it's the one used by the Rust standard library for lack of an official name.

Non-verbatim paths

Unlike a verbatim path, other paths are subject to limits (such as MAX_PATH, unless a manifest is used) and are parsed in more complex ways. Parsing of non-verbatim paths includes (but is not limited to):

Path Issues

Filesystem issue

std::fs::canonicalize can fail if the root drive's driver does not implement a necessary kernel interface. This is normally not an issue but there is at least one popular RAM drive software that uses such a broken driver and is a reliable source of bug reports (not just for Rust applications).
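One way an application might cope with this failure is to fall back to a plain absolute path when canonicalization fails. The sketch below is an assumption about what a caller could do, not something the standard library or Cargo is stated to do here; `canonicalize_or_absolute` is a hypothetical name. The fallback does not resolve symlinks or `..` components, and it also hides ordinary errors such as a missing file.

```rust
use std::env;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: try `fs::canonicalize` first and, if it fails for
/// any reason, return an absolute but unresolved path instead.
fn canonicalize_or_absolute(path: &Path) -> io::Result<PathBuf> {
    match fs::canonicalize(path) {
        Ok(resolved) => Ok(resolved),
        // Already absolute: hand it back unchanged.
        Err(_) if path.is_absolute() => Ok(path.to_path_buf()),
        // Relative: anchor it at the current directory.
        Err(_) => Ok(env::current_dir()?.join(path)),
    }
}

fn main() -> io::Result<()> {
    // This file is assumed not to exist, so the fallback branch runs.
    let path = canonicalize_or_absolute(Path::new("no-such-file-for-this-sketch"))?;
    println!("{}", path.display());
    Ok(())
}
```

A real implementation would probably inspect the error kind and only fall back for the driver-related failure, since swallowing every error changes the meaning of "not found".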

Rust standard library

Rust's standard library is addressing these issues in a number of ways:

Outstanding issues

The standard library does not provide a public API to convert between verbatim and non-verbatim paths. Currently the best option would be to use a third party crate for this.
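As a rough illustration of what such a conversion involves, here is a minimal sketch of going from a verbatim path to its non-verbatim spelling. `strip_verbatim` and `is_plain` are hypothetical names, the function works on plain strings, and it is deliberately conservative: it refuses anything whose meaning might change once normal Win32 parsing applies. It does not check for reserved device names or trailing dots and spaces, which a real implementation would need to consider.

```rust
/// MAX_PATH in UTF-16 code units, including the terminating NUL.
const MAX_PATH: usize = 260;

/// True if no component would be reinterpreted by non-verbatim parsing
/// (only `.`, `..` and `/` are checked in this sketch).
fn is_plain(rest: &str) -> bool {
    rest.split('\\')
        .all(|c| c != "." && c != ".." && !c.contains('/'))
}

/// Hypothetical helper: returns the non-verbatim form of a verbatim drive
/// or UNC path, or `None` if the conversion does not look safe.
fn strip_verbatim(path: &str) -> Option<String> {
    let rest = path.strip_prefix(r"\\?\")?;
    let candidate = if let Some(unc) = rest.strip_prefix(r"UNC\") {
        // Verbatim UNC form becomes the ordinary two-backslash UNC form.
        format!(r"\\{}", unc)
    } else {
        // Otherwise only accept a drive letter, a colon and a backslash.
        let b = rest.as_bytes();
        let is_drive = b.len() >= 3
            && b[0].is_ascii_alphabetic()
            && b[1] == b':'
            && b[2] == b'\\';
        if !is_drive {
            return None;
        }
        rest.to_string()
    };
    // The result must fit in MAX_PATH once a NUL terminator is added.
    let fits = candidate.encode_utf16().count() < MAX_PATH;
    if fits && is_plain(rest) {
        Some(candidate)
    } else {
        None
    }
}

fn main() {
    for p in [r"\\?\C:\path\to\file", r"\\?\UNC\server\share", r"\\?\C:\a\..\b"] {
        println!("{} -> {:?}", p, strip_verbatim(p));
    }
}
```

Returning `None` rather than guessing keeps the verbatim form in every case where the two spellings might not name the same file.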

The current directory is always limited by MAX_PATH unless a manifest file is used and the user opts in to enabling long path support. This cannot be fixed by the standard library itself because verbatim paths do not work for the get/set current directory APIs (or rather, they technically work but other Windows APIs will get very confused by it).

bjorn3 commented 2 years ago

It does not enable long paths if the user cannot or will not enable long path awareness in the registry (e.g. OS too old, IT policies, etc).

This point is already moot because of Rust's potential use cases. Not sure why it's being thrown around each time. Note that Windows will always be a backward-compatible OS by default. Users must make changes themselves to make it forward-compatible.

How is enabling long path awareness when the application manifest enables it but the registry doesn't not backwards-compatible? If the application itself opts in, why is there an additional system-wide opt-in necessary for backwards compatibility?

ehuss commented 2 years ago

If the application itself opts in, why is there an additional system-wide opt-in necessary for backwards compatibility?

I think we can only guess; I haven't seen any explanation from Microsoft. I suspect it is because other programs may fail to access those paths. For example, I believe when it was first added, Explorer couldn't handle those long paths. It introduces an environment where various programs would suddenly start breaking in unpleasant ways when interacting with programs that are long-path aware.

It could also be a security issue similar to how symbolic links are restricted.