rust-lang / cargo

The Rust package manager
https://doc.rust-lang.org/cargo
Apache License 2.0
12.86k stars 2.43k forks source link

Windows `\\?\` verbatim paths break idiomatic use of OUT_DIR and `include!` #13919

Open dtolnay opened 6 months ago

dtolnay commented 6 months ago

Problem

When Cargo is invoked on Windows in either of the following 2 scenarios, and possibly others:

then the following widespread idiom for including build.rs-generated code does not work.

include!(concat!(env!("OUT_DIR"), "/repro.rs"));

Because / is not a directory separator in verbatim paths, this fails to compile with an error like this:

error: couldn't read \\?\D:\a\repro\repro\target\debug\build\repro-59592f29f689633e\out/repro.rs: The filename, directory name, or volume label syntax is incorrect. (os error 123)
 --> src\lib.rs:3:1
  |
3 | include!(concat!(env!("OUT_DIR"), "/repro.rs"));
  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |
  = note: this error originates in the macro `include` (in Nightly builds, run with -Z macro-backtrace for more info)

Normally in Windows paths / works just as well as \, but from Maximum Path Length Limitation: File I/O functions in the Windows API convert "/" to "\" as part of converting the name to an NT-style name, except when using the "\\?\" prefix as detailed in the following sections.

Steps

The problem can be reproduced in GitHub Actions with the following source code.

# Cargo.toml

[package]
name = "repro"
version = "0.0.0"
// build.rs

fn main() {
    let out_dir = std::env::var_os("OUT_DIR").unwrap();
    let repro = std::path::Path::new(&out_dir).join("repro.rs");
    std::fs::write(repro, "").unwrap();
}
// src/lib.rs

include!(concat!(env!("OUT_DIR"), "/repro.rs"));
# .github/workflows/ci.yml

name: CI
on: [push]
jobs:
  repro:
    runs-on: windows-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@nightly

      - run: pwd
      - run: echo ${{github.workspace}}
      - run: echo \\?\${{github.workspace}}

      - run: cargo check --verbose && cargo clean

      - run: cargo check --manifest-path \\?\${{github.workspace}}\Cargo.toml --verbose
        if: always()

      - run: cargo check --verbose
        working-directory: \\?\${{github.workspace}}
        if: always()

The first cargo check will succeed. The second two will each fail as shown above.

Possible Solution(s)

I hope that someone can make a successful proposal for something like:

include!(concat!(env!("OUT_DIR") / "repro.rs"));

which would work in a cross-platform manner by using the correct path separator for rustc's host OS.

Until then, https://github.com/rust-lang/cargo/pull/13914#issuecomment-2111375012 proposes a possible way that Cargo could work around this issue some of the time.

Internally to Cargo, we exclusively work with regular paths and not verbatim paths. It sounds like the problem is that we accept paths from users in a couple of places which may be verbatim paths.

Should we instead run dunce on those paths so they are no longer verbatim? This would solve it for Cargo and all other cases.

(https://docs.rs/dunce/1/dunce/fn.simplified.html)

This is only a partial fix because not all verbatim paths can be converted to something without \\?\. If a path is longer than 260 characters, \\?\ is the only way to refer to it.

https://github.com/rust-lang/cargo/pull/13914 shows another possible workaround, but that one amounts to boiling the ocean to replace the OUT_DIR/include idiom with a substantially more verbose one.

Notes

Meta-issue: https://github.com/rust-lang/cargo/issues/9770

ChrisDenton commented 6 months ago

If a path is longer than 260 characters, \\?\ is the only way to refer to it.

This is only true when calling OS APIs (and only if longpathaware isn't enabled). Paths can be stored and passed around as a user path then converted to a verbatim path at the point of use. The std will even do the latter for you.

weihanglo commented 6 months ago

This is only true when calling OS APIs (and only if longpathaware isn't enabled). Paths can be stored and passed around as a user path then converted to a verbatim path at the point of use. The std will even do the latter for you.

Since Cargo has set longpathaware and std handles for us, does it mean it's generally safe for cargo to take advantage of non-UNC paths via dunce?

ChrisDenton commented 6 months ago

Hm, I don't think that's used for build scripts? But in any case, when using Rust's std APIs it should work out fine either way.

The issue with dunce is that it intentionally won't work in a number of cases, including for long paths and \\?\UNC\ style paths. Nushell ran into issues with this so I created omnipath which they've been using successfully for awhile now. Essentially the philosophy of omnipath is to have a distinction between "user" paths and "verbatim" paths where converting between the two is always fine so long as it's possible to round-trip (e.g. converting to verbatim->user->verbatim will return the same path). The only downside is that it needs to use OS APIs to be sure it can round-trip the path. So it is slower.

ChrisDenton commented 6 months ago

I would just add that for a long time now PathBuf::push, etc should work fine if pushing like a/b to a verbatim path. It will convert the separator for you. It's just naively appending strings which doesn't work. The problem, as the OP says, is that there isn't a path API for include!.

weihanglo commented 3 months ago

Some other workarouds appear in https://github.com/rust-lang/rust/issues/75075.

golanguage commented 3 weeks ago

Add depend crate gpui when cargo build under pwsh 7.4.6 I got

error: couldn't read \\?\C:\myproj\rust\gui\gpuimg\target\debug\build\typenum-8c302cc9c705de15\out/op.rs: 文件名、目录名或卷标语法不正确。 (os error 123)
  --> C:\rust\cargo\registry\src\rsproxy.cn-0dccff568467c15b\typenum-1.17.0\src\lib.rs:72:5
   |
72 |     include!(concat!(env!("OUT_DIR"), "/op.rs"));
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

But under powershell 6 or command prompt, the build run successfully.

ChrisDenton commented 3 weeks ago

@golanguage Could you please try the latest nightly Rust?

golanguage commented 3 weeks ago

@golanguage Could you please try the latest nightly Rust?

Thanks @ChrisDenton. afterrustup overfide set nightly and rustup update to 1.84-nighgly cargo is working for building typenum under pwsh 7.4.6 but panic at later place:

error: failed to run custom build command for `gpui v0.1.0 (C:\proj\rust\zed\crates\gpui)`
  cargo:rerun-if-changed=resources/windows/gpui.rc
  Microsoft (R) Windows (R) Resource Compiler Version 10.0.10011.16384
  Copyright (C) Microsoft Corporation.  All rights reserved.

  fatal error RC1109: error creating \\?\C:\myproj\rust\gui\gpuimg\target\debug\build\gpui-a349b6bdb1d76471\out/gpui.lib

  --- stderr
  thread 'main' panicked at C:\rust\cargo\registry\src\rsproxy.cn-0dccff568467c15b\embed-resource-2.5.0\src\windows_msvc.rs:39:13:
  RC.EXE failed to compile specified resource file

On powershell 6 ,cargois working correctly with 1.84 nightly.

ChrisDenton commented 3 weeks ago

Ok, it seems that the original issue is fixed by https://github.com/rust-lang/rust/pull/125205

I think the issue there with gpui comes via the embed-resource crate. I'll note that it does not use path joining and instead hard-codes / as the path separator, which is wrong for \\?\ paths. E.g. here and here

golanguage commented 3 weeks ago

I submit a new issue in embed-resource , rust-embed-resource/issues/71

Ok, it seems that the original issue is fixed by rust-lang/rust#125205

I think the issue there with gpui comes via the embed-resource crate. I'll note that it does not use path joining and instead hard-codes / as the path separator, which is wrong for \\?\ paths. E.g. here and here

nabijaczleweli commented 3 weeks ago

/ is still not a valid character in a filename, and it's a valid directory separator universally, except in this one specific edge-case on Win32, which you don't distinguish and don't detect. std::path::is_separator() returns true for / and \ on windows always and it doesn't take context. skimming the file that provides it I see a special internal implementation for Components, but this is not available for user code

Cargo needs to apply the same fixup for all the paths it reads from cargo:rustc-link-arg-*= anyway to fix the entire extant corpus that does println!(cargo:rustc-link-arg-bins=$OUT_DIR/whatever) or ...=$(realpath .)/whatever (the API is a string, not a Path object).

The only bug in embed-resource is that we execute rc.exe $OUT_DIR/whatever.lib instead of rc.exe Path::new($OUT_DIR).join("whatever.lib") or rc.exe $OUT_DIR${MAIN_SEPARATOR}whatever.lib in platform-specific code.