rust-lang / miri

An interpreter for Rust's mid-level intermediate representation
Apache License 2.0
4.17k stars 320 forks source link

On Windows, "./miri many-seeds ./miri run" re-builds Miri over and over #3509

Closed RalfJung closed 2 months ago

RalfJung commented 2 months ago

Our many-seeds tests are extremely slow on Windows, taking something like 15s per iteration when it's <1s on other platforms. I think I figured out the underlying cause:

  Trying seed: 1
  $ cargo +miri build --bins --tests --locked --all-features --manifest-path D:\a\miri\miri\Cargo.toml
     Compiling miri v0.1.0 (D:\a\miri\miri)
      Finished `dev` profile [optimized + debuginfo] target(s) in 6.23s
  $ cargo +miri build --bins --tests --locked --all-features --manifest-path D:\a\miri\miri\cargo-miri\Cargo.toml
      Finished `dev` profile [optimized + debuginfo] target(s) in 0.06s
  $ (building Miri sysroot)
  $ cargo +miri run --locked --all-features --manifest-path D:\a\miri\miri\Cargo.toml -- -Zlayout-seed=1 -Zmiri-seed=1 tests/many-seeds/scoped-thread-leak.rs --edition=2021 --sysroot C:\Users\runneradmin\AppData\Local\rust-lang\miri\cache
     Compiling miri v0.1.0 (D:\a\miri\miri)
      Finished `dev` profile [optimized + debuginfo] target(s) in 6.31s
       Running `target\debug\miri.exe -Zlayout-seed=1 -Zmiri-seed=1 tests/many-seeds/scoped-thread-leak.rs --edition=2021 --sysroot 'C:\Users\runneradmin\AppData\Local\rust-lang\miri\cache'`

There are two "Compiling Miri" here, for a single test invocation. This doesn't happen on other targets, e.g. on macOS:

  Trying seed: 1
  $ cargo +miri build --bins --tests --locked --all-features --manifest-path /Users/runner/work/miri/miri/Cargo.toml
      Finished `dev` profile [optimized + debuginfo] target(s) in 0.22s
  $ cargo +miri build --bins --tests --locked --all-features --manifest-path /Users/runner/work/miri/miri/cargo-miri/Cargo.toml
      Finished `dev` profile [optimized + debuginfo] target(s) in 0.02s
  $ (building Miri sysroot)
  $ cargo +miri run --locked --all-features --manifest-path /Users/runner/work/miri/miri/Cargo.toml -- -Zlayout-seed=1 -Zmiri-seed=1 tests/many-seeds/scoped-thread-leak.rs --edition=2021 --sysroot /Users/runner/Library/Caches/org.rust-lang.miri
      Finished `dev` profile [optimized + debuginfo] target(s) in 0.03s
       Running `target/debug/miri -Zlayout-seed=1 -Zmiri-seed=1 tests/many-seeds/scoped-thread-leak.rs --edition=2021 --sysroot /Users/runner/Library/Caches/org.rust-lang.miri`

On Linux and macOS, while builds with and without "--tests" produce separate binaries (as some feature flags in dependencies change), it seems they get cached independently and once both are built, we can run both immediately. But on Windows, these two builds seem to clobber each other's cache.

Of course it's also silly to rebuild the sysroot each time, and if we avoided that we'd avoid switching between two different versions of the binary all the time. We'd still do some switching back and forth though, so it would still be better to have proper caching.

@epage is it expected that on Windows, builds of a binary crate with and without "--tests" re-build the binary each time, while on Unix they get cached properly? Or is there something else going on here? If I read the ls output on my machine correctly, then on Linux there are a bunch of miri-<hash> binaries and target/debug/miri is just a hardlink to the latest one. Windows doesn't have hardlinks so that exact same strategy will not work. And maybe there's a reason that just doing a copy instead of a rebuild does not work either?

ChrisDenton commented 2 months ago

Windows doesn't have hardlinks

Windows does in fact have hardlinks.

RalfJung commented 2 months ago

Yeah I just read fs::hard_link docs and learned about that, and was about to edit my comment but you were too fast. ;)

I think I also found the code in cargo that does this hardlinking, and it doesn't seem to have an exception for Windows. But then why does switching back and forth between two different build configurations lead to a rebuild each time on Windows, but not on Unix? Or is that something that usually doesn't happen, and something odd is going on in CI? I don't have a Windows system to test this on.

epage commented 2 months ago

Would you be able to capture Cargo debug traces of a reproduction case and a non-reproduction case?

RalfJung commented 2 months ago

This happens on CI. So... yes probably as long as the traces are not so long as to blow the CI log limit?

What env var should be set to what to get the most valuable trace here?

@ChrisDenton in case you have the time to test this -- if you fetch the latest miri and try ./miri many-seeds "C:/Program Files/Git/usr/bin/bash" -c "./miri run -v tests/pass/hello.rs", does this rebuild Miri each time around or not? Starting with the 2nd seed this should take basically no time, but on Windows CI it does take a lot of time.

epage commented 2 months ago

Likely

$ CARGO_LOG=cargo::core::compiler::fingerprint=trace cargo <cmd>

will work. Otherwise, we'll need to look at everything which will likely flood the logs

RalfJung commented 2 months ago
RalfJung commented 2 months ago

If I just run this with -v but not the full logs, I get on Windows

$ cargo +miri -v run --locked --all-features --manifest-path D:\a\miri\miri\Cargo.toml -- -Zlayout-seed=1 -Zmiri-seed=1 tests/many-seeds/tls-leak.rs --edition=2021 --sysroot C:\Users\runneradmin\AppData\Local\rust-lang\miri\cache
       Fresh autocfg v1.1.0
       Fresh cfg-if v1.0.0
       Fresh regex-syntax v0.8.2
       Fresh windows_i686_msvc v0.48.5
       Fresh version_check v0.9.4
       Fresh windows-targets v0.48.5
       Fresh regex-automata v0.4.5
       Fresh phf_generator v0.11.2
       Fresh regex v1.10.3
       Fresh typenum v1.17.0
       Fresh generic-array v0.14.7
       Fresh parse-zoneinfo v0.3.0
       Fresh phf_codegen v0.11.2
       Fresh chrono-tz-build v0.3.0
       Fresh windows_i686_msvc v0.52.3
       Fresh getrandom v0.2.12
       Fresh smallvec v1.13.1
       Fresh scopeguard v1.2.0
       Fresh siphasher v0.3.11
       Fresh lock_api v0.4.11
       Fresh rand_core v0.6.4
       Fresh windows-targets v0.52.3
       Fresh phf_shared v0.11.2
       Fresh parking_lot_core v0.9.9
       Fresh num-traits v0.2.18
       Fresh crypto-common v0.1.6
       Fresh inout v0.1.3
       Fresh windows-sys v0.48.0
       Fresh ppv-lite86 v0.2.17
       Fresh option-ext v0.2.0
       Fresh rand_chacha v0.3.1
       Fresh cipher v0.4.4
       Fresh phf v0.11.2
       Fresh dirs-sys v0.4.1
       Fresh parking_lot v0.12.1
       Fresh chrono v0.4.38
       Fresh windows-sys v0.52.0
       Fresh rustc-hash v1.1.0
       Fresh log v0.4.20
       Fresh cpufeatures v0.2.12
       Fresh measureme v11.0.1
       Fresh ctrlc v3.4.2
       Fresh aes v0.8.4
       Fresh chrono-tz v0.9.0
       Fresh directories v5.0.1
       Fresh rand v0.8.5
       Dirty miri v0.1.0 (D:\a\miri\miri): dependency info changed
   Compiling miri v0.1.0 (D:\a\miri\miri)
     Running `rustc --crate-name miri --edition=2021 'src\bin\miri.rs' --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type bin --emit=dep-info,link -C opt-level=2 -C embed-bitcode=no -C debuginfo=2 -C debug-assertions=on --cfg 'feature="default"' --cfg 'feature="stack-cache"' --cfg 'feature="stack-cache-consistency-check"' -C metadata=214354fa613d36d0 --out-dir 'D:\a\miri\miri\target\debug\deps' -L 'dependency=D:\a\miri\miri\target\debug\deps' --extern 'aes=D:\a\miri\miri\target\debug\deps\libaes-d2685e6032bdeb1e.rlib' --extern 'chrono=D:\a\miri\miri\target\debug\deps\libchrono-278ef10d8a07e865.rlib' --extern 'chrono_tz=D:\a\miri\miri\target\debug\deps\libchrono_tz-5eb0d76c6b9f9e64.rlib' --extern 'ctrlc=D:\a\miri\miri\target\debug\deps\libctrlc-ca17dea509dd1e03.rlib' --extern 'directories=D:\a\miri\miri\target\debug\deps\libdirectories-9812c337939ed69a.rlib' --extern 'getrandom=D:\a\miri\miri\target\debug\deps\libgetrandom-9d3dbad1f6102f6c.rlib' --extern 'measureme=D:\a\
    Finished `dev` profile [optimized + debuginfo] target(s) in 6.17s

Not sure what "dependency info changed" means.

RalfJung commented 2 months ago

Also FWIW this will soon stop affecting Miri as we'll switch many-seeds to an entirely different approach, so the bulk of cargo invocations will all have the exact same flags -- no more back-and-forth between cargo run and cargo build --bins --tests.

RalfJung commented 2 months ago

Specifically, as can be seen in this log, now on Windows we just have unnecessary builds (wasting 12s) once per many-seeds test instead of once per seed we try.

RalfJung commented 2 months ago

The verbose log has

2024-05-04T06:41:27.4318113Z    0.102008300s  INFO prepare_target{force=false package_id=miri v0.1.0 (D:\a\miri\miri) target="miri"}: cargo::core::compiler::fingerprint: fingerprint dirty for miri v0.1.0 (D:\a\miri\miri)/Build/TargetInner { name: "miri", tested: false, doc: true, ..: with_path("D:\\a\\miri\\miri\\src\\bin\\miri.rs", Edition2021) }
2024-05-04T06:41:27.4320415Z    0.102041200s  INFO prepare_target{force=false package_id=miri v0.1.0 (D:\a\miri\miri) target="miri"}: cargo::core::compiler::fingerprint:     dirty: UnitDependencyInfoChanged { old_name: "ctrlc", old_fingerprint: 6202229392602597251, new_name: "ctrlc", new_fingerprint: 4374262352119191711 }

(Would be nice if the "dependency info changed" that one can easily get with a -v build would mention the crate name, ctrlc. :)

ctrlc has some Windows-specific dependencies...

[target.'cfg(windows)'.dependencies]
windows-sys = { version = "0.52", features = ["Win32_Foundation", "Win32_System_Threading", "Win32_Security", "Win32_System_Console"] }

but shouldn't cargo just cache a build of that crate with two different feature sets and then not do rebuilds any more?

ChrisDenton commented 2 months ago

I don't really have time to debug this atm but I can confirm I'm seeing the same issue as above on my local system. Looks like a cargo issue? Or maybe something to do with how miri's cache works? I'm not very familiar with the workings of either.

epage commented 2 months ago

but shouldn't cargo just cache a build of that crate with two different feature sets and then not do rebuilds any more?

The relevant docs are at https://doc.rust-lang.org/nightly/nightly-rustc/cargo/core/compiler/fingerprint/index.html#fingerprints-and-metadata

RalfJung commented 2 months ago

Hm, not sure how those docs should help interpret the UnitDependencyInfoChanged { old_name: "ctrlc", old_fingerprint: 6202229392602597251, new_name: "ctrlc", new_fingerprint: 4374262352119191711 }.