Mock environment testing for bootstrap

tmandry commented 2 years ago

There are lots of config knobs in bootstrap, and it is hard to understand how they interact. Consider two recent examples off the top of my head:

https://github.com/rust-lang/rust/pull/101833#discussion_r984872396
- Broke the flag is_rust_llvm
https://github.com/rust-lang/rust/pull/101072#discussion_r956528035
- Implemented the aforementioned flag
- (Nearly) broke the flag llvm_from_ci
- Caught in review by someone who knew the implementation details of that flag

We should really catch these issues in tests, not review, and avoid relying too much on specific reviewers who know parts of the code.

Since bootstrap does I/O and interacts with external tooling, it is inherently difficult to test. The strategy I know involves mocking out the build environment (enough to make tests run in <~1s). From there you can record what the implementation does and take two approaches for the test:

Assert on specific conditions, like "the flag -D FOO was passed to LLVM cmake" or "foo/bar/baz was moved to foo/bar/quux".
Record everything in a golden file for some "representative" set of configurations. You can update these with --bless. This helps guard against unexpected behavior changes.

I think both are helpful. In fact, combining the two is exactly what we do in ui tests.
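To make the two approaches concrete, here is a minimal sketch, assuming bootstrap could hand every external command to a recorder instead of spawning it. `Recorder`, `RecordedCommand`, and `check_golden` are hypothetical names for illustration and do not exist in bootstrap today; the `-DFOO=ON` flag is a placeholder.

```rust
use std::fmt::Write as _;
use std::path::Path;

/// One external command bootstrap would have run (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
struct RecordedCommand {
    program: String,
    args: Vec<String>,
}

/// Collects commands instead of executing them (hypothetical type).
#[derive(Default)]
struct Recorder {
    commands: Vec<RecordedCommand>,
}

impl Recorder {
    fn record(&mut self, program: &str, args: &[&str]) {
        self.commands.push(RecordedCommand {
            program: program.to_string(),
            args: args.iter().map(|a| a.to_string()).collect(),
        });
    }

    /// Approach 1: a targeted assertion such as "flag X was passed to cmake".
    fn was_passed(&self, program: &str, flag: &str) -> bool {
        self.commands
            .iter()
            .any(|c| c.program == program && c.args.iter().any(|a| a == flag))
    }

    /// Approach 2: render everything as stable text for a golden file.
    fn render(&self) -> String {
        let mut out = String::new();
        for c in &self.commands {
            writeln!(out, "{} {}", c.program, c.args.join(" ")).unwrap();
        }
        out
    }
}

/// Compare `actual` with the golden file, or overwrite it when `bless` is set.
fn check_golden(path: &Path, actual: &str, bless: bool) -> Result<(), String> {
    if bless {
        return std::fs::write(path, actual).map_err(|e| e.to_string());
    }
    let expected = std::fs::read_to_string(path).map_err(|e| e.to_string())?;
    if expected == actual {
        Ok(())
    } else {
        Err(format!("golden mismatch\nexpected:\n{}\nactual:\n{}", expected, actual))
    }
}
```

A test would call `was_passed` for the handful of conditions it cares about and `check_golden` on `render()` for everything else, which is roughly the split ui tests make between annotations and stderr files.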

cc @jyn514 @Mark-Simulacrum

tmandry commented 2 years ago

Implementation notes: Mocking can be implemented with traits (hopefully trait objects to avoid having so many generics). But for any "reading" from the environment you must specify some mock data that can be used in tests. For example, when reading from a file you must provide mock file contents that are used in tests.
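A rough sketch of the trait-object idea, assuming a single hypothetical `Env` trait that covers file reads. None of these names (`Env`, `RealEnv`, `MockEnv`, `read_channel`) come from bootstrap, and the path is only used as an example of a file a caller might read.

```rust
use std::collections::HashMap;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical abstraction over "reading" from the environment.
trait Env {
    fn read_file(&self, path: &Path) -> io::Result<String>;
}

/// Production implementation: goes to the real filesystem.
struct RealEnv;

impl Env for RealEnv {
    fn read_file(&self, path: &Path) -> io::Result<String> {
        std::fs::read_to_string(path)
    }
}

/// Test implementation: serves mock file contents supplied by the test.
#[derive(Default)]
struct MockEnv {
    files: HashMap<PathBuf, String>,
}

impl MockEnv {
    fn with_file(mut self, path: &str, contents: &str) -> Self {
        self.files.insert(PathBuf::from(path), contents.to_string());
        self
    }
}

impl Env for MockEnv {
    fn read_file(&self, path: &Path) -> io::Result<String> {
        self.files.get(path).cloned().ok_or_else(|| {
            io::Error::new(
                io::ErrorKind::NotFound,
                format!("no mock data for {}", path.display()),
            )
        })
    }
}

/// Example consumer: takes `&dyn Env`, so it needs no generic parameter.
fn read_channel(env: &dyn Env) -> io::Result<String> {
    let raw = env.read_file(Path::new("src/ci/channel"))?;
    Ok(raw.trim().to_string())
}
```

Passing `&dyn Env` keeps signatures free of type parameters, at the cost of dynamic dispatch, which should not matter for I/O-bound code like this.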

Tip: Virtualizing the filesystem itself is not actually necessary. If it's easier, for example, we could run in a sandbox build directory set up for that test. For each external command we'd specify the expected output files with mock outputs and actually create them on the filesystem, then allow directly manipulating them (move/copy/read/etc.) within bootstrap in the test sandbox. I could see this plugging in nicely to the existing traits we have.
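The sandbox variant could look roughly like the sketch below, assuming each mocked command declares the files it would have produced. `Sandbox` and `MockCommand` are hypothetical names, and the temp-directory naming scheme is just one way to keep tests from colliding.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;

/// What a mocked external command "produces" (hypothetical type).
struct MockCommand {
    program: &'static str,
    /// (path relative to the sandbox root, mock contents)
    outputs: Vec<(&'static str, &'static str)>,
}

/// A throwaway build directory for one test (hypothetical type).
struct Sandbox {
    root: PathBuf,
    /// Programs that were "run", in order.
    ran: Vec<String>,
}

impl Sandbox {
    fn new(name: &str) -> io::Result<Self> {
        let dir = format!("bootstrap-sandbox-{}-{}", name, std::process::id());
        let root = std::env::temp_dir().join(dir);
        if root.exists() {
            fs::remove_dir_all(&root)?;
        }
        fs::create_dir_all(&root)?;
        Ok(Sandbox { root, ran: Vec::new() })
    }

    /// Instead of spawning the command, create its declared output files.
    fn run(&mut self, cmd: &MockCommand) -> io::Result<()> {
        self.ran.push(cmd.program.to_string());
        for (rel, contents) in &cmd.outputs {
            let path = self.root.join(rel);
            if let Some(parent) = path.parent() {
                fs::create_dir_all(parent)?;
            }
            fs::write(path, contents)?;
        }
        Ok(())
    }

    fn path(&self, rel: &str) -> PathBuf {
        self.root.join(rel)
    }
}

impl Drop for Sandbox {
    fn drop(&mut self) {
        // Best-effort cleanup; a leftover directory is not a test failure.
        let _ = fs::remove_dir_all(&self.root);
    }
}
```

After `run`, the code under test can move, copy, and read the files with ordinary `std::fs` calls, and the test can then assert on the resulting layout.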

Mark-Simulacrum commented 2 years ago

We already try to test some of this via --dry-run, which is intended to run really quickly (it actually runs on every invocation before starting the real build, and we verify that the Steps executed are the same between both, IIRC).

I'm not sure how much extra is needed beyond that -- certainly it supports configuration. I think in order to test things like passing specific flags, we'd want to be dumping/preserving more state than we do today, but that doesn't seem particularly hard (just needs some plumbing to keep vectors of commands around or something).
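The "keep vectors of commands around" plumbing might be as small as the following sketch. `Exec` is a hypothetical wrapper, not an existing bootstrap type; it only relies on `Command::get_program` and `Command::get_args` from the standard library.

```rust
use std::io;
use std::process::Command;

/// Hypothetical wrapper that logs every command and skips spawning in dry-run mode.
struct Exec {
    dry_run: bool,
    /// One rendered line per command, in the order they were requested.
    log: Vec<String>,
}

impl Exec {
    fn new(dry_run: bool) -> Self {
        Exec { dry_run, log: Vec::new() }
    }

    /// Records the command; only actually runs it when not in dry-run mode.
    fn run(&mut self, cmd: &mut Command) -> io::Result<bool> {
        let mut line = cmd.get_program().to_string_lossy().into_owned();
        for arg in cmd.get_args() {
            line.push(' ');
            line.push_str(&arg.to_string_lossy());
        }
        self.log.push(line);

        if self.dry_run {
            // Pretend the command succeeded.
            return Ok(true);
        }
        Ok(cmd.status()?.success())
    }
}
```

A test could then build with `dry_run` set and assert on `log`, for example that some line starts with `cmake` and contains a given flag.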

jyn514 commented 1 year ago

> I'm not sure how much extra is needed beyond that -- certainly it supports configuration. I think in order to test things like passing specific flags, we'd want to be dumping/preserving more state than we do today, but that doesn't seem particularly hard (just needs some plumbing to keep vectors of commands around or something).

I don't think it's that simple: if nothing else, output doesn't support dry_run today, so anything using output is special-casing dry_run and we can't see any command it would normally try to run. In general, I think there are lots of blocks that are wholly skipped when dry_run is enabled because it was simpler.
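One way an output-style helper could take part in dry runs is to return canned stdout and still record the command, so callers would not need their own dry_run branch. This is only a sketch under that assumption; `Ctx` and its fields are hypothetical and do not reflect how bootstrap's output helper is written.

```rust
use std::collections::HashMap;
use std::process::Command;

/// Hypothetical context that can answer output-capturing commands in dry-run mode.
struct Ctx {
    dry_run: bool,
    /// Canned stdout keyed by program name, supplied by the test.
    canned: HashMap<String, String>,
    /// Programs whose output was requested, in order.
    seen: Vec<String>,
}

impl Ctx {
    /// Returns the command's stdout, or canned data when in dry-run mode.
    fn output(&mut self, cmd: &mut Command) -> String {
        let program = cmd.get_program().to_string_lossy().into_owned();
        self.seen.push(program.clone());

        if self.dry_run {
            // Unknown programs yield empty output rather than a panic.
            return self.canned.get(&program).cloned().unwrap_or_default();
        }
        let out = cmd.output().expect("failed to spawn command");
        String::from_utf8_lossy(&out.stdout).into_owned()
    }
}
```

Keying the canned data on the program name alone is a simplification; a real version would probably need to match on arguments too.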

jyn514 commented 1 year ago

; rg 'if .*dry_run\(' src/bootstrap | wc -l
      68

🙃

jyn514 commented 1 year ago

@oli-obk asked me recently "why is this hard?" and I want to record those answers somewhere. A non-exhaustive list of things I'd like to test:

Detecting src and out: https://github.com/rust-lang/rust/issues/109120
- Including when run from a different machine
- Including when the current working directory isn't inside the source directory
  - Including when CWD is a subdirectory of another unrelated git repo
- Including when CWD is a subdirectory of the source directory, or inside out
"What if src is read-only?"
A laundry list of things to do with the current git state
- The rust-lang/rust remote is called origin
- the remote is called something that's not origin
- the remote doesn't exist, there's only a remote that points to a fork
- should we download llvm?
  - including if this commit modified LLVM
  - including if this commit modified LLVM and is also running in CI
  - do bootstrap's self tests still pass in all those cases (yes this was a real bug!)
- can we figure out where to download llvm/rustc?
  - including if this is stable/beta
  - including if llvm assertions are enabled
  - including if this is a tier 2 target that only has the version without assertions
- can we figure out wtf happened in https://github.com/rust-lang/rust/pull/105058 such that it passed a bors merge but failed all subsequent commits?
Can we test the NixOS support? Right now we rely on people fixing it themselves after the fact.
The bootstrap invocation itself
- python2 vs python3. We did have a test for that at one point, but it regressed in https://github.com/rust-lang/rust/pull/106085 (I guess because it wasn't clear that it was testing bootstrap itself in addition to tidy).
  - the shell scripts
  - the src/tools/x wrapper
  - the rust binary without the python wrapper
As a stretch goal, can we test "taking the output of one bootstrap invocation and using it as the stage0 compiler for another invocation"?
- Including across machines? https://github.com/rust-lang/rust/issues/108914
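Some of the git-state cases above become cheap to test once the decision logic is a pure function over command output. As an illustration, here is a sketch that picks the upstream remote from text in the format printed by `git remote -v`. The function names are hypothetical and this is not how bootstrap currently detects the remote.

```rust
/// Returns true if `url` points at rust-lang/rust (HTTPS or SSH form).
fn is_upstream_url(url: &str) -> bool {
    let url = url.trim_end_matches('/');
    let url = url.strip_suffix(".git").unwrap_or(url);
    match url.strip_suffix("rust-lang/rust") {
        // Require a separator so "notrust-lang/rust" does not match.
        Some(prefix) => prefix.ends_with('/') || prefix.ends_with(':'),
        None => false,
    }
}

/// Finds the name of the first remote whose URL is rust-lang/rust.
/// `remote_v` is expected to look like the output of `git remote -v`.
fn find_upstream_remote(remote_v: &str) -> Option<&str> {
    remote_v.lines().find_map(|line| {
        let mut fields = line.split_whitespace();
        let name = fields.next()?;
        let url = fields.next()?;
        if is_upstream_url(url) { Some(name) } else { None }
    })
}
```

The three remote scenarios in the list (named origin, named something else, missing entirely) then reduce to three string inputs, with no git repository needed in the test.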

onur-ozkan commented 1 year ago

> Can we test the NixOS support? Right now we rely on people fixing it themselves after the fact.

The same goes for multiarch Linux distros like Debian and Ubuntu.

jyn514 commented 1 year ago

Oh yeah, I'd love to test "do external tools using rustc_private reliably find the sysroot without hacks", plus all of these things again with download-rustc enabled.

rust-lang / rust

Mock environment testing for bootstrap #102563