rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
98.23k stars 12.7k forks source link

Bootstrap build fails: found crate `serde_derive` compiled by an incompatible version of rustc #125578

Open RalfJung opened 5 months ago

RalfJung commented 5 months ago

I did a git pull, fixed the submodules, and now I am trying to run ./x.py check. However, that fails:

$ ./x.py check
Building bootstrap
   Compiling build_helper v0.1.0 (/home/r/src/rust/rustc.3/src/tools/build_helper)
error[E0514]: found crate `serde_derive` compiled by an incompatible version of rustc
 --> /home/r/src/rust/rustc.3/src/tools/build_helper/src/metrics.rs:1:5
  |
1 | use serde_derive::{Deserialize, Serialize};
  |     ^^^^^^^^^^^^
  |
  = note: the following crate versions were found:
          crate `serde_derive` compiled by rustc 1.79.0-beta.1 (6b544f5ff 2024-04-28): /home/r/src/rust/rustc.3/build/bootstrap/debug/deps/libserde_derive-970fa0dc5f1e06a7.so
  = help: please recompile that crate using this compiler (rustc 1.79.0-beta.6 (66eb3e404 2024-05-23)) (consider running `cargo clean` first)

[...]

This must b a fairly recent regression, I've never had to manually clean bootstrap before.

Cc @onur-ozkan

onur-ozkan commented 5 months ago

This must b a fairly recent regression, I've never had to manually clean bootstrap before.

So it doesn't happen after x clean/rm -rf build?

RalfJung commented 5 months ago

I removed build/boostrap and now everything works as it should yeah.

bjorn3 commented 5 months ago

https://github.com/rust-lang/rust/pull/125478 updated the bootstrap compiler. Does it reproduce if you do a clean build the commit right before this PR (77d41156551dc52a4d5df228c897acd239eb6254) and then build right after this PR (0b2f194b830264f828f5713327369c1f74b4e933)?

onur-ozkan commented 5 months ago

Presumably bootstrap used the outdated artifacts instead of generating them with the new compiler (https://github.com/rust-lang/rust/pull/125478).

bjorn3 commented 5 months ago

I did expect cargo to clean them up when it sees that the rustc version changed. Maybe something is broken with that?

RalfJung commented 5 months ago

125478 updated the bootstrap compiler. Does it reproduce if you do a clean build the commit right before this PR (77d4115) and then build right after this PR (0b2f194)?

Yes.

I think cargo hashes the version and channel, but they are unchanged for this beta update: 1.79.0, beta. The fact that it's beta.1 vs beta.6 should still usually trigger a rebuild though I would think, it should just use the same filenames... Cc @epage @weihanglo

weihanglo commented 5 months ago

The root cause is likely to be that Cargo's cache for rustc -vV output was wrong. The current approach uses cmd args (including program path) and env vars as the cache key. The key is not able to distinguish bootstrap rustc because the path to bootstrap rustc is always build/<host-triple>/stage0/bin/rustc.

I can think of a heavy-hammer solution: We can additionally compute the digest of rustc executable. Anyone has a lightweight alternative?

RalfJung commented 5 months ago

Is rustc -vV so expensive that a cache is justified?

A cheaper alternative would be to key on the size and mtime of the binary.

bjorn3 commented 5 months ago

Is rustc -vV so expensive that a cache is justified?

For small programs it was at least. A single rustc invocation used to be at least 100ms (for some people almost 300ms) with the rustup wrapper overhead included. Rustup optimizations have since reduced this by a significant amount, but it is still about 4x slower with the rustup wrapper than without for me.

ehuss commented 5 months ago

The key is not able to distinguish bootstrap rustc because the path to bootstrap rustc is always build/<host-triple>/stage0/bin/rustc.

I think there is more nuance here. Normally cargo can distinguish different rustc via the mtime of the executable. However, since https://github.com/rust-lang/rust/pull/123246 (@kobzol FYI), which is new in 1.79, the timestamps are always the same.

This will likely be a problem with every bootstrap bump moving forward.

I'm trying to think if this is going to be a more serious problem for other non-rustup users of cargo. I looked at the linux install.sh, and fortunately it updates the timestamps to the current time. I did not look at the macos or windows installers, can someone look and see if recent versions preserve the tarball timestamps? Can anyone think of other users that fetch and extract the tarball directly? They will likely also be impacted by #123246.

I can think of two possible solutions:

A slightly hacky solution would be to add the file size to cargo's hash.[^1] We can be pretty confident that the size will change, and that is cheap to acquire.

Another possibility is to have bootstrap.py delete the bootstrap directory whenever it extracts a new stage0 (presumably in the download_toolchain function). That will slightly hurt caching if you are switching back and forth between different stage0 versions, but I would expect that to be rare. On the plus side, the bootstrap directory will no longer grow without bounds.

[^1]: I've been on the fence about doing this in general in cargo, since I think it can lead to insidious rebuild problems that are even harder to detect than they are now. However, I think it wouldn't hurt to add it just for the Rustc hash to start with.

Kobzol commented 5 months ago

I think that we should somehow update the timestamp in the source builds, e.g. based on the time of the latest commit at the time of packaging, rather than hardcode the same time for everything. I'll take a look at it.

workingjubilee commented 5 months ago

I have had to wipe my build dir like 8 times while cycling through different PRs, and I've accidentally rebuilt LLVM about 5 times.

onur-ozkan commented 5 months ago

I have had to wipe my build dir like 8 times while cycling through different PRs, and I've accidentally rebuilt LLVM about 5 times.

Right..

I thought it's worth bootstrap to have workaround for this and created #125911 PR. So people don't have to waste their time finding the actual problem and checking issues/zulip.

workingjubilee commented 5 months ago

Yeaaah. Redownloading all the bootstrap compilers and then restarting the build isn't exactly optimal, it just happened to be faster, for me, than "try to think about what the actual issue is".

Not really sure why the LLVM got rebuilt though...

...also, 11.

RalfJung commented 5 months ago

FWIW rm build/bootstrap -rf works fine for me and doesn't redownload anything nor rebuild LLVM.

onur-ozkan commented 1 month ago

Seems like this isn't fixed yet (see https://github.com/rust-lang/rust/pull/129924#issuecomment-2326912828).