Closed Fuuzetsu closed 3 years ago
One thing I found: if I run it with RUSTFLAGS='--emit=llvm-ir', I can't reproduce it anymore.
@rustbot claim
Well, I was wrong. Still reproducible even with RUSTFLAGS='--emit=llvm-ir'. Produced *.ll files are identical though.
Consistent with my findings with --emit=llvm-ir as well.
Update: this IS an LLVM bug: I'm able to reproduce it with just the .ll input file using mainline LLVM. Filed https://bugs.llvm.org/show_bug.cgi?id=52441 with LLVM.
I have a fix: https://reviews.llvm.org/D113468
Do we want to apply it to llvm-rust or are we going to wait for LLVM review?
@yanok Good job, thank you!
I wonder if rust needs https://reviews.llvm.org/D108968 too.
Answering my own question: it does.
@yanok Could you also do a PR for https://reviews.llvm.org/D108968?
Sure, will do.
Created #90978
Actually I was too slow, there is already #90954
FWIW, while taking the 1.56.0 source and applying the 2 LLVM reproducibility patches fixes the issues we have with Firefox, the latest nightly doesn't: there's a remaining reproducibility issue on i686 linux. I'll test a patched beta next.
The next beta release should be fixed, per #90938.
(Should go out near 00:00 UTC -- I guess it will be 1.57.0-beta.4.)
The beta branch + the patches seem to fix it, but I've found something new: depending on how LLVM is built, the result can vary. Specifically, LLVM built with a sysroot exhibits reproducibility issues that are not present when not building with a sysroot. I'll wait for the official build of 1.57.0-beta.4 for a definite answer on whether the issue is fixed. One thing is sure, though: there's an additional regression in nightly.
So, as far as Firefox is concerned, everything is fixed in the latest betas. I did find a source of non-determinism that is not fixed, but a) it also seems to affect clang (as in, I have also found one in clang, and I think it's the same root cause) and b) it goes away with PGO, so it doesn't actually affect Firefox.
@glandium is there a ticket about this one somewhere? It's great that it doesn't impact Firefox but it may be impacting other rust software
I haven't filed it because the smallest reproducer I have at the moment is "build firefox with these flags and observe how sometimes some functions have an extra mov", and from experience, those don't lead to any action.
I see, thanks. It sounds like a binary-reproducibility thing, and the ABI should(?) be the same at least.
And this is a regression in the new LLVM version (rustc 1.56 and beyond)?
I don't know if this affects rustc, but I narrowed down yet another source of non-determinism in debuginfo, and upstream came up with a patch: https://reviews.llvm.org/D115054
We tried using 1.57.0 and it seems broken still. I'll try to get a repro tomorrow. Should I open a new ticket or will someone reopen this one?
Actually, it seems it comes from a derive crate in our dependency tree, similar to what we saw in https://github.com/rust-lang/rust/issues/89904 , so maybe it's fine.
I attach a tarball with a small reproducer (a few dependencies from crates.io, a lock file, an empty lib.rs).
repro.tar.gz
Now run this using rustc 1.56.0:
I expect the sha256sum to be the same for the libpalette .so file, but it's often different.
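For anyone wanting to script this check, here is a minimal sketch in Python rather than the exact commands from the report. It assumes the caller supplies a clean command, a build command, and the artifact path; the helper names `sha256_of` and `distinct_hashes` are made up for illustration. It rebuilds from clean several times and collects the SHA-256 of the artifact each time.

```python
import hashlib
import subprocess


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def distinct_hashes(build_cmd, clean_cmd, artifact, runs=5):
    """Run clean + build `runs` times and return the set of artifact digests seen.

    A reproducible build yields a set with exactly one element.
    """
    seen = set()
    for _ in range(runs):
        subprocess.run(clean_cmd, check=True)
        subprocess.run(build_cmd, check=True)
        seen.add(sha256_of(artifact))
    return seen


# Example call (the artifact path is an assumption about the crate layout):
# distinct_hashes(["cargo", "build", "--release"], ["cargo", "clean"],
#                 "target/release/libpalette.so")
```

Since the difference only shows up "often", a handful of runs may not be enough to trigger it; more than one digest in the result proves non-determinism, but a single digest does not prove its absence.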
This breaks any binary caches, similarly to what #89904 did but that ticket turned out to be a problem with the crate.
This time it seems to be a problem with rustc itself (or rather LLVM: stay tuned): 1.55.0 produces the same binary/crate hash on every build.
Note that in Cargo.toml there's:
Without this, the issue does not present itself. This perhaps makes it look similar to #89911 or #45397 but I think it's different. As far as I saw, it didn't produce a difference in ordering of symbols in the resulting .so, though I may be mistaken. Please feel free to close as duplicate if you deem it to be the same issue.
I have run with RUSTC_LOG=debug and, after careful sifting through a few GiB of output, I found the differences seem to start in rustc_codegen_ssa.
I have bisected rustc itself and ran it on our original code which exhibited the problem. I started the bisect at the common merge point with 1.55.0, though the problem turned out to be well into the 1.56.0 release:
cc @nikic who did the update
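On the RUSTC_LOG=debug comparison: a rough sketch of how the first point of divergence between two saved logs could be located without loading them into memory. It assumes each build's log was redirected to its own file; `first_divergence` is a hypothetical helper, not part of any rustc tooling.

```python
from itertools import zip_longest


def first_divergence(path_a, path_b):
    """Return (line_number, line_a, line_b) for the first differing line, or None.

    Both files are streamed line by line, so multi-GiB logs are fine.
    If one file is shorter, the missing side is reported as None.
    """
    with open(path_a, errors="replace") as a, open(path_b, errors="replace") as b:
        for lineno, (la, lb) in enumerate(zip_longest(a, b), start=1):
            if la != lb:
                return lineno, la, lb
    return None
```

Real debug logs may contain noise that differs between runs for unrelated reasons (ordering of output from parallel work, for example), so the first reported line is only a starting point for manual inspection.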
Meta
rustc --version --verbose: