Closed jonathanpallant closed 4 weeks ago
it regressed both with and without --release.
I also ran a bisect:
searched nightlies: from nightly-2024-07-01 to nightly-2024-10-02 regressed nightly: nightly-2024-08-01 searched commit range: https://github.com/rust-lang/rust/compare/f8060d282d42770fadd73905e3eefb85660d3278...28a58f2fa7f0c46b8fab8237c02471a915924fe5 regressed commit: https://github.com/rust-lang/rust/commit/0b5eb7ba7bd796fb39c8bb6acd9ef6c140f28b65
WG-prioritization assigning priority (Zulip discussion).
I can also reproduce on current beta so tagging accordingly
@rustbot label -I-prioritize +P-medium +regression-from-stable-to-beta
minimized, zero deps -> https://github.com/Dirbaio/defmt-repro
[dirbaio@mars defmt-repro]$ rustc --version
rustc 1.83.0-nightly (06bb8364a 2024-10-01)
[dirbaio@mars defmt-repro]$ cargo build
Compiling radio_app v0.0.0 (/home/dirbaio/defmt-repro)
error: linking with `rust-lld` failed: exit status: 1
|
= note: LC_ALL="C" PATH="/home/dirb(snip)47a" "--gc-sections" "-Tlol.x"
= note: rust-lld: error: lol.x:1: symbol not found: __defmt_default_panic
rust-lld: error: lol.x:1: symbol not found: __defmt_default_panic
rust-lld: error: lol.x:1: symbol not found: __defmt_default_panic
error: could not compile `radio_app` (bin "radio_app") due to 1 previous error
[dirbaio@mars defmt-repro]$ rustc +stable --version
rustc 1.81.0 (eeb90cda1 2024-09-04)
[dirbaio@mars defmt-repro]$ cargo +stable build
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.06s
It seems to all depend on the linker version, not the compiler itself. Using ldd from beta/nightly with a stable compiler triggers this issue. Using ldd from stable on a beta/nightly compiler doesn't.
So I compiled lld from llvm myself, and can confirm: With lld built from commit ebb326a51fec37b5a47e5702e8ea157cd4f835cd
the build of Dirbaio's minimized example fails, whereas with b6dfaf4c291ee186481f6c1dcab03874d931c307
it succeeds.
Specifically, it seems to be this change: https://github.com/llvm/llvm-project/commit/ebb326a51fec37b5a47e5702e8ea157cd4f835cd#diff-efb0f6c06c654ed1044098de2bd75495d2fecd1730054dd26d41f0c9c93996e8R1582
If I revert that change (by simply changeing the if condition to if (false && activeProvideSym)
) the example also succeeds to build again.
This seems to be enough to "fix" this issue:
--- a/lld/ELF/ScriptParser.cpp
+++ b/lld/ELF/ScriptParser.cpp
@@ -1662,7 +1662,7 @@ Expr ScriptParser::readPrimary() {
tok = unquote(tok);
else if (!isValidSymbolName(tok))
setError("malformed number: " + tok);
- if (activeProvideSym)
+ if (false && activeProvideSym)
ctx.script->provideMap[*activeProvideSym].push_back(tok);
else
ctx.script->referencedSymbols.push_back(tok);
I put "fix" in quotes because is not meant as a solution: I didn't even try to understand what the if/else branches are meant to do. So it's quite likely that I broke something else with this change.
(Verified on current LLVM main
branch.)
the next stable release is in 9 days, at which point I believe the Ferrous Systems Embedded Rust exercises will stop compiling.
Upstream issue: https://github.com/llvm/llvm-project/issues/111478.
Is anyone aware of a workaround we could apply for defmt? We would like to avoid defmt being broken on stable with the Rust release next week.
I tried marking the defmt::default_panic
function as pub
, but this does not make a difference.
Is there something we can add to the linker script? Something along the lines of PLEASE_DONT_GARBAGE_COLLECT(__defmt_default_panic)
.
Or can we pretend that this function is in fact used?
Any pointer is much appreciated.
the next stable release is in 9 days, at which point I believe the Ferrous Systems Embedded Rust exercises will stop compiling.
Note that while the exercises are relatively small impact (as they are under our control), defmt
is a widely popular solution in the embedded space.
I have read the related code, maybe we need to add this symbol to the symbol table. Or change the code like this:
diff --git a/lld/ELF/ScriptParser.cpp b/lld/ELF/ScriptParser.cpp
index 3febcfb87da4..44185c282ad7 100644
--- a/lld/ELF/ScriptParser.cpp
+++ b/lld/ELF/ScriptParser.cpp
@@ -1660,10 +1660,11 @@ Expr ScriptParser::readPrimary() {
tok = unquote(tok);
else if (!isValidSymbolName(tok))
setError("malformed number: " + tok);
- if (activeProvideSym)
+ if (activeProvideSym) {
ctx.script->provideMap[*activeProvideSym].push_back(tok);
- else
- ctx.script->referencedSymbols.push_back(tok);
+ return [] { return 0; };
+ }
+ ctx.script->referencedSymbols.push_back(tok);
return [=, s = ctx.script] { return s->getSymbolValue(tok, location); };
}
However, I'm not familiar with this part of the code.
the next stable release is in 9 days, at which point I believe the Ferrous Systems Embedded Rust exercises will stop compiling.
Note that while the exercises are relatively small impact (as they are under our control),
defmt
is a widely popular solution in the embedded space.
In any case, I believe the upstream backport is likely to miss our backport process. I'm reverting this commit to our fork LLVM.
@DianQK said:
In any case, I believe the upstream backport is likely to miss our backport process. I'm reverting this commit to our fork LLVM.
Would this mean this issue will not land on stable? That would be great, then we do not need to find a workaround for defmt.
Note that while the exercises are relatively small impact (as they are under our control),
defmt
is a widely popular solution in the embedded space.
@rustbot label -P-medium +P-high
@DianQK said:
In any case, I believe the upstream backport is likely to miss our backport process. I'm reverting this commit to our fork LLVM.
Would this mean this issue will not land on stable? That would be great, then we do not need to find a workaround for defmt.
Yes! I think so.
Impact of the regression is this breaks all projects that
#[defmt::panic_handler]
defmt::panic!()
or the other macros that panic (assert, todo, unreachable, etc.)A possible workaround defmt could do is adding EXTERN(_defmt_panic)
to the linker script. The side effect is it'd prevent the panic handler from being optimized out when it's truly unused, I think.
A possible workaround defmt could do is adding
EXTERN(_defmt_panic)
to the linker script. The side effect is it'd prevent the panic handler from being optimized out when it's truly unused, I think.
In case the fix (https://github.com/rust-lang/llvm-project/pull/178) does not make it into the next stable I'd think this is an okay compromise to make until it does get fixed. But let's hope it won't be necessary.
@DianQK said:
In any case, I believe the upstream backport is likely to miss our backport process. I'm reverting this commit to our fork LLVM.
Would this mean this issue will not land on stable? That would be great, then we do not need to find a workaround for defmt.
I have submitted PR #131448. I apologize that I may have misspoken. I can't guarantee that this issue won't land on stable, and even reverting this change carries some risk. :)
Impact of the regression is this breaks all projects that
* Use defmt * and do define a `#[defmt::panic_handler]` * and don't ever use `defmt::panic!()` or the other macros that panic (assert, todo, unreachable, etc.)
A possible workaround defmt could do is adding
EXTERN(_defmt_panic)
to the linker script. The side effect is it'd prevent the panic handler from being optimized out when it's truly unused, I think.
Would it be possible to provide any actual project? :3 I think this will be helpful in determining whether to backport.
Would it be possible to provide any actual project? :3 I think this will be helpful in determining whether to backport.
One example is mentioned in the initial report, https://github.com/ferrous-systems/rust-exercises.git
If you think that doesn't count, as it's training material and may not be representative for real-world code, another example would be https://github.com/team-arrow-racing/steering-wheel-controller (I'm not affiliated with that project in any way. Found it with a code search on github.)
$ cargo +beta build
[...]
= note: rust-lld: error: /tmp/steering-wheel-controller/target/thumbv7em-none-eabihf/debug/build/defmt-76d6e60563c62b4a/out/defmt.x:7: symbol not found: __defmt_default_panic
rust-lld: error: /tmp/steering-wheel-controller/target/thumbv7em-none-eabihf/debug/build/defmt-76d6e60563c62b4a/out/defmt.x:7: symbol not found: __defmt_default_panic
rust-lld: error: /tmp/steering-wheel-controller/target/thumbv7em-none-eabihf/debug/build/defmt-76d6e60563c62b4a/out/defmt.x:7: symbol not found: __defmt_default_panic
rust-lld: error: /tmp/steering-wheel-controller/target/thumbv7em-none-eabihf/debug/build/defmt-76d6e60563c62b4a/out/defmt.x:7: symbol not found: __defmt_default_panic
warning: `steering-wheel-controller` (bin "steering-wheel-controller") generated 7 warnings
error: could not compile `steering-wheel-controller` (bin "steering-wheel-controller") due to 1 previous error; 7 warnings emitted
Most concrete binary projects of defmt
are not public, but it has 415 reverse dependencies. https://crates.io/crates/defmt/reverse_dependencies. Multiple of those have about 1mio downloads. It has consistently over 10k daily downloads, showing active usage e.g. in CI/CD pipelines.
It is very widely used in embedded software.
The minimal reproduction is as follows:
main.rs
:
#![no_main]
#![no_std]
#[panic_handler]
fn panic(_: &core::panic::PanicInfo) -> ! {
loop {}
}
#[no_mangle]
fn foo() {}
#[no_mangle]
fn bar() {}
script.t
:
PROVIDE(foo = bar);
$ rustc +beta main.rs --target thumbv7em-none-eabihf -Clink-arg=-Tscript.t
error: linking with `rust-lld` failed: exit status: 1
...
= note: rust-lld: error: script.t:1: symbol not found: bar
rust-lld: error: script.t:1: symbol not found: bar
rust-lld: error: script.t:1: symbol not found: bar
error: aborting due to 1 previous error
With small modifications, it can also be reproduced on x86.
@rustbot label +S-has-mcve +A-linkage
I think this can be now closed, since #131448 has been backported and should have made it to the stable 1.82 release of tomorrrow.
Am I right @jonathanpallant ? Can you check if we're all good here? thanks :)
I can confirm nightly-2024-10-13 fixes it. nightly-2024-10-12 is the last one failing.
Beta works as well.
Upstream LLVM will not backport the relevant fixes, so I will add a test case.
Upstream LLVM will not backport the relevant fixes, so I will add a test case.
Do you think there is a chance that this "bug" will come back to us? We are currently evaluating if we should apply a fix/safeguard in defmt.
Upstream LLVM will not backport the relevant fixes, so I will add a test case.
Do you think there is a chance that this "bug" will come back to us? We are currently evaluating if we should apply a fix/safeguard in defmt.
No. The main branch of LLVM has fixed. The next major update will include it.
Upstream LLVM will not backport the relevant fixes, so I will add a test case.
Do you think there is a chance that this "bug" will come back to us? We are currently evaluating if we should apply a fix/safeguard in defmt.
No. The main branch of LLVM has fixed. The next major update will include it.
Perfect!
Lgtm
Also tried with the stable prerelease from https://internals.rust-lang.org/t/rust-1-82-0-pre-release-testing/21705. Problem is solved.
Code
I tried this code:
I expected to see this happen:
The binaries build like, they do on stable.
Instead, this happened:
That symbol is defined here: https://github.com/knurling-rs/defmt/blob/b7a89f56059fc66ac3c7cc3445d092829599e699/defmt/src/lib.rs#L365
Bisect
@Dirbaio said:
searched nightlies: from nightly-2024-07-30 to nightly-2024-10-02 regressed nightly: nightly-2024-08-01 searched commit range: https://github.com/rust-lang/rust/compare/f8060d282d42770fadd73905e3eefb85660d3278...28a58f2fa7f0c46b8fab8237c02471a915924fe5 regressed commit: https://github.com/rust-lang/rust/commit/0b5eb7ba7bd796fb39c8bb6acd9ef6c140f28b65
That's the LLVM 19 upgrade.