Open briansmith opened 6 months ago
WG-prioritization assigning priority (Zulip discussion).
Since in our support list this arch has the most support:
@rustbot label -I-prioritize +P-critical
From today's comment in the upstream LLVM issue:
For -mcpu=xyz, we enable the maximal set of features for the cpu (so long as they are relatively common), which can be disabled with +nofeat. [....] The idea is that users get decent performance by default, and if they have less features can turn down the default.
Unfortunately GCC didn't follow that scheme at the time for crypto instructions, and had them disabled by default. Clang did, so there was a difference in whether crypto was enabled. We did not decide to retro-actively change old CPU definitions (it could be a breaking change), but going forward "Armv-9" cpus have been changed to not include crypto by default.
Assuming that is accurate, there are a few interesting things:
- Assuming we want -C target_cpu to be safe by default, rustc cannot delegate its defaults to LLVM. rustc should turn off feature flags for optional features by default. Basically, -C target_cpu is generally not a memory-safe option. This should be documented retroactively for old versions and fixed for newer versions.
Also, from the LLVM issue:
For -march=armv8.x-a we enable the minimal set of features (so all required extensions). Optional features can be added with +feat.
So, one workaround would be to use that. But, I don't see rustc providing a mechanism to choose the ARM architecture level instead of a specific CPU.
@rustbot label -P-critical +P-high
We aren't going to block the next stable release because of this issue so tagging this as P-high instead.
- Assuming we want -C target_cpu to be safe by default, rustc cannot delegate its defaults to LLVM. rustc should turn off feature flags for optional features by default. Basically -C target_cpu is generally not a memory-safe option. This should be documented retroactively for old versions and fixed for newer versions.
Mmh. If I run the Rust compiler with -Ctarget-cpu=some-intel-cpu and then execute it on my AMD machine, the result is (often) memory unsafe.
I do not believe it is realistic for rustc to "correct" LLVM on the very long list of AArch64 CPUs, so it would be best to fix this upstream.
Mmh. If I run the Rust compiler with -Ctarget-cpu=some-intel-cpu and then execute it on my AMD machine, the result is (often) memory unsafe.
If you do -C target-cpu=skylake, and you run it on a skylake machine, do you get memory unsafety? That's the situation with -C target_cpu=cortex-a72 today.
Even worse, if you compile on a Pi4 with -C target_cpu=native and execute it straight away, it's unsafe.
If you do -C target-cpu=skylake, and you run it on a skylake machine, do you get memory unsafety? That's the situation with -C target_cpu=cortex-a72 today.
If you build a binary with --target i686-unknown-linux-gnu and then run it on an actual Pentium 6, yes, you do.
If you do -C target-cpu=skylake, and you run it on a skylake machine, do you get memory unsafety? That's the situation with -C target_cpu=cortex-a72 today.
If you build a binary with --target i686-unknown-linux-gnu and then run it on an actual Pentium 6, yes, you do.
That's a different situation, because you're building for a CPU with more capabilities than what you're running on. In the case of -C target_cpu=cortex-a72, when you run it on some cortex-a72 CPUs, i.e. the same model that you asked for, it will execute instructions that aren't implemented on that cortex-a72 CPU.
I do not believe it is realistic for rustc to "correct" LLVM on the very long list of AArch64 CPUs, so it would be best to fix this upstream.
Understandable. However, I think it is worth making an exception for this particular case:
That's a different situation, because you're building for a CPU with more capabilities than what you're running on.
No, it is not.
When you build code for the P6, you get code for a P68, a CPU model that was revised and upgraded from the P6.
When you build code for the Cortex-A72, you get code for a revised and upgraded Cortex-A72 model.
This is a very common issue because of the popularity of the Cortex-A72-based Raspberry Pi.
I assure you, there will be other SBCs that people will say are very popular and demand that we fix this sort of thing for them.
Someone in the upstream issue indicated that they've changed their policy for later CPUs, so this seems to be an issue that affects a short and not-growing list of CPUs. I.e. we probably wouldn't be signing up for doing this kind of thing over and over again.
LLVM has been willing to break compatibility with target features. I don't see why we would believe their target CPUs are any more stable.
That's a different situation, because you're building for a CPU with more capabilities than what you're running on.
No, it is not.
When you build code for the P6, you get code for a P68, a CPU model that was revised and upgraded from the P6.
I mean, it's not exactly the same thing. P6 and P68 are architectures. They cover CPU designs that have been released over the course of, what, 10 years?
When you build code for the Cortex-A72, you get code for a revised and upgraded Cortex-A72 model.
Meanwhile, the Cortex-A72 is a CPU design. ARM doesn't produce any, and licensees are free to implement optional features, at their discretion. And note that it's a single design.
And we can talk about whether the Cortex-A72 definition should include various options or not all we want, but the fact that native will produce binaries that can't run on the machine that compiled them, despite being exactly what it is used for and documented as, is clearly a bug.
This is a very common issue because of the popularity of the Cortex-A72-based Raspberry Pi.
I assure you, there will be other SBCs that people will say are very popular and demand that we fix this sort of thing for them.
If we stick to using only the required, by design, features of a given CPU design, then we won't have more demands.
If we stick to using only the required, by design, features of a given CPU design, then we won't have more demands.
Splendid! Then if that is all we need to do, would you care to PR your alternative database of CPU features for aarch64 CPU models to rustc, since LLVM's cannot be relied on and they apparently are not interested in changing that?
It seems like the key problem here is that -C target-cpu=native does not work correctly -- regardless of any disagreements on naming, that part is certainly a bug in LLVM for which a fix would be accepted. This is probably just a matter of changing https://github.com/llvm/llvm-project/blob/22530e7985083032fe708848abb88b77be78e5ce/llvm/lib/TargetParser/Host.cpp#L1972 to always assign the crypto feature (instead of only assigning crypto = true).
@nikic Hm. Shouldn't that be done for the Windows cases as well? Indeed, all of the cases like that, with a static name?
if (condition)
Feature["name"] = true
Hm, I suppose there's only one in that file outside the AArch64 set.
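To make the pattern above concrete, here is a minimal sketch in Rust rather than LLVM's C++. It assumes a simplified model in which host detection is layered over defaults taken from a CPU definition; the function names and the single "crypto" key are hypothetical and are not LLVM's actual data structures. It only illustrates why writing an entry solely in the true case lets a stale default survive.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for defaults seeded from a CPU definition.
/// Assumption: the definition enables "crypto" by default.
fn cpu_defaults() -> HashMap<&'static str, bool> {
    HashMap::from([("crypto", true)])
}

/// The `if (condition) Feature["name"] = true` pattern: the entry is only
/// written when detection succeeds, so a default of `true` is never cleared.
fn detect_only_true(features: &mut HashMap<&'static str, bool>, host_has_crypto: bool) {
    if host_has_crypto {
        features.insert("crypto", true);
    }
}

/// The "always assign" pattern: the entry is written in both cases, so the
/// detected value overrides whatever the CPU definition claimed.
fn detect_always_assign(features: &mut HashMap<&'static str, bool>, host_has_crypto: bool) {
    features.insert("crypto", host_has_crypto);
}

fn main() {
    // Pretend the host does NOT implement the crypto extension.
    let host_has_crypto = false;

    let mut only_true = cpu_defaults();
    detect_only_true(&mut only_true, host_has_crypto);

    let mut always_assign = cpu_defaults();
    detect_always_assign(&mut always_assign, host_has_crypto);

    println!("only-true pattern:     crypto = {}", only_true["crypto"]);
    println!("always-assign pattern: crypto = {}", always_assign["crypto"]);
}
```

Under these assumptions, the first variant still reports crypto as enabled on a host without it, while the second reports it as disabled.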
heads up crabs, https://github.com/llvm/llvm-project/pull/95694 got merged, I'll let y'all figure out how to backport that or what.
workingjubilee added llvm-fixed-upstream
This isn't fixed upstream though? Maybe the target-cpu=native case is fixed thanks to your effort but the target_cpu=cortex-a72 case is still an issue.
The -C target-cpu=native behavior was a plain bug. The -C target-cpu=cortex-a72 behavior is just a difference in opinion.
@briansmith It's a separate issue because it applies to almost every single -Ctarget-cpu that isn't -Ctarget-cpu=native for AArch64. It requires a different solution, even if it is fixed upstream by successfully armwrestling LLVM into changing every single CPU def.
Building with RUSTFLAGS="-C target_cpu=cortex-a72" statically enables target_feature="aes", target_feature="crc", target_feature="pmuv3", and target_feature="sha2". However, at least the crypto features AES, CRC, and SHA2 are optional on this CPU. The definition for this target is wrong. See the upstream LLVM bug: https://github.com/llvm/llvm-project/issues/90365.
The main consequence is that crypto libraries that use cfg(target_feature = ...) feature detection for these hardware instructions are getting miscompiled, causing the programs to, at best, crash with an illegal instruction exception. This particularly affects Raspberry Pi users compiling with RUSTFLAGS="-C target-cpu=native". From https://github.com/briansmith/ring/issues/1858#issuecomment-2093675988:
I verified this is an issue on Rust 1.61 stable, 1.78 stable, and rustc 1.80.0-nightly (6e1d94708 2024-05-10).
Although some crypto libraries may work around this issue, these workarounds have negative consequences. In the case of ring's workaround, the result of the workaround is bloat and worse performance on all AArch64 CPUs that actually are guaranteed to have the crypto extensions (except on Fuchsia, Windows, and macOS).
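For readers unfamiliar with the pattern, the following is a generic sketch of how a library might combine static and run-time feature detection, and what a run-time-only workaround looks like. It is not ring's actual code; the function names are hypothetical, and the non-AArch64 fallback exists only so that the sketch compiles on other architectures.

```rust
/// Compile-time view: true when the build statically enabled AES, e.g. via
/// `-C target-cpu=...` or `-C target-feature=+aes`.
fn aes_assumed_at_compile_time() -> bool {
    cfg!(target_feature = "aes")
}

/// Run-time view on AArch64: asks the standard library what the CPU reports.
#[cfg(target_arch = "aarch64")]
fn aes_detected_at_run_time() -> bool {
    std::arch::is_aarch64_feature_detected!("aes")
}

/// Placeholder for other architectures (an assumption of this sketch only).
#[cfg(not(target_arch = "aarch64"))]
fn aes_detected_at_run_time() -> bool {
    false
}

/// Common pattern: trust the static flag and skip the run-time probe when it
/// is set. If the static flag is wrong for the actual CPU, this returns true
/// and the caller goes on to execute unimplemented instructions.
fn use_hw_aes_trusting_static() -> bool {
    aes_assumed_at_compile_time() || aes_detected_at_run_time()
}

/// Workaround pattern: ignore the static flag and always probe at run time.
/// Safe on CPUs without the extension, at the cost of keeping the probe and
/// the fallback code path even where the extension is guaranteed.
fn use_hw_aes_runtime_only() -> bool {
    aes_detected_at_run_time()
}

fn main() {
    println!("static assumption: {}", aes_assumed_at_compile_time());
    println!("run-time detection: {}", aes_detected_at_run_time());
    println!("trusting static:    {}", use_hw_aes_trusting_static());
    println!("run-time only:      {}", use_hw_aes_runtime_only());
}
```

If this were built with the flags discussed in this issue and run on a Cortex-A72 without the crypto extensions, one would expect the first two lines to disagree, which is the mismatch described above.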