rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
98.39k stars 12.73k forks source link

LLVM produces SIGILL when enabling avx2 target feature on `x86_64-unknown-none` #117938

Open japaric opened 12 months ago

japaric commented 12 months ago

Steps to reproduce

$ cargo new --lib repro
$ cd repro

$ echo '#![no_std]' > src/lib.rs
$ cargo add poly1305@0.8.0

$ rustup default 1.73.0
$ rustup target add x86_64-unknown-none
$ cargo b --target x86_64-unknown-none
error: could not compile `poly1305` (lib)

Caused by:
  process didn't exit successfully: `$RUSTUP_TOOLCHAIN/bin/rustc (..)` (signal: 4, SIGILL: illegal instruction)

Running gdb --args $RUSTC_INVOCATION_PRINTED_BY_CARGO produces this backtrace:

Stable Backtrace

``` Thread 7 "opt cgu.1" received signal SIGILL, Illegal instruction. [Switching to Thread 0x7fffe11ff6c0 (LWP 102419)] 0x00007ffff13c939f in llvm::X86TargetLowering::ReplaceNodeResults(llvm::SDNode*, llvm::SmallVectorImpl&, llvm::SelectionDAG&) const [clone .cold.0] () from /home/japaric/.rustup/toolchains/1.73.0-x86_64-unknown-linux-gnu/bin/../lib/../lib/libLLVM-17-rust-1.73.0-stable.so (gdb) backtrace #0 0x00007ffff13c939f in llvm::X86TargetLowering::ReplaceNodeResults(llvm::SDNode*, llvm::SmallVectorImpl&, llvm::SelectionDAG&) const [clone .cold.0] () from /home/japaric/.rustup/toolchains/1.73.0-x86_64-unknown-linux-gnu/bin/../lib/../lib/libLLVM-17-rust-1.73.0-stable.so #1 0x00007ffff1130ef5 in llvm::DAGTypeLegalizer::SplitVectorResult(llvm::SDNode*, unsigned int) [clone .cold.0] () from /home/japaric/.rustup/toolchains/1.73.0-x86_64-unknown-linux-gnu/bin/../lib/../lib/libLLVM-17-rust-1.73.0-stable.so #2 0x00007ffff000d41a in llvm::DAGTypeLegalizer::run() () from /home/japaric/.rustup/toolchains/1.73.0-x86_64-unknown-linux-gnu/bin/../lib/../lib/libLLVM-17-rust-1.73.0-stable.so #3 0x00007ffff01ff81a in llvm::SelectionDAGISel::SelectBasicBlock(llvm::ilist_iterator, false, true>, llvm::ilist_iterator, false, true>, bool&) () from /home/japaric/.rustup/toolchains/1.73.0-x86_64-unknown-linux-gnu/bin/../lib/../lib/libLLVM-17-rust-1.73.0-stable.so #4 0x00007ffff04ff282 in llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) () from /home/japaric/.rustup/toolchains/1.73.0-x86_64-unknown-linux-gnu/bin/../lib/../lib/libLLVM-17-rust-1.73.0-stable.so #5 0x00007ffff034fc0a in llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) () from /home/japaric/.rustup/toolchains/1.73.0-x86_64-unknown-linux-gnu/bin/../lib/../lib/libLLVM-17-rust-1.73.0-stable.so #6 0x00007ffff034f4ee in (anonymous namespace)::X86DAGToDAGISel::runOnMachineFunction(llvm::MachineFunction&) [clone .llvm.6232165262612102610] () from /home/japaric/.rustup/toolchains/1.73.0-x86_64-unknown-linux-gnu/bin/../lib/../lib/libLLVM-17-rust-1.73.0-stable.so #7 0x00007ffff016d66a in llvm::FPPassManager::runOnModule(llvm::Module&) () from /home/japaric/.rustup/toolchains/1.73.0-x86_64-unknown-linux-gnu/bin/../lib/../lib/libLLVM-17-rust-1.73.0-stable.so #8 0x00007ffff05dcaac in llvm::legacy::PassManagerImpl::run(llvm::Module&) () from /home/japaric/.rustup/toolchains/1.73.0-x86_64-unknown-linux-gnu/bin/../lib/../lib/libLLVM-17-rust-1.73.0-stable.so #9 0x00007ffff64d36a6 in LLVMRustWriteOutputFile () from /home/japaric/.rustup/toolchains/1.73.0-x86_64-unknown-linux-gnu/bin/../lib/librustc_driver-453cf35e1dd187fa.so #10 0x00007ffff64d2558 in rustc_codegen_llvm[13e834ec38ef84a5]::back::write::write_output_file () from /home/japaric/.rustup/toolchains/1.73.0-x86_64-unknown-linux-gnu/bin/../lib/librustc_driver-453cf35e1dd187fa.so #11 0x00007ffff64cfcd4 in rustc_codegen_llvm[13e834ec38ef84a5]::back::write::codegen () from /home/japaric/.rustup/toolchains/1.73.0-x86_64-unknown-linux-gnu/bin/../lib/librustc_driver-453cf35e1dd187fa.so #12 0x00007ffff64cd074 in rustc_codegen_ssa[1239057ba2d16fcb]::back::write::finish_intra_module_work:: () from /home/japaric/.rustup/toolchains/1.73.0-x86_64-unknown-linux-gnu/bin/../lib/librustc_driver-453cf35e1dd187fa.so #13 0x00007ffff64cc75d in rustc_codegen_ssa[1239057ba2d16fcb]::back::write::execute_optimize_work_item:: () from /home/japaric/.rustup/toolchains/1.73.0-x86_64-unknown-linux-gnu/bin/../lib/librustc_driver-453cf35e1dd187fa.so #14 0x00007ffff64ca627 in std[3759e478f3a6c4f2]::sys_common::backtrace::__rust_begin_short_backtrace::<::spawn_named_thread::{closure#0}, ()>::{closure#0}, ()> () from /home/japaric/.rustup/toolchains/1.73.0-x86_64-unknown-linux-gnu/bin/../lib/librustc_driver-453cf35e1dd187fa.so #15 0x00007ffff6459256 in <::spawn_unchecked_<::spawn_named_thread::{closure#0}, ()>::{closure#0}, ()>::{closure#1} as core[d28c4e8d9c4eebaa]::ops::function::FnOnce<()>>::call_once::{shim:vtable#0} () from /home/japaric/.rustup/toolchains/1.73.0-x86_64-unknown-linux-gnu/bin/../lib/librustc_driver-453cf35e1dd187fa.so #16 0x00007ffff3d71295 in alloc::boxed::{impl#47}::call_once<(), dyn core::ops::function::FnOnce<(), Output=()>, alloc::alloc::Global> () at library/alloc/src/boxed.rs:2007 #17 alloc::boxed::{impl#47}::call_once<(), alloc::boxed::Box, alloc::alloc::Global>, alloc::alloc::Global> () at library/alloc/src/boxed.rs:2007 #18 std::sys::unix::thread::{impl#2}::new::thread_start () at library/std/src/sys/unix/thread.rs:108 #19 0x00007ffff3ae59eb in ?? () from /usr/lib/libc.so.6 #20 0x00007ffff3b697cc in ?? () from /usr/lib/libc.so.6 ```


Using nightly-2023-11-15 toolchain produces a "LLVM ERROR" instead:

$ cargo +nightly-2023-11-15 b --target x86_64-unknown-none
LLVM ERROR: Do not know how to split the result of this operator!

error: could not compile `poly1305` (lib)

Unless the --release flag is used, then you get the SIGILL with the nightly toolchain. The backtrace appears to be similar to the stable toolchain one:

Nightly Backtrace

``` (gdb) backtrace #0 0x00007ffff12744ca in llvm::X86TargetLowering::ReplaceNodeResults(llvm::SDNode*, llvm::SmallVectorImpl&, llvm::SelectionDAG&) const [clone .cold.0] () from /home/japaric/.rustup/toolchains/nightly-2023-11-15-x86_64-unknown-linux-gnu/bin/../lib/../lib/libLLVM-17-rust-1.76.0-nightly.so #1 0x00007ffff133a1ce in llvm::DAGTypeLegalizer::SplitVectorResult(llvm::SDNode*, unsigned int) [clone .cold.0] () from /home/japaric/.rustup/toolchains/nightly-2023-11-15-x86_64-unknown-linux-gnu/bin/../lib/../lib/libLLVM-17-rust-1.76.0-nightly.so #2 0x00007ffff020e085 in llvm::DAGTypeLegalizer::run() () from /home/japaric/.rustup/toolchains/nightly-2023-11-15-x86_64-unknown-linux-gnu/bin/../lib/../lib/libLLVM-17-rust-1.76.0-nightly.so #3 0x00007ffff03a8ada in llvm::SelectionDAGISel::CodeGenAndEmitDAG() () from /home/japaric/.rustup/toolchains/nightly-2023-11-15-x86_64-unknown-linux-gnu/bin/../lib/../lib/libLLVM-17-rust-1.76.0-nightly.so #4 0x00007ffff09c09b8 in llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) () from /home/japaric/.rustup/toolchains/nightly-2023-11-15-x86_64-unknown-linux-gnu/bin/../lib/../lib/libLLVM-17-rust-1.76.0-nightly.so #5 0x00007ffff05f27fa in llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) () from /home/japaric/.rustup/toolchains/nightly-2023-11-15-x86_64-unknown-linux-gnu/bin/../lib/../lib/libLLVM-17-rust-1.76.0-nightly.so #6 0x00007ffff05f2016 in (anonymous namespace)::X86DAGToDAGISel::runOnMachineFunction(llvm::MachineFunction&) [clone .llvm.4022770523405222600] () from /home/japaric/.rustup/toolchains/nightly-2023-11-15-x86_64-unknown-linux-gnu/bin/../lib/../lib/libLLVM-17-rust-1.76.0-nightly.so #7 0x00007ffff034e3c1 in llvm::FPPassManager::runOnFunction(llvm::Function&) () from /home/japaric/.rustup/toolchains/nightly-2023-11-15-x86_64-unknown-linux-gnu/bin/../lib/../lib/libLLVM-17-rust-1.76.0-nightly.so #8 0x00007ffff034d947 in llvm::FPPassManager::runOnModule(llvm::Module&) () from /home/japaric/.rustup/toolchains/nightly-2023-11-15-x86_64-unknown-linux-gnu/bin/../lib/../lib/libLLVM-17-rust-1.76.0-nightly.so #9 0x00007ffff04e317a in llvm::legacy::PassManagerImpl::run(llvm::Module&) () from /home/japaric/.rustup/toolchains/nightly-2023-11-15-x86_64-unknown-linux-gnu/bin/../lib/../lib/libLLVM-17-rust-1.76.0-nightly.so #10 0x00007ffff6b745d0 in LLVMRustWriteOutputFile () from /home/japaric/.rustup/toolchains/nightly-2023-11-15-x86_64-unknown-linux-gnu/bin/../lib/librustc_driver-a2929300a34289e9.so #11 0x00007ffff6b7420c in rustc_codegen_llvm[e0f834ca461547f0]::back::write::write_output_file () from /home/japaric/.rustup/toolchains/nightly-2023-11-15-x86_64-unknown-linux-gnu/bin/../lib/librustc_driver-a2929300a34289e9.so #12 0x00007ffff6b71bdf in rustc_codegen_llvm[e0f834ca461547f0]::back::write::codegen () from /home/japaric/.rustup/toolchains/nightly-2023-11-15-x86_64-unknown-linux-gnu/bin/../lib/librustc_driver-a2929300a34289e9.so #13 0x00007ffff6b7187f in rustc_codegen_ssa[130828829af41105]::back::write::finish_intra_module_work:: () from /home/japaric/.rustup/toolchains/nightly-2023-11-15-x86_64-unknown-linux-gnu/bin/../lib/librustc_driver-a2929300a34289e9.so #14 0x00007ffff6ccb32b in std[14019a58b7d275f1]::sys_common::backtrace::__rust_begin_short_backtrace::<::spawn_named_thread::{closure#0}, ()>::{closure#0}, ()> () from /home/japaric/.rustup/toolchains/nightly-2023-11-15-x86_64-unknown-linux-gnu/bin/../lib/librustc_driver-a2929300a34289e9.so #15 0x00007ffff6cca7e4 in <::spawn_unchecked_<::spawn_named_thread::{closure#0}, ()>::{closure#0}, ()>::{closure#1} as core[a62a0f03b43184e2]::ops::function::FnOnce<()>>::call_once::{shim:vtable#0} () from /home/japaric/.rustup/toolchains/nightly-2023-11-15-x86_64-unknown-linux-gnu/bin/../lib/librustc_driver-a2929300a34289e9.so #16 0x00007ffff1f98915 in alloc::boxed::{impl#47}::call_once<(), dyn core::ops::function::FnOnce<(), Output=()>, alloc::alloc::Global> () at library/alloc/src/boxed.rs:2007 #17 alloc::boxed::{impl#47}::call_once<(), alloc::boxed::Box, alloc::alloc::Global>, alloc::alloc::Global> () at library/alloc/src/boxed.rs:2007 #18 std::sys::unix::thread::{impl#2}::new::thread_start () at library/std/src/sys/unix/thread.rs:108 #19 0x00007ffff1d899eb in ?? () from /usr/lib/libc.so.6 #20 0x00007ffff1e0d7cc in ?? () from /usr/lib/libc.so.6 ```

Meta

Downstream discussion: https://github.com/RustCrypto/universal-hashes/issues/189

rustc +1.73.0 --version --verbose

rustc 1.73.0 (cc66ad468 2023-10-03)
binary: rustc
commit-hash: cc66ad468955717ab92600c770da8c1601a4ff33
commit-date: 2023-10-03
host: x86_64-unknown-linux-gnu
release: 1.73.0
LLVM version: 17.0.2

rustc +nightly-2023-11-15 --version --verbose

rustc 1.76.0-nightly (dd430bc8c 2023-11-14)
binary: rustc
commit-hash: dd430bc8c22f57992ec1457a87437d14283fdd65
commit-date: 2023-11-14
host: x86_64-unknown-linux-gnu
release: 1.76.0-nightly
LLVM version: 17.0.5
nikic commented 12 months ago

With assertions:

SplitVectorResult #0: t41: v8i32 = llvm.x86.avx2.psllv.d.256 TargetConstant:i64<12153>, t28, t39, /rustc/dd430bc8c22f57992ec1457a87437d14283fdd65/library/core/src/../../stdarch/crates/core_arch/src/x86/avx2.rs:2718:15
nikic commented 12 months ago

@llvm.x86.avx2.psllv.d.256 is called inside @_ZN4core9core_arch3x864avx217_mm256_sllv_epi3217ha2b3f3fbfaa54a1bE with these attributes:

attributes #3 = { inlinehint noredzone nounwind nonlazybind "probe-stack"="inline-asm" "target-cpu"="x86-64" "target-features"="-mmx,-sse,-sse2,-sse3,-ssse3,-sse4.1,-sse4.2,-3dnow,-3dnowa,-avx,-avx2,+soft-float,+avx2" }

The combination of +soft-float and +avx2 is not supported. rustc probably needs to explicitly remove that target feature when compiling functions that enabled FP target features.

newpavlov commented 12 months ago

@nikic The code in question uses autodetection, so it has two branches: "soft" and SIMD-based. Autodetection is done using the cpufeatures crate. On targets like x86_64-unknown-none we effectively get the following code:

#[target_feature(enable = "avx2")]
unsafe fn simd_fn() { ... }

if false {
   unsafe { simd_fn() }
} else {
    soft_fn();
}

The idea here is that the first branch should be eliminated by compiler. But it looks like the compiler starts compiling and lowering simd_fn before branch elimination takes place, which triggers SIGILL with enabled soft floats.

Noratrieb commented 12 months ago

The compiler doesn't eliminate branches like these in debug mode right now (which, while it does cause perf problems sometimes, is not a bug).

newpavlov commented 12 months ago

Yes, but note that the SIGILL happens with --release. In debug mode it's fine (though far from ideal) to get "LLVM ERROR: Do not know how to split the result of this operator!".

nikic commented 12 months ago

I checked, and the --release build failure comes down to essentially the same thing. It's still due to functions with +soft-float,+avx2 target features, just catching a different assertion.

newpavlov commented 12 months ago

Can Rust remove +soft-float inherited from target definition for code which explicitly enables FP-dependent target features?

nikic commented 12 months ago

That should be possible, with two caveats:

cc @RalfJung I'm sure you will appreciate this new bit of target feature fun.

newpavlov commented 12 months ago

As noted in the linked RustCrypto issue, ideally we need something like this to properly handle nastiness like this in libraries. But I guess it's a separate discussion.

RalfJung commented 12 months ago

I'd say compiling SIMD code on a softfloat target makes fairly little sense. Maybe we should have cfg(hardfloat) so that these functions can be entirely removed on softfloat targets?

Letting you disable target features in a function wouldn't really help, we'd still want to reject even declaring such a function since its ABI is all wrong. (Or we'd have to get LLVM to support a softfloat ABI for SIMD types I guess.) See https://github.com/rust-lang/lang-team/issues/235 for more details on the ABI issues surrounding target features.

We definitely do not want to support soft-float in target_feature, neither positively nor negatively, due to its ABI impact. Soft-float vs hard-float is a target-wide decision that can't be altered on a per-function or even per-compilation-unit level. We currently accept some nonsense like -Ctarget-features=+soft-float on our hardfloat target but that's completely unsupported and pretty broken (you can cause UB in safe code due to ABI incompatibility), IMO we should reject such flags.

newpavlov commented 12 months ago

Maybe we should have cfg(hardfloat) so that these functions can be entirely removed on softfloat targets?

Yes, it could work, but I think a more fundamental solution would be a proper support of "negative" target features. Arguably, we should consider the relation between hard floats and SIMD instructions nothing more than an implementation detail of the x86 targets.

Another alternative is to replace SIMD functions (i.e. functions marked with #[target_feature(enable = "..")]) with placeholders on soft-float targets. The placeholders may panic, abort, or even be something like unreachable_unchecked. SIMD functions should not be reachable on soft-float targets in the first place, so, since calling them is UB, such replacement should be legal for compiler.

RalfJung commented 12 months ago

Yes, it could work, but I think a more fundamental solution would be a proper support of "negative" target features.

But what do you want to do with them? I don't think there is any way we can accept a +avx,-soft-float function on a softfloat target. Such a function has the wrong ABI and should just be rejected. This should be rejected with both -C and #[target_feature]; it's a good thing that the latter doesn't have this problem so we only need to fix the former.

Maybe after fixing all the LLVM issues around this we could accept this and give it a softfloat ABI. But that's far off.

And even then I'm not sure it is desired; in some cases people compiling for softfloat targets want to be really sure that the hardfloat registers are not used, since they plan to not save/restore them on context switches. Enabling hardfloat mode on a softfloat target is unsound in such situations.

Hardfloat/softfloat isn't just a regular target feature you can switch locally. It's a global decision. #[target_feature] has no business overwriting such global decisions.

tarcieri commented 12 months ago

Note that this issue impacts curve25519-dalek as well: https://github.com/dalek-cryptography/curve25519-dalek/issues/601

tarcieri commented 12 months ago

#[cfg(hardfloat)] would probably be sufficient to gate the relevant code, although a little annoying to sprinkle around everywhere

RalfJung commented 12 months ago

[cfg(hardfloat)] would probably be sufficient to gate the relevant code, although a little annoying to sprinkle around everywhere

Yeah... I just wasn't able to come up with a better alternative yet.

Maybe we should declare (and have the feature-detect macros implement) that SSE features are never available on softfloat targets. Then we can compile functions with SSE #[target_features] into unreachable_unchecked and so their ABI does not matter so we can generate whatever LLVM IR we want.

tarcieri commented 12 months ago

I think that's what @newpavlov was suggesting earlier. Sounds good to me.

Edit: specifically meant the second paragraph of https://github.com/rust-lang/rust/issues/117938#issuecomment-1813203536

newpavlov commented 12 months ago

@RalfJung

But what do you want to do with them?

I am talking about this proposal.

Right now this part of target specification:

"features": "-mmx,-sse,-sse2,-sse3,-ssse3,-sse4.1,-sse4.2,-3dnow,-3dnowa,-avx,-avx2,+soft-float"

from library point of view only means that the SIMD features are not enabled. Thus code which supports runtime detection of target features could assume that during execution SIMD features may be available, so it has to keep SIMD detection and SIMD-optimized branches.

In other words, right now Rust provides only two target feature states ?feature (target feature is not enabled, but may be present during execution) and +feature (target feature is enabled and can be freely used). Ideally, we need the third "negative" state -feature (alternatively, !feature), i.e. the target feature is not enabled and can not be available during execution. If a target feature is "negative", then libraries should remove autodetection branches which depend on such target feature.

Enabling soft floats would automatically make all SIMD features like SSE and AVX "negative", thus in libraries we will be properly to cfg them out.

RalfJung commented 12 months ago

I think that's what @newpavlov was suggesting earlier. Sounds good to me.

Ah yes they were. I somehow missed that -- sorry. Great, good to see designs converge :)

In other words, right now Rust provides only two target feature states ?feature (target feature is not enabled, but may be present during execution) and +feature (target feature is enabled and can be freely used). Ideally, we need the third "negative" state -feature (alternatively, !feature), i.e. the target feature is not enabled and can not be available during execution. If a target feature is "negative", then libraries should remove autodetection branches which depend on such target feature.

I don't understand what you mean. I thought you were suggesting #[target_feature("-soft-float")] fn ..., but now this sounds different?

In terms of enabled target features, there's the set of target features that are statically enabled in the current code (via #[target_feature] and -Ctarget-feature), which can be queried via cfg, and there's the set of target features that are dynamically enabled at runtime, which can be queried via is_x86_feature_detected! etc. The runtime set of features is always a superset of the compiletime set of features. Target features are either on or off, there's no "?" state.

(I don't think it is sound to use any other way of querying for runtime target features, it must be our macros. I hope cpufeatures uses those macros internally.)

newpavlov commented 12 months ago

I thought you were suggesting #[target_feature("-soft-float")] fn ..., but now this sounds different?

No, I did not mean -soft-float. I thought the linked proposal is clear enough, but I will try to explain it differently using a simplified example.

Target features are either on or off, there's no "?" state.

During compilation time, but there is also runtime. Target feature may be available during runtime, or could be guaranteed to not be available. Enabling soft floats on x86 makes SIMD features impossible during runtime.

Let's imagine we have foo_avx2 and foo_soft functions which do the same thing:

#[target_feature(enable = "avx2")]
fn foo_avx2(t: T) -> R { ... }

fn foo_soft(t: T) -> R { ... }

Now we want to provide a public function foo which will handle auto-detection if necessary. The most straightforward way to do it is to write:

pub fn foo(t: T) -> R {
    if is_x86_feature_detected!("avx2) {
        unsafe { foo_avx2(t) }
    } else {
        foo_soft(t)
    }
}

Now when the code is compiled with target-feature=+avx2 the "soft" branch gets eliminated because is_x86_feature_detected evaluates to true at compile time. But there is currently no way to eliminate the AVX2 branch! Compiling with target-feature=-avx2 obviously will have no effect.

With hypothetical target-feature=!avx2 the is_x86_feature_detected would evaluate to false at compile time, thus eliminating the AVX2 branch. This can be useful for reducing code size (e.g. if we are certain that application will not run on CPUs with AVX2), eliminating overhead of autodetection, and for testing different backends.

Right now, in RustCrypto we have to keep a bunch of custom configuration flags, which is... a really subpar solution.

We also can imagine an alternative implementation of foo:

cfg_if::cfg_if! {
    if #[cfg(target_feature = "avx2")] {
        pub fn foo(t: T) -> R { unsafe { foo_avx2(t) } }
    } else if #[cfg(target_feature = "!avx2")] {
        fn foo(t: T) -> R) { foo_soft(t) }
    } else if #[cfg(target_feature = "?avx2")] {
        pub fn foo(t: T) -> R {
            if is_avx2_availble() {
                unsafe { foo_avx2(t) }
            } else {
                foo_soft(t)
            }
        }
    } else {
        // unreachable
    }
}
RalfJung commented 12 months ago

Ah I see. So basically you want a static upper bound of the dynamic set of target features, complementing the static lower bound that we already have.

And if I understand correctly you'd want that not just on a per-target basis (where softfloat targets "know" that they never have AVX), but controllable via -Ctarget-feature? That sounds like it could cause very bad linker issues through if you link code that uses -Ctarget-feature=!avx2 with code that doesn't use this... I don't see how that could be made sound. For instance, the standard library might be using is_x86_feature_detected!("avx2") somewhere, and now if you build your crate with -Ctarget-feature=!avx2 the resulting binary would be incoherent: parts of it might assume that AVX2 is present while other parts do not.

newpavlov commented 12 months ago

you'd want that not just on a per-target basis (where softfloat targets "know" that they never have AVX), but controllable via -Ctarget-feature?

Yes. But I don't agree that !feature can cause soundness issues. They may arise only if you have different ABIs between linked crates and, in principle, in this regard !feature is not different from ?feature (i.e. -Ctarget-feature=-feature). IIUC if you compile today a crate with -sse,-sse2 (or with +soft-float), you can not safely link it with a standard binary distribution of std.

For me !feature is mostly a library-level feature which introduces a centralized interface for tweaking behavior of autodetection-capable code. In some cases it may be even beneficial to "disable" AVX2 (!avx2) for one crate, but leave it enabled (?avx2) for another, e.g. if the former crates causes undesirable bloat. The improvement of +soft-float handling is a great bonus on top.

The potential incoherency with std would be indeed unfortunate and unexpected, but I don't think it's a critical issue. I think we can live with it, but it also may be possible to fix or work around.

The easiest solution would be to mandate that std should not use target feature detection macros. Right now, std does not use it, so no changes will be necessary. It may be used by std dependencies, but they could cfg it out based on the standard rustc-dep-of-std feature, unless it's something performance critical.

Alternatively, it may be possible to fix it by tweaking implementation of is_x86_feature_detected!. For example, as a vague suggestion, std could have a static pointer. By default it would be null, but when application is built with !feature, the compiler would additionally link a table of "negative" features and post-linking will change value of the static pointer to point to this table. is_x86_feature_detected! would change its behavior accordingly based on the pointer. It would not eliminate potential autodetection branches from std and its deps, but will "passivize" them.

RalfJung commented 12 months ago

But I don't agree that !feature can cause soundness issues.

That depends on what !feature means. If it is used with the goal of producing binaries that definitely do not use certain registers, then it can cause soundness issues. This could possibly be fixed with clear docs on what !feature really means, but we'd have to make really sure that people are aware of this.

Also, if is_x86_feature_detected!("feature") == false is interpreted as a guarantee that "the feature is not available", then having it return false in some places and true in other places can introduce soundness issues. I could easily imagine unsafe code out there already assuming that is_x86_feature_detected! is consistent within a single binary, so your proposal is a potentially breaking change.

RalfJung commented 12 months ago

Anyway this is getting highly speculative and it's discussing a significant language extension. If we want to continue discussing this we should start a new thread. It's not really relevant for this issue.

The core of this issue is that enabling certain features on certain targets just doesn't work currently, which leads to portability issues.

briansmith commented 11 months ago

And even then I'm not sure it is desired; in some cases people compiling for softfloat targets want to be really sure that the hardfloat registers are not used, since they plan to not save/restore them on context switches.

We want to make sure they aren't used unless specifically requested.

Enabling hardfloat mode on a softfloat target is unsound in such situations.

Not necessarily. The Linux kernel has kernel_fpu_begin() and kernel_fpu_end() with which you must wrap your vector-register-using code. Every operating system kernel will likely have something equivalent because they need their in-kernel crypto code to be able to use vector registers (indeed, this is what the Linux kernel crypto code does). Unfortunately, the current design of these targets doesn't seem to expose enough information for us to know which environment we're in. Ideally we'd have target_env="linuxkernel" or something so we could discover what we need to do using cfg.

I do agree that #[target_feature(enable = "avx2")] and the like need to be able to work on these -none targets and that if these targets are going to be +softfloat then softfloat can't be mutually exclusive with using target_feature to enable vector instruction use.

The discussion of "negative" features is a totally separate thing. What's really happening is that the crypto libraries are using CPUID/_xgetbv to detect CPU features, and if CPUID/_xgetbv says some CPU feature is available, then they feel free to use it. This is not the correct thing for us to be doing on these -none targets and it's not something for the language team to solve. It should be tracked in a separate issue; there's already https://github.com/rust-lang/rust/issues/60123#issuecomment-1804704684 where I point out why the proposed has_cpuid misleads us regarding this.

RalfJung commented 11 months ago

I do agree that #[target_feature(enable = "avx2")] and the like need to be able to work on these -none targets and that if these targets are going to be +softfloat then softfloat can't be mutually exclusive with using target_feature to enable vector instruction use.

That needs work on the LLVM side then, since currently this is not supported in LLVM.

RalfJung commented 11 months ago

And that also complicates the ABI story. Given the principle that target features may not affect ABI (https://github.com/rust-lang/lang-team/issues/235), I guess we need to

I don't know if LLVM supports the first point here. If the goal is to eventually allow using such target features on softfloat targets, then we should reject these target features until we have a way to enable them without affecting ABI. But that would mean there is no portable way to write code like what triggered this issue until LLVM is fixed...

The fact that LLVM ties together "target features used by ABI" and "target features used by codegen" makes this hard to support. But I don't think we should compromise on having a consistent ABI within any given target triple.

briansmith commented 11 months ago

The fact that LLVM ties together "target features used by ABI" and "target features used by codegen" makes this hard to support. But I don't think we should compromise on having a consistent ABI within any given target triple.

Maybe it already supports this, since clang can build the Linux kernel? Or maybe they only do their SIMD stuff in external .S files?

tarcieri commented 11 months ago

Would CPUID-gating + asm! work on these targets?

newpavlov commented 11 months ago

What's really happening is that the crypto libraries are using CPUID/_xgetbv to detect CPU features, and if CPUID/_xgetbv says some CPU feature is available, then they feel free to use it. This is not the correct thing for us to be doing on these -none targets and it's not something for the language team to solve.

Not quite. In the cpufeatures crate we specifically gate on *-none, *-uefi, and *-sgx targets, which results in code described in this comment. The problem is that Rust/LLVM can not compile (unused) functions which enable SIMD target features (e.g. #[target_feature(enable = "avx2")]) for soft-float targets.

newpavlov commented 11 months ago

It may be a regression from 1.69 to 1.70. I was unable to trigger SIGILL on a simplified example, but this snippet gets properly compiled on 1.69, but causes LLVM ERROR on 1.70 and later.

briansmith commented 11 months ago

It may be a regression from 1.69 to 1.70. I was unable to trigger SIGILL on a simplified example, but this snippet gets properly compiled on 1.69, but causes LLVM ERROR on 1.70 and later.

Realistically, even if this were to be fixed in Rust 1.75, we'd need to find a workaround to avoid increasing MSRV for our projects to 1.75 (for these targets).

It seems instead we may need to "just" ensure that all the conditional logic that enables use of vector registers happens at #[cfg(...)] level instead of at cfg! level. We already have to do that to ensure, for example, we aren't trying to compile Aarch64 assembly on x86-64 targets and vice-versa.

So, I see this being two issues:

WDYT?