Open japaric opened 8 months ago
For additional context ("why are you even using this target?"): I was looking for a built-in "OS agnostic" / no-std target for a no-std rustls demo (rustls/rustls PR1534), tried x86_64-unknown-none, and ran into this and other problems. I ended up using a custom target for the demo, but I figured I should still report what I found.
x86_64-unknown-none is a softfloat target, so issues with code that uses hardware floating point operations are not too surprising. You seem to be looking for a no-OS hardfloat target, which I don't think we have.
FWIW both curve25519-dalek and the @RustCrypto crates impacted by this are written so that they have baseline pure Rust implementations that should work fine on softfloat targets. It's just that they also contain AVX2 implementations, intended for other x86/x86_64 targets, that are currently difficult to gate/disable on softfloat targets.
Okay, so the question is "how do I best write target-specific AVX2 implementations in a portable way such that the code builds even on softfloat targets", but there is not a request here to be able to use AVX2 instructions on softfloat targets?
That sounds reasonable and yeah we currently don't have a solution. I think my current preference is what I just sketched in https://github.com/rust-lang/rust/issues/117938:
Maybe we should declare (and have the feature-detect macros implement) that SSE features are never available on softfloat targets. Then we can compile functions with SSE #[target_feature] attributes into unreachable_unchecked, so their ABI does not matter and we can generate whatever LLVM IR we want.
Yep, from the perspective of all of these crates the AVX2 code is effectively dead on softfloat targets. The problem is that rustc is still trying to compile it anyway.
Compiling the relevant code into e.g. unreachable_unchecked sounds good to me.
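To make the proposal above concrete, here is a hand-written approximation of what it would mean for a crate: detection never reports AVX2 when the build lacks SSE2, and the AVX2 entry point collapses into an unreachable stub. This is only a sketch under stated assumptions. The names xor_avx2, xor_serial, avx2_available and xor_in_place are hypothetical, the proposal is about rustc doing this lowering itself rather than crates writing stubs, and std is used for detection so the example runs on a hosted machine (a real no-std crate would need another detection mechanism).

```rust
// Hard-float x86 build: the real AVX2-enabled function.
#[cfg(target_feature = "sse2")]
#[target_feature(enable = "avx2")]
pub unsafe fn xor_avx2(dst: &mut [u8], src: &[u8]) {
    // Plain Rust; the compiler is allowed to vectorize it with AVX2.
    for (d, s) in dst.iter_mut().zip(src) {
        *d ^= *s;
    }
}

// Build without SSE2 (e.g. softfloat): same name, but the body is unreachable.
// Assumption: this mimics by hand what the proposal asks the compiler to do.
#[cfg(not(target_feature = "sse2"))]
pub unsafe fn xor_avx2(_dst: &mut [u8], _src: &[u8]) {
    // SAFETY: callers must check `avx2_available()`, which is always false here.
    unsafe { core::hint::unreachable_unchecked() }
}

// Run-time detection, only consulted when the build has SSE2.
#[cfg(target_feature = "sse2")]
pub fn avx2_available() -> bool {
    std::is_x86_feature_detected!("avx2")
}

// Without SSE2 in the build, AVX2 is declared unavailable.
#[cfg(not(target_feature = "sse2"))]
pub fn avx2_available() -> bool {
    false
}

// Portable baseline that works on every target.
pub fn xor_serial(dst: &mut [u8], src: &[u8]) {
    for (d, s) in dst.iter_mut().zip(src) {
        *d ^= *s;
    }
}

// Dispatcher: picks the AVX2 path only when detection allows it.
pub fn xor_in_place(dst: &mut [u8], src: &[u8]) {
    if avx2_available() {
        // SAFETY: AVX2 support was just confirmed at run time.
        unsafe { xor_avx2(dst, src) }
    } else {
        xor_serial(dst, src)
    }
}
```

Because the stub never touches vector registers, its ABI is irrelevant on a softfloat target, which is the point the quoted comment makes.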
Steps to reproduce
Using nightly, you get "LLVM ERROR: Do not know how to split the result of this operator!"
After some digging, it seems that the lack of SSE2 instructions in the x86_64-unknown-none target is the problem: creating a custom target that adds back the sse2 feature (by patching the target JSON) makes the crate compile.
Keeping the -sse2 produces "LLVM ERROR: Access past stack top!"
Perhaps the backend module could use #[cfg(not(target_feature = "sse2"))] to skip the runtime detection and directly use the serial implementation? Or maybe you could add an explicit error for this configuration, which is nicer than the LLVM ERROR.
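A minimal sketch of the cfg-based suggestion, assuming a crate laid out with a backend module that has serial and avx2 submodules. The module layout, the sum function, and the use of std for detection are illustrative assumptions, not the actual code of any of the affected crates. When the build lacks SSE2, the AVX2 code is not compiled at all and the dispatcher goes straight to the serial implementation.

```rust
mod backend {
    // Portable baseline, compiled on every target.
    pub mod serial {
        pub fn sum(xs: &[u32]) -> u32 {
            xs.iter().fold(0u32, |acc, &x| acc.wrapping_add(x))
        }
    }

    // AVX2 backend: only exists when the build has SSE2, so a softfloat
    // target never asks the code generator to compile it.
    #[cfg(target_feature = "sse2")]
    pub mod avx2 {
        #[target_feature(enable = "avx2")]
        pub unsafe fn sum(xs: &[u32]) -> u32 {
            // Plain Rust; the compiler may vectorize this with AVX2.
            xs.iter().fold(0u32, |acc, &x| acc.wrapping_add(x))
        }
    }

    // No SSE2 in the build: skip run-time detection entirely.
    #[cfg(not(target_feature = "sse2"))]
    pub fn sum(xs: &[u32]) -> u32 {
        serial::sum(xs)
    }

    // SSE2 present: detect AVX2 at run time, fall back to serial otherwise.
    #[cfg(target_feature = "sse2")]
    pub fn sum(xs: &[u32]) -> u32 {
        if std::is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was just confirmed at run time.
            unsafe { avx2::sum(xs) }
        } else {
            serial::sum(xs)
        }
    }

    // Hypothetical alternative: refuse to build instead of falling back.
    // #[cfg(all(target_arch = "x86_64", not(target_feature = "sse2")))]
    // compile_error!("this crate's SIMD backend needs SSE2 on x86_64");
}
```

Callers only ever use backend::sum, so the choice between the two dispatchers is invisible outside the module.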