Open Lokathor opened 5 years ago
Trying to classify chips as "has / lacks hardware support for floating point" (even if f32 and f64 are distinguished) is too coarse-grained and runs into too many edge cases to be useful IMO.
Since you mentioned sqrt as motivation: it is perfectly possible to implement the basic arithmetic operations in hardware but lack a hardware sqrt instruction, which means you definitely have "hardware floating point" but still need a libcall for sqrt. This is not hypothetical: some early MIPS chips (like the R2000 with FPU coprocessor) and Nyuzi both did this. Admittedly, I don't believe this is a good trade-off for general-purpose processors these days, and the majority of CPU architectures do provide a sqrt instruction if they have hardware floating point at all, but (1) nobody forces designers to make good decisions and (2) specific domains might have different needs.
Other operations such as trigonometric functions are even more complicated: hardware implementations are far from the norm, and even when they exist they rarely have enough precision to be used by default, though they're still widely used with -ffast-math.
IMO it might make more sense to expose whether specific operations (or small related groups of operations) have hardware support in a specific context (e.g., set of target features), but this should be done case-by-case and tailored to concrete use cases (such as sqrt-in-libcore).
Unfortunately that low level of granularity is the granularity that the LLVM target features deal in.
To be clear, if the target device has a hardware sqrt ability but it's not configured for the current build, because the user did something like "+soft-float" on their x86_64, we must respect that. That is why we need to divine which target features are actually in use for this build.
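As a rough illustration of checking the features configured for the current build rather than the features of the chip, here is a minimal sketch using rustc's existing `cfg!(target_feature = ...)`. The function names are hypothetical, and the mapping from feature to "has a sqrt instruction" (SSE for f32 and SSE2 for f64 on x86_64, NEON/FP on aarch64) is my own assumption for this example. It covers only those two architectures, and it does not show how a "+soft-float" override would be reflected, which is the open question in this thread.

```rust
/// Hypothetical helper: true if this build is configured with a feature
/// that provides an f32 sqrt instruction, for the two architectures this
/// sketch knows about. Every other target reports false.
pub const fn f32_sqrt_configured() -> bool {
    cfg!(any(
        // Assumption: SSE provides a scalar single-precision sqrt.
        all(target_arch = "x86_64", target_feature = "sse"),
        // Assumption: the "neon" feature implies the scalar FP unit.
        all(target_arch = "aarch64", target_feature = "neon"),
    ))
}

/// Hypothetical helper: same idea for f64.
pub const fn f64_sqrt_configured() -> bool {
    cfg!(any(
        // Assumption: SSE2 provides a scalar double-precision sqrt.
        all(target_arch = "x86_64", target_feature = "sse2"),
        all(target_arch = "aarch64", target_feature = "neon"),
    ))
}
```

Because `cfg!` is evaluated against the features enabled for the build, changing `-C target-feature` changes the answer even when the physical CPU stays the same.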
for sqrt specifically, rustc could just query llvm's TargetTransformInfo::haveFastSqrt
I don't know LLVM, but it seems like it takes a type, so maybe it could tell us f32 and f64 separately, which would be good.
One use I'd like to put it to is #116226, as there are different fastest implementations for all four combinations of f32 or f64 having or not having sqrt hardware acceleration.
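To make the "four combinations" concrete, here is a small sketch of how a library might select an implementation once it knows, per type, whether sqrt is accelerated. The enum, its variant names, and `pick_strategy` are all hypothetical; nothing here is taken from #116226 or from an existing rustc API, and the two booleans stand in for whatever query the compiler might eventually expose.

```rust
/// Hypothetical set of implementation strategies, one for each combination
/// of f32/f64 hardware sqrt availability.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SqrtStrategy {
    /// Neither width is accelerated: use a pure software routine.
    Software,
    /// Only f32 sqrt is accelerated.
    HardwareF32Only,
    /// Only f64 sqrt is accelerated.
    HardwareF64Only,
    /// Both widths are accelerated.
    HardwareBoth,
}

/// Maps the two availability flags to a strategy. As a `const fn`, it can
/// be evaluated at compile time if the flags are compile-time constants.
pub const fn pick_strategy(hw_f32: bool, hw_f64: bool) -> SqrtStrategy {
    match (hw_f32, hw_f64) {
        (false, false) => SqrtStrategy::Software,
        (true, false) => SqrtStrategy::HardwareF32Only,
        (false, true) => SqrtStrategy::HardwareF64Only,
        (true, true) => SqrtStrategy::HardwareBoth,
    }
}
```

With per-type information, a call such as `pick_strategy(true, false)` would select the f32-only path on a chip with a single-precision FPU.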
Also noting this is useful beyond sqrt detection: cortex-m-rt uses the target string, detected via an env var, to decide whether to enable/initialize the FPU at startup, by smuggling a non-feature cfg flag:
conditional code: https://github.com/rust-embedded/cortex-m/blob/6b3a5b7fb95fe98fb05ddb4270d2d14709afe99c/cortex-m-rt/src/lib.rs#L554-L564
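For readers unfamiliar with the pattern, a build script can inspect the `TARGET` environment variable that Cargo sets for build scripts and emit a custom cfg with the `cargo:rustc-cfg=` directive. The following is a minimal sketch of that idea, not a copy of the cortex-m-rt build script: the helper names are hypothetical, and the `eabihf` suffix check and the `has_fpu` cfg name are assumptions chosen for illustration.

```rust
use std::env;

/// Hypothetical helper: guesses FPU presence from the target triple.
/// Assumption: hard-float ARM targets end in "eabihf".
pub fn target_has_fpu(target: &str) -> bool {
    target.ends_with("eabihf")
}

/// Intended to be called from `fn main()` in build.rs. Cargo sets TARGET
/// for build scripts; when it is missing this emits nothing.
pub fn emit_fpu_cfg() {
    let target = env::var("TARGET").unwrap_or_default();
    if target_has_fpu(&target) {
        // Assumed cfg name; crate code can then use `#[cfg(has_fpu)]`.
        println!("cargo:rustc-cfg=has_fpu");
    }
}
```

This is string matching on the triple rather than a query of the configured target features, which is the kind of workaround the comment above describes.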
The Linux kernel and drivers make system calls faster by forbidding floating point operations. It seems to enforce that in Rust by using #![no_std] and making some floating point Rust intrinsics panic.
Is this the only way that it enforces it?
If that's the only way it enforces it, then in addition to asking the compiler backend whether a specific floating point operation is accelerated, we'd also need to check whether those particular intrinsics have been set to panic. To be clear, if those intrinsics are set to panic, any floating point operations, including those not set to panic, should report that they are not hardware accelerated, because some are available in core through core::intrinsics.
Currently there seems to be no way to detect at compile time if the target will have hardware floating point support.
Depending on the target arch, this is sometimes expressed in LLVM as a feature named "hard-float", or as a feature named "soft-float", or even as features named "f" and "d". It's entirely possible for there to be hardware f32 support but not f64 support on some platforms. LLVM has (or should have) all of this info already, based on the target profile. We just don't expose it in Rust.
For the record, this is initially needed for libm/compiler-builtins to advance the issue of moving sqrt and other float support into core, so if this is added as some sort of Nightly-only and perma-unstable ability then that's probably fine, since those crates are always built with Nightly and then shipped with the compiler.
It feels like this could be a simple PR thing by someone who knows about that part of the cargo/LLVM/rustc interaction, but more likely this is some sort of RFC level change. However, it's also possible that I'm totally wrong and that there is already an arcane way to check the floating point support configuration, so I'm starting with an issue to try and get some visibility for the problem.