Open gnzlbg opened 5 years ago
Going with p-medium based on the discussion
Is this unique to Rust's ABI? or is this also a problem in C/C++?
@elichai C and C++ appear to handle this mostly via using the hardware vector types in question and then passing the rest of the burden on to the programmer, or avoiding the subject. This is solved by C and C++ programmers writing sophisticated SIMD libraries that opaquely handle the hardware details, providing a higher level abstraction that other C and C++ programmers can use. Here, we can do a bit better than that.
For x86 SIMD (an obvious source of compatibility concerns here), AVX512, AVX, and SSE enable the zmm, ymm, and xmm registers, which support the _m512
, _m256
, and _m128
data types, respectively. While their actual usage in intrinsics is slightly more complex, as far as the System V AMD64 ABI is concerned, these are mostly just bags of that many bits with corresponding alignment. The first 16 registers are aliased for x86 CPUs such that, assuming all of these registers exist, xmm{N}
aliases a subset of ymm{N}
aliases a subset of zmm{N}
, starting at bit 0 for each. AVX512 enables another 16 registers but the rules are slightly more complicated there so I won't go into them. Actually operating on the data as if it was a certain type requires specific features enabled, but that's essentially a concern about the callee and already addressed by existing target feature rules. This soundness hole is instead about what happens at the moment the caller picks up the phone, having filled a vector with bits.
Given #81931, #35093, and other issues that basically amount to "the compiler is not generating efficient or correct code because it is not really aware of the capabilities of the target in a meaningful fashion, even though it is functionally required to use a target as an argument to compilation", I agree generically with the plan outlined here and think that the best thing to do is to teach the compiler about the reality of what it is staring down, which will allow it to both recognize ABI compatibilities and emit more efficient code.
Isn't calling a function with additional enabled target_feature
s unsafe
? It seems like crossing -C target-feature
bounderies in the same case, should likewise be unsafe, or, at the very least, warned against (possibly a future compat warning).
If the information to track target features is available, it should also be possible to enable the correct ABI for the safe case.
In either case, the safe version is still unsound, because you've just passed into code that enables previously disabled target features. If it is not the case that they are available, then it would trap (or result in UB), just like it would if the function in the second crate was explicitly #[target_feature(enable="sse")]
.
it's not a matter of what is allowed by the CPU, an ABI mismatch will cause problems even if the CPU supports the target feature.
So the root of the problem is that you can't check at runtime if the function you're calling was compiled with non-default target features assumed or not.
For extern "Rust"
the abi stays the same independent of the enabled target features. Vector types are forced to be passed by-ref to avoid abi incompatibility: https://github.com/rust-lang/rust/blob/d04c3aa8656f6588c87bafafb34d51239dab98bb/compiler/rustc_middle/src/ty/layout.rs#L2881-L2906
So the root of the problem is that you can't check at runtime if the function you're calling was compiled with non-default target features assumed or not
Compilers should be able to detect this (using the same, or a similar mechanism to the check for #[target_feature]
), then require unsafe
(and, if necessary, adjust the call abi at the call site). I was considering the same for lccc, which includes a WIP implementation of rust, and it's ABI spec. The solution I chose was to maintain the ABI as-is, use the ABI of the callee, and warn when crossing -C target-features
bounderies in ways that affect ABI.
The ABI issue still exists for extern"C"
functions, and you're still crossing target_feature bounderies in safe code regardless.
@bjorn3 but the example uses f32
, which isn't a vector type.
@chorman0773 well example uses extern
, so the compiler that's compiling the local crate can't know if the other crate actually did it or not.
Though really this issue isn't different from saying extern "C"
when you needed to say extern "stdcall"
. any way to make an ABI mismatch can cause these problems.
well example uses extern, so the compiler that's compiling the local crate can't know if the other crate actually did it or not.
The example using extern
blocks requires unsafe, anyways. However, this is safe:
// crate A: compiled with -C target-feature=-sse
pub extern"C" fn foo(x: f32) -> f32 { x }
// crate B: compiled with -C target-feature=+sse
// this is now safe Rust code:
assert_eq!(A::foo(42.0), 42.0) }; // UB
(replace extern"C"
with any ABI, except extern"Rust"
, extern"rust-intrinsic"
, extern"platform-intrinsic"
, or extern"rust-call"
).
My argument here is that the ABI problem shouldn't be an issue, the problem is the fact you can cross target_feature domains safely. In any case that can currently be done with safe code, the compiler should be able to adjust ABI for all versions. In the cases it cannot, you need unsafe anyways and thus are responsible for the invariants.
The only issue with my implementation choice is fn-pointer types, but, again, this is an issue with any ABI that rust implementations cannot control, and still can be used by safe code.
That function does not compile when targeting the x86_64 platform.
error: <source>:2:42: in function _ZN7example3foo17hd3a315f1768df74cE float (float): SSE register return with SSE disabled
error: aborting due to previous error
If it did, it would actually violate the SystemV AMD64 C ABI, and also would violate the MSVC x64 ABI, because both of those specify that you are supposed to pass float arguments using xmm registers.
Please limit your concerns to ones that can be actualized, though if you have an example that does, I am obviously very interested.
Huh (nice, llvm being smart), though would that not have applied originally to extern"Rust"
? In any case, my point about crossing target_feature bounderies still applies, though. Is it not UB to enter code compiled for one target feature when that feature is not available at runtime?
Update, I was able to compile a simple case in #![no_std]
for the i586-unknown-linux-gnu target, with -C target-cpu=core2
and -C target-feature=-x87
in the caller crate.
The build line was RUSTFLAGS="-C target-cpu=core2" cargo rustc -Z build-std=core,compiler_builtins --target i586-unknown-linux-gnu -- -C target-feature=-x87
, building bar
.
The caller crate, bar
is:
#![no_std]
fn main() {
foo::foo(5.0f32);
}
and the callee crate, foo
, is:
#![no_std]
// Type your code here, or load an example.
pub extern"C" fn foo(f: f32) -> f32{
f
}
From Compiler Explorer https://godbolt.org/z/75nKsaWcx, the abi of extern"C" foo differs between -C target-feature=-x87
and -C target-feature=+x87
(default).
Well, the concern which uses the incompatible FPU example is on the i586 platforms that aren't +sse by default, yes, so you don't need to disable it there. So perhaps I should really be saying, "please be more precise."
Much of the solution is indeed to teach the compiler to adjust automatically during compilation where it makes sense to do that. It is not a tidy or simple feat, however, or someone would have already done it. And outlining "where it makes sense to do that" is the first step.
I've fixed the example after consulting the Sys-V Intel-386 psABI document here
In a nutshell, target-features are part of the call ABI, but Rust does not take that into account, and that's the underlying issue causing #63466, #53346, and probably others (feel free to refer them here).
For example, for an x86 target without SSE linking these two crates shows the issue:
The ABI of
B::foo
is"sse" "Rust"
but the ABI defined inA::foo
is just"Rust"
, no SSE, since the SSE feature is not enabled globally. That's an ABI mismatch and results in UB of the form "calling a function with an incompatible call ABI". For this particular case,B::foo
expects thef32
in an xmm register, whileA::foo
expects it in an x87 register, so the roundtrip of42.0
viafoo
will return garbage.Now, this example is not unsound, because it requires an incompatible function declaration, and unsafe code to call it - and arguably, the
unsafe
asserts that the declaration is, at least correct (note: we forbid assigning#[target_feature]
functions to function pointers and only allow enabling features and using white-listed features because of this).However, you can cause the exact same issue, but on a larger scale, by using
-C target-feature
/-C target-cpu
to change the features that a Rust crate assumes the"Rust"
ABI has, without any unsafe code:So that's the soundness bug. Safe Rust can exhibit undefined behavior of the form "calling a function with an incompatible call ABI".
This is an issue, because when
RUSTFLAGS="-C target-cpu=native"
is used, not all crates in the dependency graph are compiled with those features. In particular,libstd
and its dependencies are not recompiled at all, so their "Rust" ABI might be different than what the rest of the dependency graph uses.-C target-feature
also allows disabling features,-C target-feature/target-cpu
allow enabling/disabling features that are not white-listed (e.g. avx512f if your CPU supports it can be enabled using-C target-feature
and will be enabled using-C target-cpu=skylake
or=native
even though theavx512f
feature is unstable).How important is fixing this ? I'd say moderately important, because many features are compatible. For example, the
"sse2" "Rust"
ABI has the same calling convention as the"sse3" "Rust"
, so even though the call ABIs aren't identical, they are compatible. That is, this bug is unlikely to cause issues in practice, unless you happen to end up with two crates where the difference in target features changes the ABI.I think that rustc should definitely detect that the ABIs are incompatible, and at least, error at compile-time when this happen, explaining what went wrong, which target-features differed in an incompatible way, etc.
We could make such code work, e.g., by just treating target-features as part of the calling convention, and following the calling convention properly. I don't think fixing this would be useful for people doing
-C target-feature
globally, because when that happens, chances are that they wanted to compile their whole binary in a specific way, as opposed to having different parts of the binary somehow interoperate.It would however be useful for improving how we currently handle SIMD vectors. For example, a vector types like f64x4 will be passed in different registers (2x 128-bit or 1x 256-bit) depending on whether sse or avx are available. We currently "solve" this problem by always passing vectors to functions indirectly by memory. That is, on the caller side we spill the vector registers into the stack, create a pointer to it, pass it to the callee, and then the callee loads the content into the appropriate vectors. Since target-features are not part of the calling convention, we always do this, as opposed to, only when the calling convention is incompatible. So having target-features be part of the calling convention might be a way to improve that.