-C target-feature/-C target-cpu are unsound

gnzlbg commented 5 years ago

In a nutshell, target-features are part of the call ABI, but Rust does not take that into account, and that's the underlying issue causing #63466, #53346, and probably others (feel free to refer them here).

For example, for an x86 target without SSE linking these two crates shows the issue:

// crate A
pub fn foo(x: f32) -> f32 { x }

// crate B
extern "Rust" {
     #[target_feature(enable = "sse")]  fn foo(x: f32) -> f32;
}
unsafe { assert_eq!(foo(42.0), 42.0) }; // UB

The ABI of B::foo is "sse" "Rust" but the ABI defined in A::foo is just "Rust", no SSE, since the SSE feature is not enabled globally. That's an ABI mismatch and results in UB of the form "calling a function with an incompatible call ABI". For this particular case, B::foo expects the f32 in an xmm register, while A::foo expects it in an x87 register, so the roundtrip of 42.0 via foo will return garbage.

Now, this example is not unsound, because it requires an incompatible function declaration, and unsafe code to call it - and arguably, the unsafe asserts that the declaration is, at least correct (note: we forbid assigning #[target_feature] functions to function pointers and only allow enabling features and using white-listed features because of this).

However, you can cause the exact same issue, but on a larger scale, by using -C target-feature/-C target-cpu to change the features that a Rust crate assumes the "Rust" ABI has, without any unsafe code:

// crate A: compiled without -C target-feature
pub fn foo(x: f32) -> f32 { x }

// crate B: compiled with -C target-feature=+sse
// this is now safe Rust code:
assert_eq!(A::foo(42.0), 42.0) }; // UB

So that's the soundness bug. Safe Rust can exhibit undefined behavior of the form "calling a function with an incompatible call ABI".

This is an issue, because when RUSTFLAGS="-C target-cpu=native" is used, not all crates in the dependency graph are compiled with those features. In particular, libstd and its dependencies are not recompiled at all, so their "Rust" ABI might be different than what the rest of the dependency graph uses. -C target-feature also allows disabling features, -C target-feature/target-cpu allow enabling/disabling features that are not white-listed (e.g. avx512f if your CPU supports it can be enabled using -C target-feature and will be enabled using -C target-cpu=skylake or =nativeeven though the avx512f feature is unstable).

How important is fixing this ? I'd say moderately important, because many features are compatible. For example, the "sse2" "Rust" ABI has the same calling convention as the "sse3" "Rust", so even though the call ABIs aren't identical, they are compatible. That is, this bug is unlikely to cause issues in practice, unless you happen to end up with two crates where the difference in target features changes the ABI.

I think that rustc should definitely detect that the ABIs are incompatible, and at least, error at compile-time when this happen, explaining what went wrong, which target-features differed in an incompatible way, etc.

We could make such code work, e.g., by just treating target-features as part of the calling convention, and following the calling convention properly. I don't think fixing this would be useful for people doing -C target-feature globally, because when that happens, chances are that they wanted to compile their whole binary in a specific way, as opposed to having different parts of the binary somehow interoperate.

It would however be useful for improving how we currently handle SIMD vectors. For example, a vector types like f64x4 will be passed in different registers (2x 128-bit or 1x 256-bit) depending on whether sse or avx are available. We currently "solve" this problem by always passing vectors to functions indirectly by memory. That is, on the caller side we spill the vector registers into the stack, create a pointer to it, pass it to the callee, and then the callee loads the content into the appropriate vectors. Since target-features are not part of the calling convention, we always do this, as opposed to, only when the calling convention is incompatible. So having target-features be part of the calling convention might be a way to improve that.

Dylan-DPC-zz commented 4 years ago

Going with p-medium based on the discussion

elichai commented 4 years ago

Is this unique to Rust's ABI? or is this also a problem in C/C++?

workingjubilee commented 3 years ago

@elichai C and C++ appear to handle this mostly via using the hardware vector types in question and then passing the rest of the burden on to the programmer, or avoiding the subject. This is solved by C and C++ programmers writing sophisticated SIMD libraries that opaquely handle the hardware details, providing a higher level abstraction that other C and C++ programmers can use. Here, we can do a bit better than that.

workingjubilee commented 3 years ago

For x86 SIMD (an obvious source of compatibility concerns here), AVX512, AVX, and SSE enable the zmm, ymm, and xmm registers, which support the _m512, _m256, and _m128 data types, respectively. While their actual usage in intrinsics is slightly more complex, as far as the System V AMD64 ABI is concerned, these are mostly just bags of that many bits with corresponding alignment. The first 16 registers are aliased for x86 CPUs such that, assuming all of these registers exist, xmm{N} aliases a subset of ymm{N} aliases a subset of zmm{N}, starting at bit 0 for each. AVX512 enables another 16 registers but the rules are slightly more complicated there so I won't go into them. Actually operating on the data as if it was a certain type requires specific features enabled, but that's essentially a concern about the callee and already addressed by existing target feature rules. This soundness hole is instead about what happens at the moment the caller picks up the phone, having filled a vector with bits.

Given #81931, #35093, and other issues that basically amount to "the compiler is not generating efficient or correct code because it is not really aware of the capabilities of the target in a meaningful fashion, even though it is functionally required to use a target as an argument to compilation", I agree generically with the plan outlined here and think that the best thing to do is to teach the compiler about the reality of what it is staring down, which will allow it to both recognize ABI compatibilities and emit more efficient code.

chorman0773 commented 3 years ago

Isn't calling a function with additional enabled target_features unsafe? It seems like crossing -C target-feature bounderies in the same case, should likewise be unsafe, or, at the very least, warned against (possibly a future compat warning). If the information to track target features is available, it should also be possible to enable the correct ABI for the safe case.

In either case, the safe version is still unsound, because you've just passed into code that enables previously disabled target features. If it is not the case that they are available, then it would trap (or result in UB), just like it would if the function in the second crate was explicitly #[target_feature(enable="sse")].

Lokathor commented 3 years ago

it's not a matter of what is allowed by the CPU, an ABI mismatch will cause problems even if the CPU supports the target feature.

So the root of the problem is that you can't check at runtime if the function you're calling was compiled with non-default target features assumed or not.

bjorn3 commented 3 years ago

For extern "Rust" the abi stays the same independent of the enabled target features. Vector types are forced to be passed by-ref to avoid abi incompatibility: https://github.com/rust-lang/rust/blob/d04c3aa8656f6588c87bafafb34d51239dab98bb/compiler/rustc_middle/src/ty/layout.rs#L2881-L2906

chorman0773 commented 3 years ago

So the root of the problem is that you can't check at runtime if the function you're calling was compiled with non-default target features assumed or not

Compilers should be able to detect this (using the same, or a similar mechanism to the check for #[target_feature]), then require unsafe (and, if necessary, adjust the call abi at the call site). I was considering the same for lccc, which includes a WIP implementation of rust, and it's ABI spec. The solution I chose was to maintain the ABI as-is, use the ABI of the callee, and warn when crossing -C target-features bounderies in ways that affect ABI. The ABI issue still exists for extern"C" functions, and you're still crossing target_feature bounderies in safe code regardless.

Lokathor commented 3 years ago

@bjorn3 but the example uses f32, which isn't a vector type.

@chorman0773 well example uses extern, so the compiler that's compiling the local crate can't know if the other crate actually did it or not.

Though really this issue isn't different from saying extern "C" when you needed to say extern "stdcall". any way to make an ABI mismatch can cause these problems.

chorman0773 commented 3 years ago

well example uses extern, so the compiler that's compiling the local crate can't know if the other crate actually did it or not.

The example using extern blocks requires unsafe, anyways. However, this is safe:

// crate A: compiled with -C target-feature=-sse
pub extern"C" fn foo(x: f32) -> f32 { x }

// crate B: compiled with -C target-feature=+sse
// this is now safe Rust code:
assert_eq!(A::foo(42.0), 42.0) }; // UB

(replace extern"C" with any ABI, except extern"Rust", extern"rust-intrinsic", extern"platform-intrinsic", or extern"rust-call").

chorman0773 commented 3 years ago

My argument here is that the ABI problem shouldn't be an issue, the problem is the fact you can cross target_feature domains safely. In any case that can currently be done with safe code, the compiler should be able to adjust ABI for all versions. In the cases it cannot, you need unsafe anyways and thus are responsible for the invariants.

The only issue with my implementation choice is fn-pointer types, but, again, this is an issue with any ABI that rust implementations cannot control, and still can be used by safe code.

workingjubilee commented 3 years ago

That function does not compile when targeting the x86_64 platform.

error: <source>:2:42: in function _ZN7example3foo17hd3a315f1768df74cE float (float): SSE register return with SSE disabled

error: aborting due to previous error

If it did, it would actually violate the SystemV AMD64 C ABI, and also would violate the MSVC x64 ABI, because both of those specify that you are supposed to pass float arguments using xmm registers.

Please limit your concerns to ones that can be actualized, though if you have an example that does, I am obviously very interested.

chorman0773 commented 3 years ago

Huh (nice, llvm being smart), though would that not have applied originally to extern"Rust"? In any case, my point about crossing target_feature bounderies still applies, though. Is it not UB to enter code compiled for one target feature when that feature is not available at runtime?

chorman0773 commented 3 years ago

Update, I was able to compile a simple case in #![no_std] for the i586-unknown-linux-gnu target, with -C target-cpu=core2 and -C target-feature=-x87 in the caller crate.

The build line was RUSTFLAGS="-C target-cpu=core2" cargo rustc -Z build-std=core,compiler_builtins --target i586-unknown-linux-gnu -- -C target-feature=-x87, building bar. The caller crate, bar is:

#![no_std]
fn main() {
    foo::foo(5.0f32);
}

and the callee crate, foo, is:

#![no_std]
// Type your code here, or load an example.
pub extern"C" fn foo(f: f32) -> f32{
    f
}

From Compiler Explorer https://godbolt.org/z/75nKsaWcx, the abi of extern"C" foo differs between -C target-feature=-x87 and -C target-feature=+x87 (default).

workingjubilee commented 3 years ago

Well, the concern which uses the incompatible FPU example is on the i586 platforms that aren't +sse by default, yes, so you don't need to disable it there. So perhaps I should really be saying, "please be more precise."

Much of the solution is indeed to teach the compiler to adjust automatically during compilation where it makes sense to do that. It is not a tidy or simple feat, however, or someone would have already done it. And outlining "where it makes sense to do that" is the first step.

chorman0773 commented 3 years ago

I've fixed the example after consulting the Sys-V Intel-386 psABI document here

rust-lang / rust

-C target-feature/-C target-cpu are unsound #64609