Open RalfJung opened 9 months ago
For a hard error, I think it probably should go through RFC. I can start drafting that now. As for what it would check, I'd expect a hard error on any time a simd type is passed by value and the appropriate feature is unavailable, even if the specific ABI wouldn't necessarily pass it in registers, including inside a structure, and the lint should probably check the same thing.
That can have potentially surprising effects, e.g. when a closure captures something of type __m256
and the closure gets passed by-value to a generic function.
It also means adding adding a private field that contains by-val repr(simd)
data can be a semver-breaking change. Cc @ehuss
It also means adding adding a private field that contains by-val repr(simd) data can be a semver-breaking change
This would already be the case with replacing several fields with a SIMD type. For example, changing struct Foo(u64,u64,u64,u64);
to struct Foo(__m256i);
Although thinking about it, because features can be per-crate, this might be a problem on crates that specialize based on target features.
Perhaps #[repr(Rust)]
types should be exempted and always passed indirectly when they contain SIMD types (but not #[repr(C)]
types).
With RISC-V, you have Zfinx
and Zdinx
which have hard-floats but have moved the floats to the normal general purpose registers. In this case, forbidding the passing of floats by-value, if the f
(RISC-V's version of single-precision float) target is not present, is very restrictive and probably not what you would want. Otherwise, allowing the passing-by-value of floats between f
and Zfinx
is also UB.
But the ABI only changes when f
is present or not present, I assume? When f
is present, Zfinx
just means more instructions can be used but the ABI still says f32
is passed on float registers? And when f
is not present, softfloat and Zfinx
use the same ABI?
So that sounds like whether or not f
is present should be a different target triple. Code with vs without f
simply cannot be linked together. But having or not having Zfinx
doesn't make an ABI difference. Or am I misunderstanding?
No, that is exactly it. Do I understand correctly that you essentially want to do something similar to thumb
with the eabi
and eabihf
?
I'm not an expert on target details. :see_no_evil: Usually my purview ends when I have given precise abstract semantics to nasty pieces of unsafe Rust and MIR.^^ But every now and then terrible low-level CPU nightmares creep onto my territory and then I have to learn how to deal with them...
So, I don't know what ARM does with eabi vs eabihf. But it sounds like they are considering "hard float" and "soft float" two different ABIs that presumably use different registers for argument passing (so they are indeed different ABIs), and then have corresponding target triples to tell apart the ABIs. So yeah that sounds like what I am suggesting we should do for other targets as well. (We could also consider different ABI strings, like "C" vs "C-softfloat" or so. No idea if that makes any sense.)
Obviously we don't actually care whether the functions behind those ABIs use x87/d/f hardware instructions or do some softfloat stuff, we just care how arguments are passed. But sadly it seems that in LLVM, there's no way to say "you have the d,f
target features available but please use the softfloat ABI anyway"? If that is something people need (enable d,f
or x87,sse,sse2
on softfloat targets without affecting ABI) we should start talking with LLVM people about whether and how that could be achieved.
This may all be completely naive since I'm just coming at this with my principled approach of making everything safe or at least sound and giving it semantics at a high level; I rely on people like @chorman0773 to tell me when my ideas make no sense "in the field". ;)
If with_target_feature
calling no_target_feature
is UB due do different target features, shouldn't main
calling with_target_feature
have the same problem? just the other way around?
Or is this specific to "C" being a non Rust-ABI?
Yeah for the Rust ABI we do apply a work-around (it always passes such types in memory), so that call is fine. But the combination of extern fn
and target_feature
is a big minefield currently.
(I updated the example to actually use the Rust ABI for with_target_feature
.)
Would adding a inline(never)
Rust ABI intermediate function be a valid workaround to prevent UB here?
Unconditional code generation #[target_feature]: Note 1 making #[inline(never)]
required as otherwise the compiler may inline the function and then we would be back to the incorrect feature set.
Inlining should be aware of target features, so inline(never)
should not be needed. If it is needed that's a critical bug in the inliner that we should track and fix.
Yes such an intermediate Rust ABI function works around the problem successfully. But the fundamental problem remains with function pointers: we have to decide whether an extern "C" fn(__m256)
has the AVX ABI or some fallback ABI, without having any clue where and how the function it points to was declared. We probably want it to have the AVX ABI since people that use AVX and interface between Rust and C code likely will have their C code built in a way that it uses the AVX ABI. This means all extern "C" fn
declared in Rust that take __m256
as arguments should do this the AVX style, no matter the set of enabled target features -- or they should fail to build.
Ah, the inline
actually does make a difference here... well that's bad, I opened https://github.com/rust-lang/rust/issues/116573 to track that.
The compiler does report improper_ctypes_definitions
warnings for the given code so I guess it isn't unexpected? It would e better to have an example that doesn't trigger improper_ctypes_definitions
.
The compiler does report improper_ctypes_definitions warnings for the given code so I guess it isn't unexpected? It would e better to have an example that doesn't trigger improper_ctypes_definitions.
And that's supposed to be just a warning. We should hard error if we know everything is fucked and there's no salvaging the compilation (whether hard erroring is appropriate here, I don't know yet).
There's no C code involved here so I would argue it is reasonable to silence the lint for the example. The lint talks about FFI, but the example does no FFI, this is all Rust-to-Rust calls.
Is there an example that can trigger this with only FFI-safe types though?
I've actually suppressed that lint for __m128
on sse3 only code (so xmm registers 100% exist) that was experiencing a perf regression when the function used default ABI.
But also I wouldn't trust improper_ctypes{,_definitions}
to not get blanket suppressed in FFI heavy code, so at the very least a lint for this should be it's own thing. improper_ctypes is a clippy lint in rustc.
I don't think #[target_feature]
should ever affect the ABI of function pointers called from within them, especially not extern "C"
, ones where the C ABI is defined statically at the TU level.
If we did anything with ABI based on #[target_feature]
, it would have to be encoded in the type of the function pointers somehow. So for example, an extern "C"
function with an avx calling convention would really be extern "C+avx"
or similar.
@tmandry yeah agreed. Sadly target_feature 1.0 was stabilized in a form where #[target_feature]
does affect ABI, and core::arch
SIMD types were stabilized in a form where -C target-feature
affects ABI. That's causing this issue here and also https://github.com/rust-lang/rust/issues/116344.
@briansmith our ctypes lints seem to have enough gaps to be easily circumvented; here's a reproducer that the lint doesn't catch. It's using a closure to pass the value -- closures that capture a single field are newtypes around that field so they get the field's ABI, but the improper-ctypes lint ignores closures.
More to the point, there's no formal definition of "FFI-safe" and "FFI-unsafe" in the Rust language. As far as I can tell, the actual thing that bears the obligation to be ABI-correct is when you do
extern "C" {
fn c_func(arg: SomeType) -> ReturnType;
}
fn rust_func() -> ReturnType {
let arg = SomeType::random();
// here, the Rust caller actually discharges the obligation...
// which is honestly slightly unfortunate:
// in real scenarios, we might not be the author of the `extern "C"` block!
unsafe { c_func(arg) }
}
Do we know if clang is doing anything clever here? I assume C has exactly the same problem, since their function types don't track target features either.
clang does not fix this issue. See llvm-project#64706
That's a different issue (mismatch with GCC). This here is about mismatch within code compiled by the same compiler, but with different locally set target features.
Ah right. It does, in fact, mismatch with itself without the attribute.
(It also mismatches between -mavx
and -mno-avx/__attribute__((target("avx")))
)
Do you happen to have a godbolt link demonstrating that? :)
https://godbolt.org/z/obhezje4n
Interpreting the assembly: In the global (-mavx
) case, it passes __m256
in ymm
registers, and returns them in ymm0
. For the inactive case, it passes on the stack and returns in xmm0:xmm1
. For the locally-enabled only case, it passes on the stack, and returns in ymm0
(according to Sys-V, there shouldn't be a difference with the globally enabled case).
Edit: I've added gcc's output for context. GCC has the correct output according to the ABI. Edit: For other x86 assembler readers... yes gcc is using an integer simd load and an floating-point simd store. Yes it is correct. Yes it is mildly infuriating.
Turns out that clang is able to detect and warn at least some of these cases. Clang also allows putting __attribute__((target("avx")))
annotations at function declarations and uses them to decide whether or not to warn/error. In contrast, we allow #[target_feature]
only on definitions, not on imports (extern "C" {}
blocks).
One way to resolve this problem would be a new post-monomorphization check: traverse the argument (and return) types of the function being built (if it's a non-Rust ABI), as well as the argument and return types of every non-Rust-ABI function signature we call. If we find any repr(simd)
types in there that's not behind a pointer indirection, ensure that vectors of that size are supported by the available target features. I don't know if there's any good cross-target way to do this; if not, we'll need a big match on the target architecture and hard-code which features need to be enabled for which vector size.
This is stricter than what clang does, but it's also safer, so if we can get away with it I think this would be my preferred approach. We don't have a good central place for post-mono checks, so we'll need to figure out what is the best way to share this between the codegen backends and Miri. Also, such a check would likely be optimization-dependent, since if an ABI-incompatible function call gets optimized away we wouldn't see it post-mono. We have to do the check post-mono though since we support generic extern "C"
fn and there is no trait for "does not contain a SIMD type by-value" (and it would be a breaking change to start requiring such a trait).
(I think this is what @chorman0773 said they are planning to do in their compiler.)
Yep (in my case, right down at the codegen level - codegen-x86 would reject the function when finding parameter locations). I actually have an RFC-being-written for prescribing this behaviour at the rust level.
I actually have an RFC-being-written for prescribing this behaviour at the rust level.
Great!
Given that we currently do rather nonsensical things at the ABI level, I was going to suggest we should just implement a band-aid without waiting for an RFC. But that doesn't preclude designing something more proper. :)
One note is this text that I put
When a function is called via an fn-item type (not a function pointer type), the target features of the callee are considered. [...] Implementations of Rust would be required to ensure this call is performed properly (with parameters passed/returned in the appropriate registers), which may entail the use of a shim function or alternative code generation techniques.
This may require some extra work with llvm, or on the rustc side.
Interesting, you seem to plan to accept more code than I would. Anyway we can discuss the RFC once it is ready. :)
(Actually, I could probably open it in a less formal capacity - I think there is some pre-discussion to be had before filing a full RFC)
The following program has UB, for very surprising reasons:
The reason is that the ABI of the
__m256
type depends on the set of target features, so the caller (with_target_feature
) and callee (no_target_feature
) do not agree on how the argument should be passed. The result is a vector half-filled with junk. (The same issue also arises in the other direction, where the caller has fewer features than the callee: example here.)I'm not sure if this is currently even properly documented? We are mentioning it in https://github.com/rust-lang/rust/pull/115476 but the program above doesn't involve any function pointers so we really cannot expect people to be aware of that part of the docs. We show a general warning about the type not being FFI-compatible, but that warning shows up a lot and anyway in this case both caller and callee are Rust functions!
I think we need to do better here, but backwards compatibility might make that hard. @chorman0773 suggested we should just reject functions like
no_target_feature
that take an AVX type by-val without having declared the AVX feature. That seems reasonable; a crater run would be needed to assess whether it breaks too much code. An alternative might be to have a deny-by-default lint that very clearly explains what is happening.There are also details to work out wrt what exactly the lint should check. Newtypes around
__m256
will have the same problem. What about other larger types that contain__m256
? Behind a ptr indirection it's obviously fine, but what about(__m256, __m256)
? If we apply theScalarPair
optimization this will be passed in registers even on x86.A possible place to put the check could be somewhere around here.
In terms of process, I am not sure if an RFC is required; a t-compiler MCP might be sufficient. Currently we accept code that clearly doesn't do what it looks like it should do.
And finally -- are there any other targets (besides x86 and x86-64) that have target-features that affect the ABI? They should get the same treatment.
Note that this is different from https://github.com/rust-lang/rust/issues/116344 in two ways:
__m256
by-pointer exactly to work-around this issue.)#[target_feature]
), the other issue comes about when the user disables features (which is only possible on a per-crate level via-C
, and if you're mixing crates with different-C
flags then you're already already on very shaky grounds -- we do that with std but nobody else really gets to do that, I think).Cc @workingjubilee more ABI fun ;)