Closed workingjubilee closed 8 months ago
@rustbot modify labels: requires-nightly
Yeah the Miri ICE should definitely be fixed.
I guess the question is whether these intrinsics should fail to monomorphize, or whether using them with OOB indices is UB. I assume the latter, but it would be good to get someone to confirm... who would know about SIMD stuff?
Also, this UB should be added to the docs for those intrinsics, probably in stdarch.
simd_shuffle
checks if the indices are in bound: https://github.com/rust-lang/rust/blob/6f56fbdc1c58992a9db630f5cd2ba9882d32e84b/compiler/rustc_codegen_llvm/src/intrinsic.rs#L882-L902
The simd_insert
and simd_extract
intrinsics are codegened at https://github.com/rust-lang/rust/blob/6f56fbdc1c58992a9db630f5cd2ba9882d32e84b/compiler/rustc_codegen_llvm/src/intrinsic.rs#L915-L938
@rustbot modify labels: +A-simd +A-codegen +T-compiler
The insert and extract intrinsics do not even have rustc_args_required_cons
, so they cannot be checked at monmorphization time. Thus UB is likely the only option.
That is just an oversight in stdarch. All users of it are constant. (either fixed or using the constify family of macros)
Well, but codegen does not seem to rely on them being constants either, so why would we require that?
While LLVM allows variable indexes, it will generate way more efficient code when indexes are known at compile time. https://godbolt.org/z/zvorEa Other codegen backends, like cg_clif, may also not allow variable indexes.
Sure, having more efficient code when some things are statically known is expected.
Other codegen backends, like cg_clif, may also not allow variable indexes.
I guess this is a question for one or several of the Rust teams then, whether it is reasonable to restrict these intrinsics to compile-time known indices even though supporting run-time indices is possible (and doesn't seem too hard, judging from what LLVM generates).
We can either
rustc_args_required_cons
as well as post-monomorphization bounds checks (similar to the shuffle intrinsics), orCc @rust-lang/project-portable-simd -- this is not really about portable SIMD but hopefully still reaches the right people.
add rustc_args_required_cons as well as post-monomorphization bounds checks (similar to the shuffle intrinsics), or
rustc_args_required_const
is only applied to the extern "platform-intrinsic"
definition. It doesn't require any change in the compiler. This means that it is perfectly fine for stdarch to use it and stdsimd to not use it for example. If we do choose to always require it to be const in the compiler itself, it would be possible to change the post-monomorphization error to an error in the intrinsic checker.
It doesn't require changes to the compiler but, AFAIK, it is only usually added when the compiler requires these constants, and actively exploits that for type checking and/or codegen.
I don't see a problem with restricting the intrinsics to constant indices now and implementing the necessary code to verify the indices are in bounds. GCC for example has a similar restriction. Once there's an actual known use-case for non-constant indices in these operations we could consider relaxing the operation (while also implementing bound checking similar to one we do when indexing into slices today).
@jyn514 why did you mark this as I-unsound? Many intrinsics are unsafe to use, that does not make them unsound.
@nagisa
Once there's an actual known use-case for non-constant indices in these operations we could consider relaxing the operation (while also implementing bound checking similar to one we do when indexing into slices today).
I'd expect the intrinsic to be unchecked, and OOB indexing to be UB -- that is also the case, on the MIR level, with slice indexing today. Bounds checks are added during MIR construction.
Sorry, I saw 'unsound' in the message description and wasn't thinking.
None of the platform-intrinsics
are fundamentally unsafe to use. Safe intrinsics just didn't exist when they were introduced. I think the original plan was even to directly expose all platform-intrinsics
to the user. There are several tests that invalid usage of them give nice compilation errors.
Assigning P-medium
and removing I-prioritize
as discussed in the prioritization working group.
None of the platform-intrinsics are fundamentally unsafe to use. Safe intrinsics just didn't exist when they were introduced. I think the original plan was even to directly expose all platform-intrinsics to the user. There are several tests that invalid usage of them give nice compilation errors.
Not for these two though it seems...
And indeed I don't think there is precedent for having such an intrinsic be checked. Instead, what we usually do is expose a safe function to the user which first does the check and then calls the unsafe intrinsic.
AFAIK none of the other SIMD intrinsics have any reasonable chance of causing UB (they all just operate on pure values), except for the shuffle intrinsics -- which however require statically known indices to even perform code generation. So that does not tell us anything about the intended behavior of simd_insert and sim_extract. Or are there other intrinsics that could cause UB but have some checks applied to them to avoid that?
I think this is Rust unsoundness and not just unsound calling code if the intent is that it is not supposed to break in this particular manner.
Cc @rust-lang/project-portable-simd -- this is not rally about portable SIMD but hopefully still reaches the right people.
:telephone_receiver: Hello! Did you know Arm assigns very different meanings in terms of operations to "insert" and "extract" as concepts? Arm names these intrinsics as "vset" and "vget", with "vext" being more like a shuffle or interleaving operation, but the LLVM intrinsics are based on the Intel conceptualization (which Arm does give an honorable mention to in their documentation), so I actually took a long moment to figure out which one was in use because I had been reading about Arm for the entire past ~2 weeks, and frankly Neon makes more sense to me than SSE, so far.
Right, where were we? I believe simd_extract
and simd_insert
as implemented by the Rust compiler are intended to mimic simd_shuffleN
in all regards on this matter, because the intrinsics that these are expected to compile to do require constants, and so it is unexpected behavior to compile to the dynamic extraction.
Well, looks like the experts agree they should be constant. Fine for me, I was just trying to help explore the design space. :)
So looks like the fix here is to add rustc_args_required_const
to these functions and add compile-time checks similar to the shuffle intrinsics? (I am using the names rustc uses here as I know basically nothing about the assembly instructions these compile to.)
I'll note that LLVM's language reference states that the element index is specifically allowed to be a variable: https://llvm.org/docs/LangRef.html#extractelement-instruction
The first operand of an ‘extractelement’ instruction is a value of vector type. The second operand is an index indicating the position from which to extract the element. The index may be a variable of any integer type.
Yeah, likely the codegen will just be different in the case of a variable index is all.
https://github.com/rust-lang/rust/issues/70271 is somewhat related.
rustc_args_required_const
is gone, but of course we can still make these intrinsics require constants as arguments like we do for simd_shuffle
.
However, stdarch relies quite heavily on being able to pass non-const
values to simd_extract
in this macro used to generate the {u,i}NxM
types.
So we'd have to rearrange things quite a bit if we wanted to enforce the simd_extract
/simd_insert
arguments to be a constant.
There are also 17 functions like this
pub unsafe fn _mm256_extract_epi64<const INDEX: i32>(a: __m256i) -> i64 {
static_assert_imm2!(INDEX);
simd_extract(a.as_i64x4(), INDEX as u32)
}
Due to the cast, this isn't just "forwarding" the const generic parameter, so we'd either need const_evaluatable_checked
or again use an associated-const-based trick.
I don't think we should enable an incomplete feature like const_evaluatable_checked
for this.
So how do people here feel about using a macro like this also for simd_extract
/simd_insert
in stdarch? Currently that seems to be the most realistic way forward to ensuring that these arguments are compile-time constants. Cc @Amanieu
I feel that these changes are quite intrusive and I would rather avoid them if possible. Could we instead force a const-evaluation on the compiler side where necessary?
AFAIK platform intrinsics are not meant to be directly exposed to user, so this shouldn't be a big issue.
Could we instead force a const-evaluation on the compiler side where necessary?
How'd that be different from promotion (which we just -- finally -- got rid of in this context)?
Longer-term (when const_evaluatable_checked
becomes more stable, or alternatives arise), we could probably do something like the "legacy const arg" attribute for intrinsics, and rewrite simd_extract(x, N)
to simd_extract::<N>(x)
.
Though given that this is an internal-only API, at that point it might make more sense to just change the code to simd_extract::<N>(x)
.
It feels somewhat silly to me that the compiler supports arbitrary constant expressions if you wrap them in a complicated dance of associated constants but not if you just write them directly in-line. Would using inline consts work here?
Oh, that part.
Inline consts are supposed to be able to use generics from the environment some day, yes. Different people disagree about whether that should be subject to const_evaluatable_checked
constraints or not.
Also, FWIW, even if inline consts supported using generics, that would still not be enough any more once simd_insert
/simd_extract
themselves were ported to use a const generic: one cannot even use associated consts as const generic paramaters (as opposed to doing something like simd_shuffle
where the argument merely has to be a const
item, so associated consts are allowed).
is this "can't" a "you can't do it yet" or a "you can't ever do it, it's theoretically impossible"
It's certainly a "can't do it yet"; no idea what the long-term plans of @rust-lang/project-const-generics are here.
But all this means is that until this is resolved, we cannot make the simd_
intrinsics use const-generics. That's really more a topic for https://github.com/rust-lang/rust/issues/85229.
For this issue, we probably should focus on making simd_insert
/simd_extract
more like simd_shuffle
. So, no const generics, but forcing the argument to be a const
item. This means associated constants work, so we can use tricks like this -- but using inline consts here will require support for generics (which the original RFC excluded).
It turns out that simd_extract is used by stdarch with non-constant indices:
#[simd_test(enable = "altivec")]
unsafe fn test_vec_lde_u16() {
let pat = [u16x8::new(0, 1, 2, 3, 4, 5, 6, 7)];
for off in 0..8 {
let v: u16x8 = transmute(vec_lde(off * 2, pat.as_ptr() as *const u8));
assert_eq!(off as u16, v.extract(off as _));
}
}
v.extract
here is just a wrapper around simd_extract
.
@Amanieu @workingjubilee what shall we do with that?
Does std::arch have internal functions for casting to arrays? That seems like a reasonable alternative, especially in a test.
I haven't seen such functions. For now I've changed it to use ptr arithmetic.
It appears that with simd_insert and simd_extract that I can produce garbage data in a way that is probably due to unsound OOB memory access. These are unsafe functions but the related simd_shuffle functions fail to monomorphize. Miri provokes an ICE. I thiiiink simd_extract and simd_insert might not require const arguments on purpose, but I believe something may need to be patched re: Miri. cc @RalfJung
I was in the middle of constructing tests for rustc's simd intrinsics. I tried this code: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=dfc24d97ffa77e6fbd4a65c16b713cf9
I (perhaps overly naively) expected to see this happen: "failure to monomorphize because blah blah blah" Instead, this happened: I got some totally wild garbage data!
rustc --version
:The Miri ICE: