rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
97.96k stars 12.69k forks source link

simd_insert and simd_extract allow garbage data #77477

Closed workingjubilee closed 8 months ago

workingjubilee commented 4 years ago

It appears that with simd_insert and simd_extract that I can produce garbage data in a way that is probably due to unsound OOB memory access. These are unsafe functions but the related simd_shuffle functions fail to monomorphize. Miri provokes an ICE. I thiiiink simd_extract and simd_insert might not require const arguments on purpose, but I believe something may need to be patched re: Miri. cc @RalfJung

I was in the middle of constructing tests for rustc's simd intrinsics. I tried this code: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=dfc24d97ffa77e6fbd4a65c16b713cf9

#![allow(non_camel_case_types)]
#![feature(repr_simd, platform_intrinsics)]

#[repr(simd)]
#[derive(Copy, Clone, Debug)]
struct f32x4(f32, f32, f32, f32);

extern "platform-intrinsic" {
    pub fn simd_insert<T, E>(x: T, idx: u32, y: E) -> T;
    pub fn simd_extract<T, E>(x: T, idx: u32) -> E;
}

fn main() {
    let x = f32x4(-1.0, 0.0, f32::INFINITY, f32::NAN);
    unsafe {
        let ins: f32x4 = simd_insert(x, 5, f32::NEG_INFINITY);
        let ext: f32 = simd_extract(x, 9);
        println!("{:?}", x);   // f32x4(-1.0, 0.0, inf, NaN)
        println!("{:?}", ins); // f32x4(0.000000000000000000000000000000000000000000001, 0.0,
                               // 12499248000000000.0, 0.000000000000000000000000000000000000000045915)
        println!("{}", ext);   // 0.000000000000000000000000000000000000000030658
    }
}

I (perhaps overly naively) expected to see this happen: "failure to monomorphize because blah blah blah" Instead, this happened: I got some totally wild garbage data!

rustc --version:

rustc 1.48.0-nightly (ef663a8a4 2020-09-30) running on x86_64-unknown-linux-gnu

The Miri ICE:

thread 'rustc' panicked at 'Index `5` must be in bounds of vector type `f32`: `[0, 4)`', /rustc/ef663a8a48ea6b98b43cbfaefd99316b36b16825/compiler/rustc_mir/src/interpret/intrinsics.rs:393:17
tesuji commented 4 years ago

@rustbot modify labels: requires-nightly

RalfJung commented 4 years ago

Yeah the Miri ICE should definitely be fixed.

I guess the question is whether these intrinsics should fail to monomorphize, or whether using them with OOB indices is UB. I assume the latter, but it would be good to get someone to confirm... who would know about SIMD stuff?

Also, this UB should be added to the docs for those intrinsics, probably in stdarch.

bjorn3 commented 4 years ago

simd_shuffle checks if the indices are in bound: https://github.com/rust-lang/rust/blob/6f56fbdc1c58992a9db630f5cd2ba9882d32e84b/compiler/rustc_codegen_llvm/src/intrinsic.rs#L882-L902

The simd_insert and simd_extract intrinsics are codegened at https://github.com/rust-lang/rust/blob/6f56fbdc1c58992a9db630f5cd2ba9882d32e84b/compiler/rustc_codegen_llvm/src/intrinsic.rs#L915-L938

bjorn3 commented 4 years ago

@rustbot modify labels: +A-simd +A-codegen +T-compiler

RalfJung commented 4 years ago

The insert and extract intrinsics do not even have rustc_args_required_cons, so they cannot be checked at monmorphization time. Thus UB is likely the only option.

bjorn3 commented 4 years ago

That is just an oversight in stdarch. All users of it are constant. (either fixed or using the constify family of macros)

RalfJung commented 4 years ago

Well, but codegen does not seem to rely on them being constants either, so why would we require that?

bjorn3 commented 4 years ago

While LLVM allows variable indexes, it will generate way more efficient code when indexes are known at compile time. https://godbolt.org/z/zvorEa Other codegen backends, like cg_clif, may also not allow variable indexes.

RalfJung commented 4 years ago

Sure, having more efficient code when some things are statically known is expected.

Other codegen backends, like cg_clif, may also not allow variable indexes.

I guess this is a question for one or several of the Rust teams then, whether it is reasonable to restrict these intrinsics to compile-time known indices even though supporting run-time indices is possible (and doesn't seem too hard, judging from what LLVM generates).

We can either

Cc @rust-lang/project-portable-simd -- this is not really about portable SIMD but hopefully still reaches the right people.

bjorn3 commented 4 years ago

add rustc_args_required_cons as well as post-monomorphization bounds checks (similar to the shuffle intrinsics), or

rustc_args_required_const is only applied to the extern "platform-intrinsic" definition. It doesn't require any change in the compiler. This means that it is perfectly fine for stdarch to use it and stdsimd to not use it for example. If we do choose to always require it to be const in the compiler itself, it would be possible to change the post-monomorphization error to an error in the intrinsic checker.

RalfJung commented 4 years ago

It doesn't require changes to the compiler but, AFAIK, it is only usually added when the compiler requires these constants, and actively exploits that for type checking and/or codegen.

nagisa commented 4 years ago

I don't see a problem with restricting the intrinsics to constant indices now and implementing the necessary code to verify the indices are in bounds. GCC for example has a similar restriction. Once there's an actual known use-case for non-constant indices in these operations we could consider relaxing the operation (while also implementing bound checking similar to one we do when indexing into slices today).

RalfJung commented 4 years ago

@jyn514 why did you mark this as I-unsound? Many intrinsics are unsafe to use, that does not make them unsound.

@nagisa

Once there's an actual known use-case for non-constant indices in these operations we could consider relaxing the operation (while also implementing bound checking similar to one we do when indexing into slices today).

I'd expect the intrinsic to be unchecked, and OOB indexing to be UB -- that is also the case, on the MIR level, with slice indexing today. Bounds checks are added during MIR construction.

jyn514 commented 4 years ago

Sorry, I saw 'unsound' in the message description and wasn't thinking.

bjorn3 commented 4 years ago

None of the platform-intrinsics are fundamentally unsafe to use. Safe intrinsics just didn't exist when they were introduced. I think the original plan was even to directly expose all platform-intrinsics to the user. There are several tests that invalid usage of them give nice compilation errors.

camelid commented 4 years ago

Assigning P-medium and removing I-prioritize as discussed in the prioritization working group.

RalfJung commented 4 years ago

None of the platform-intrinsics are fundamentally unsafe to use. Safe intrinsics just didn't exist when they were introduced. I think the original plan was even to directly expose all platform-intrinsics to the user. There are several tests that invalid usage of them give nice compilation errors.

Not for these two though it seems...

And indeed I don't think there is precedent for having such an intrinsic be checked. Instead, what we usually do is expose a safe function to the user which first does the check and then calls the unsafe intrinsic.

AFAIK none of the other SIMD intrinsics have any reasonable chance of causing UB (they all just operate on pure values), except for the shuffle intrinsics -- which however require statically known indices to even perform code generation. So that does not tell us anything about the intended behavior of simd_insert and sim_extract. Or are there other intrinsics that could cause UB but have some checks applied to them to avoid that?

workingjubilee commented 4 years ago

I think this is Rust unsoundness and not just unsound calling code if the intent is that it is not supposed to break in this particular manner.

Cc @rust-lang/project-portable-simd -- this is not rally about portable SIMD but hopefully still reaches the right people.

:telephone_receiver: Hello! Did you know Arm assigns very different meanings in terms of operations to "insert" and "extract" as concepts? Arm names these intrinsics as "vset" and "vget", with "vext" being more like a shuffle or interleaving operation, but the LLVM intrinsics are based on the Intel conceptualization (which Arm does give an honorable mention to in their documentation), so I actually took a long moment to figure out which one was in use because I had been reading about Arm for the entire past ~2 weeks, and frankly Neon makes more sense to me than SSE, so far.

Right, where were we? I believe simd_extract and simd_insert as implemented by the Rust compiler are intended to mimic simd_shuffleN in all regards on this matter, because the intrinsics that these are expected to compile to do require constants, and so it is unexpected behavior to compile to the dynamic extraction.

RalfJung commented 4 years ago

Well, looks like the experts agree they should be constant. Fine for me, I was just trying to help explore the design space. :)

So looks like the fix here is to add rustc_args_required_const to these functions and add compile-time checks similar to the shuffle intrinsics? (I am using the names rustc uses here as I know basically nothing about the assembly instructions these compile to.)

programmerjake commented 4 years ago

I'll note that LLVM's language reference states that the element index is specifically allowed to be a variable: https://llvm.org/docs/LangRef.html#extractelement-instruction

The first operand of an ‘extractelement’ instruction is a value of vector type. The second operand is an index indicating the position from which to extract the element. The index may be a variable of any integer type.

Lokathor commented 4 years ago

Yeah, likely the codegen will just be different in the case of a variable index is all.

RalfJung commented 4 years ago

https://github.com/rust-lang/rust/issues/70271 is somewhat related.

RalfJung commented 3 years ago

rustc_args_required_const is gone, but of course we can still make these intrinsics require constants as arguments like we do for simd_shuffle.

RalfJung commented 3 years ago

However, stdarch relies quite heavily on being able to pass non-const values to simd_extract in this macro used to generate the {u,i}NxM types.

So we'd have to rearrange things quite a bit if we wanted to enforce the simd_extract/simd_insert arguments to be a constant.

RalfJung commented 3 years ago

There are also 17 functions like this

pub unsafe fn _mm256_extract_epi64<const INDEX: i32>(a: __m256i) -> i64 {
    static_assert_imm2!(INDEX);
    simd_extract(a.as_i64x4(), INDEX as u32)
}

Due to the cast, this isn't just "forwarding" the const generic parameter, so we'd either need const_evaluatable_checked or again use an associated-const-based trick.

RalfJung commented 3 years ago

I don't think we should enable an incomplete feature like const_evaluatable_checked for this.

So how do people here feel about using a macro like this also for simd_extract/simd_insert in stdarch? Currently that seems to be the most realistic way forward to ensuring that these arguments are compile-time constants. Cc @Amanieu

Amanieu commented 3 years ago

I feel that these changes are quite intrusive and I would rather avoid them if possible. Could we instead force a const-evaluation on the compiler side where necessary?

AFAIK platform intrinsics are not meant to be directly exposed to user, so this shouldn't be a big issue.

RalfJung commented 3 years ago

Could we instead force a const-evaluation on the compiler side where necessary?

How'd that be different from promotion (which we just -- finally -- got rid of in this context)?

RalfJung commented 3 years ago

Longer-term (when const_evaluatable_checked becomes more stable, or alternatives arise), we could probably do something like the "legacy const arg" attribute for intrinsics, and rewrite simd_extract(x, N) to simd_extract::<N>(x).

Though given that this is an internal-only API, at that point it might make more sense to just change the code to simd_extract::<N>(x).

Amanieu commented 3 years ago

It feels somewhat silly to me that the compiler supports arbitrary constant expressions if you wrap them in a complicated dance of associated constants but not if you just write them directly in-line. Would using inline consts work here?

RalfJung commented 3 years ago

Oh, that part.

Inline consts are supposed to be able to use generics from the environment some day, yes. Different people disagree about whether that should be subject to const_evaluatable_checked constraints or not.

RalfJung commented 3 years ago

Also, FWIW, even if inline consts supported using generics, that would still not be enough any more once simd_insert/simd_extract themselves were ported to use a const generic: one cannot even use associated consts as const generic paramaters (as opposed to doing something like simd_shuffle where the argument merely has to be a const item, so associated consts are allowed).

Lokathor commented 3 years ago

is this "can't" a "you can't do it yet" or a "you can't ever do it, it's theoretically impossible"

RalfJung commented 3 years ago

It's certainly a "can't do it yet"; no idea what the long-term plans of @rust-lang/project-const-generics are here.

RalfJung commented 3 years ago

But all this means is that until this is resolved, we cannot make the simd_ intrinsics use const-generics. That's really more a topic for https://github.com/rust-lang/rust/issues/85229.

For this issue, we probably should focus on making simd_insert/simd_extract more like simd_shuffle. So, no const generics, but forcing the argument to be a const item. This means associated constants work, so we can use tricks like this -- but using inline consts here will require support for generics (which the original RFC excluded).

RalfJung commented 8 months ago

It turns out that simd_extract is used by stdarch with non-constant indices:

https://github.com/rust-lang/stdarch/blob/205b3a1de4f1624a42cd6557d96dfe6ab6f0c2e0/crates/core_arch/src/powerpc/altivec.rs#L3618-L3625

    #[simd_test(enable = "altivec")]
    unsafe fn test_vec_lde_u16() {
        let pat = [u16x8::new(0, 1, 2, 3, 4, 5, 6, 7)];
        for off in 0..8 {
            let v: u16x8 = transmute(vec_lde(off * 2, pat.as_ptr() as *const u8));
            assert_eq!(off as u16, v.extract(off as _));
        }
    }

v.extract here is just a wrapper around simd_extract.

@Amanieu @workingjubilee what shall we do with that?

calebzulawski commented 8 months ago

Does std::arch have internal functions for casting to arrays? That seems like a reasonable alternative, especially in a test.

RalfJung commented 8 months ago

I haven't seen such functions. For now I've changed it to use ptr arithmetic.