rust-lang / portable-simd

The testing ground for the future of portable SIMD in Rust
Apache License 2.0
896 stars 80 forks source link

Scatter into `&mut [MaybeUninit<scalar>]`s instead of `&mut [scalar]` #205

Open fee1-dead opened 2 years ago

fee1-dead commented 2 years ago

Title. The current API only allows scattering to initialized buffers, but it shouldn't matter whether the buffer is actually initialized.

workingjubilee commented 2 years ago

This is a reasonable request but I can already see the "scatter should allow MyPersonalUnionType<T: SimdElement>" request that would be coming on this one's heels, and I am both not aware of a principled reason to refuse that but also not sure how to provide it at the moment. It could be done by exposing the Vector of Raw Pointers API, but that doesn't merely require a bikeshed to be painted. Rather, we are apparently commissioning an elaborate mural.

Hrm... Is this a safe function? It seems it has to be, as it fulfills the relevant validity invariant for an ambiguously-typed memory location, so we aren't accessing an invalid value, we're describing a valid value, and that is safe unless done through *mut T, which is only unsafe because it is dereferencing a raw pointer which doesn't know if it's obtained validly. Yes? So the gather equivalent may be unsafe, but the scatter would not be, although the current implementation wraps some very unsafe spooky actions.

Open to ideas. Maybe a BorrowMut constraint?

thomcc commented 2 years ago

We discussed this elsewhere, but to update this issue: MaybeUninit is a #[repr(transparent)] union (unlike anything from users), and is special in certain ways.

IMO we should support this for MaybeUninit<T>, and not for arbitrary user types or at least, the request to support it for user types would be a separate request (one which personally I'd probably be against).

workingjubilee commented 2 years ago

To elaborate a little, my current understanding: Scatter functionally is a vectorization of

unsafe { u.as_mut_ptr().write(v); }

which requires the union to be #[repr(transparent)] or possibly #[repr(C)], and preferably one with exactly equal size to the T in question. Anything validly writable would have to just be soundly transmutable to [MaybeUninit<T>], in essence. So that settles "why this and not something else": any type that could be valid to write to must establish it can be reinterpreted as one of these two types anyways.

Allowing it with anything that doesn't fulfill that requires us to either make the function unsafe or somehow vectorize

*u = Union::new(val);

which either is borderline impossible or ruins the point of a safe scatter, and either path seems like a fully-automatic footgun firing a million rounds per second. Instead, the burden of proving such a transmutation is sound should be fully on the type's creator and them upholding their invariants correctly, not the scatter function. Eventually, we will probably offer such an unsafe function with some kind of SimdPtr, but not today. Until then, programmers can probably just use a #[repr(C)] union with [MaybeUninit<T>; N] or their own transmutes to avail themselves of this function.

Now, to my mind, the only question is if we should even have the scatter_* methods on Simd, or if we should add these methods to [MaybeUninit<T>] directly, or express it as a bound (let's not, this time), or if we should allow irregularity, or what.

Lokathor commented 2 years ago

probably simd_val.scatter_to(&mut slice, simd_indexes), which ends up grouping the method logically with simd values, rather than slices or something where docs get lost.

calebzulawski commented 2 years ago

I'm confused where this discussion is going... You can just use MaybeUninit::slice_as_mut_ptr and scatter as usual. Once we support vectors of pointers even the user could do this, but we probably want to provide it for convenience.

workingjubilee commented 2 years ago

Uhh, scatter_select_unchecked doesn't allow using *mut T, just &mut [T], but I will take this as a vote in favor of changing the type signature to use such.

EDIT: I don't think this is actually a good idea though, having reviewed my own code.

calebzulawski commented 2 years ago

Yeah, I meant internally. It seemed to me like you were suggesting it's unsound to scatter to MaybeUninit. Might be worth seeing what happens with the function signatures when we have vectors of pointers since that will probably change things up anyway.

thomcc commented 2 years ago

Uhh, scatter_select_unchecked doesn't allow using *mut T, just &mut [T], but I will take this as a vote in favor of changing the type signature to use such.

Seconding this vote.

For the lowest level APIs unsafe APIs like this, using pointers is both more flexible and less error prone. Not just cases like this, but &mut [T] requires exclusive unique access to the whole memory range — however an scatter_select_unchecked based on *mut T, that restriction goes away (ofc, I still have to ensure nobody has a borrow out on any of the areas that will actually get written to, but there are a lot of situations where that's fine and the first would not be).

programmerjake commented 2 years ago

For the lowest level APIs unsafe APIs like this, using pointers is both more flexible and less error prone.

I'm hoping this will end up being the second (or third) lowest level API -- lowest level should be:

impl<T, const N: usize> Simd<*mut T, N> {
    // basically 1:1 with llvm intrinsic
    pub unsafe fn scatter_with_align<const ALIGN: usize>(self, values: Simd<T, N>, mask: Mask<T, N>) { ... }
}
calebzulawski commented 1 year ago

With #315 merged, scatter to MaybeUninit is supported by scattering to arbitrary pointers. Since &mut [MaybeUninit<T>] inherently requires some unsafe anyway, is this sufficient?