Open Kixiron opened 2 years ago
At first glance this seems useful, and we have discussed uninitialized lanes to some extent.
There's certainly no consensus on what element types should be valid, but I believe the most common position is that it should only be number types and pointers. We haven't formalized this, but that would effectively mean that SimdElement
implies all bit patterns are valid.
If that's the case, I think MaybeUninit
would serve little purpose. You could have uninitialized lanes, but I don't think there would be any performance benefit vs arbitrarily initialized lanes, since most (if not all? not sure about RISC V V) instruction sets operate on all lanes, regardless. You could still use vectors for enums, but the underlying type would be integers.
Pinging @programmerjake, our resident unusual element type proponent
Uninitialized data is a little more subtle than just assuming all bit patterns are valid, if that were the case then we'd never need MaybeUninit<u8>
for anything
imho having Simd<MaybeUninit<T>, N>
would be quite useful.
an example (check for matched parenthesis in read input bytes, accounting for shorter inputs):
const LANES: usize = 64;
// safe because the array element type MaybeUninit<u8> can be uninit
let mut buf: [MaybeUninit<u8>; LANES] = unsafe { MaybeUninit::uninit().assume_init() };
let res = unsafe { libc::read(fd, buf.as_mut_ptr().cast::<c_void>(), LANES) };
if res < 0 {
todo!("error handling");
}
let len = res as usize;
const INDEXES: [usize; LANES] = {
let mut retval = [0; LANES];
let mut i = 0;
while i < LANES {
retval[i] = i;
i += 1;
}
retval
};
let vec: Simd<MaybeUninit<u8>, LANES> = buf.into();
let lparens: Mask<MaybeUninit<i8>, LANES> = vec.lanes_eq(Simd::splat(MaybeUninit::new(b'(')));
let rparens: Mask<MaybeUninit<i8>, LANES> = vec.lanes_eq(Simd::splat(MaybeUninit::new(b')')));
// 1 for every `(`, `-1` for every `)`, `0` otherwise:
let byte_values: Simd<MaybeUninit<i8>, LANES> = rparens.to_int() - lparens.to_int();
let mask = Simd::from(INDEXES).lanes_lt(Simd::splat(len));
let byte_values = mask.select(byte_values, Simd::splat(MaybeUninit::new(0)));
let sum: MaybeUninit<u8> = byte_values.reduce_sum();
let sum = unsafe { sum.assume_init() };
assert_eq!(sum, 0, "mismatched parenthesis");
and yes, MaybeUninit
is needed if you want uninitialized vector elements because the compiler treats uninitialized as something different than any normal value you could put in T
:
e.g. (to be clear, this is UB because a
in f
is uninitialized):
https://play.rust-lang.org/?version=stable&mode=release&edition=2021&gist=c7ebaea680ecad3f5545cff19bc5ff61
use std::mem::MaybeUninit;
unsafe fn f(v: &mut u64, v2: &mut u64) {
let a: u64 = MaybeUninit::uninit().assume_init();
*v = a;
*v2 = a;
}
fn main() {
let mut a = 0;
let mut b = 1;
unsafe { f(&mut a, &mut b); }
assert_eq!(a, b);
}
currently prints:
thread 'main' panicked at 'assertion failed: `(left == right)`
left: `0`,
right: `1`', src/main.rs:13:5
which isn't something that could happen for any normal value of a
in f
.
@Kixiron What we're finding is that a lot of vector ISAs offer trivial zeroing for predicated loads (which Rust doesn't currently support, but we can add) and predicated gathers (which are mandatory), so while we're investigating this use-case with interest, we're struggling to come up with real examples that will profit a lot from this in terms of actual compilation results. Is there something we're missing here?
@workingjubilee
one other note: Simd<MaybeUninit<T>, N>
matches LLVM IR's internal semantics, so requiring zeroing everywhere instead of undef
will make it much harder for LLVM to optimize due to all the extra select
ops.
There's also no way to emulate things like ARM's svldff1
instruction without MaybeUninit
, meaning that things like this (Implementing 'strlen' using SVE, Blog Post) would be impossible to implement in Rust
While I agree that particular operation produces uninit lanes--that also doesn't look to me like anything that's particularly easy to implement portably (I'm not even sure if it's possible).
While I agree that particular operation produces uninit lanes--that also doesn't look to me like anything that's particularly easy to implement portably (I'm not even sure if it's possible).
the portable version would be something like:
pub unsafe fn strided_load_fault_first<T, const MAXVL: usize>(
ptr: *const T,
stride_in_bytes: isize, // can be negative
mask: Mask<T, MAXVL>,
mut vl: usize,
) -> (Simd<MaybeUninit<T>, MAXVL>, usize) {
let mut retval = Simd::<MaybeUninit<T>, MAXVL>::uninit();
assert!(vl <= MAXVL);
for i in 0..vl {
let element_ptr = ptr
.cast::<u8>()
.wrapping_offset(i * stride_in_bytes)
.cast::<MaybeUninit<T>>();
let mut stop = mask.test(i) && read_faults(element_ptr);
// nondeterministic() is always true on arches without fault first,
// avoiding the need for read_faults
stop |= nondeterministic();
stop &= i != 0;
if stop {
vl = i;
break;
}
if mask.test(i) {
retval[i] = ptr::read_unaligned(element_ptr);
}
}
(retval, vl)
}
We have to be very careful with exposing LLVM's poison-propagating arithmetic semantics. Right now the soundness of Rust's compilation to LLVM relies on the fact that a UB-free Rust program will never have an LLVM poison value anywhere ever. This is crucial for us since LLVM treats poison on a per-value level and Rust's initialization tracking is on a per-byte level, so if a poison ever happens it could "infect" neighboring bytes in ways that are unsound.
So, doing anything like this is unfortunately blocked on LLVM having a "byte" type bN
of arbitrary size that can represent bytewise poison semantics.
And that's leaving aside my own more subjective objections to poison-style arithmetic -- I could be convinced but I need solid motivating examples. Also see the Zulip discussion here.
Vectors of
MaybeUninit
values and operations over them (eg.Simd<MaybeUninit<T>, N> + Simd<MaybeUninit<T>, N>
) would be incredibly useful for more advanced struct of array operations. This is mostly inspired by how Futhark handles sum types and is something that I actually have a use case for. The basic premise is that we hold and perform operations over potentially uninit values but only keep the values that are defined. A more concrete case of this would be in a struct of arrayOption
typeThis allows efficiently operating over SoA formed enums, which would be amazing