Open workingjubilee opened 4 years ago
In Zulip, I said:
So,
simd_extract
andsimd_insert
allow you to mess with individual values in a vector. If the index is known constantly, this is faster / better codegen, given that many architectures cough x86 cough have instructions that prefer immediates here. Per our approach to shuffles, we have established a generic desire to want to allow developers to be confident they are setting const values when this matters, and thus to expose const vs. dynamic APIs.I kind of want to use
fn get
andfn set
as the methods to do this for our vector types, becausethat's the term Arm NEON uses"extract" is a bit ambiguous with some possible shuffles we might want to offer and overall I think it's a bit cleaner than "insert", but this admittedly results in#[test] fn simple_insert() { let a = SimdF32::from_array([-3.0, 5.0, 2.0, -0.0]); let a = a.set::<2>(9.1e3); assert_eq!(a.to_array(), [-3.0, 5.0, 9.1e3, -0.0]); }
which is kind of an unexpected pattern.
There was some contention that this might not be needed and normal indexing operations would be fine. I would like codegen tests that prove that, one way or another. I am also skeptical this is going to be true of all codegen backends, but I could be persuaded that even the most naïve const folding algorithm (and Rust doesn't really work without at least some) would succeed.
There was some discussion over the weirdness of this API, but we decided "it looks weird and that's good". You should not be overusing single-element manipulations in a SIMD context, that kind of is against the whole point. So the only real weight against it is the codegen question.
I definitely think the concerns around constant vs dynamic apis hold here too, especially given that normal indexing will require way more work for the compiler to get rid of (because we have to hand out a &mut Elem
it needs to eliminate all the code spilling the value to the stack, taking the address of a sub-part, the bounds check, etc).
That said, the situation isn't nearly as dire as with shuffles, and this stuff is likely not common in really hot simd code (since it's dealing with individual values.
As to the color of the bikeshed, personally, I guess I'd prefer insert
to set
since it's not mutation, but #[must_use]
probably avoids most concerns.
My hidden talent is breaking APIs! It's awesome. I discovered problems in simd_insert and simd_extract that I documented in https://github.com/rust-lang/rust/issues/77477! I should probably be the one to link those intrinsics since I now also know the most about the dangers of them, but in the event I can't get around to it, anyone who implements them should be aware of that bug's status! Isn't finding bugs fun?