calebzulawski opened 3 years ago
Feature gate: `#![feature(portable_simd)]`
I'm sorry if this is the wrong place to ask, but I'm rather new to Rust and stumbled upon this issue because my compiler told me to. If I want to use this feature as soon as my compiler supports it, can I gate it like:

```rust
#[cfg(feature = "portable_simd")]
use std::simd::Simd;
```

or is that only for features regarding my package (set in Cargo.toml or passed to cargo)? If so, what would be the appropriate way to use SIMD as soon as this issue is resolved?
The `#![feature(portable_simd)]` part goes at the top of a binary or library crate. It's a language feature, not a cargo feature, so it works a little differently. It's unfortunate that they're both just called "feature"; Rust is often too terse when it counts.
Ok, thanks a lot!
Just to make sure: this means there is no (easy*) way to use this language feature if my compiler supports it and fall back to a custom implementation otherwise?
*easy as in compile-time guards / attribute-like macros, or creating a custom wrapper module that either provides Rust's SIMD or my own fallback, or something else at that level of skill
For anyone else stumbling upon this: language features are (unstable) features you can opt into when using nightly Rust. By putting the specified flag in your library root, the whole project will be compiled by a compiler that enables this feature.
@HannesGitH You might be able to do this with the rustversion macro, which works similarly to `#[cfg]`. However, I haven't tested any of these myself. For example:

```rust
#![rustversion::attr(nightly, feature(portable_simd))]

#[rustversion::nightly]
use std::simd::Simd;
```
I would recommend encapsulating all of your nightly-specific code in a module, so you only have to conditionally define the module itself and anything using it:

```rust
#[rustversion::nightly]
mod portable_simd;
```
Pretty close to what I intended, thanks for letting me know, that's what I'll use 👍
The only problem is that once this feature becomes stable and someone compiles my code using a stable compiler (which would then include actual SIMD), the resulting binary would still use my SIMD workarounds.
It might be nice to have some kind of `#[feature_available(portable_simd)]` macro for folks who prefer to stay on stable Rust but want to use a compiler feature as soon as it reaches stable.
Once it is stabilized, you could update the macro from `rustversion::nightly` to `rustversion::since(1.xx)` to use this feature on all Rust versions since its stabilization, or just remove the workarounds if you don't need to support earlier Rust compiler versions. A lot of mischief can also be done with `#[cfg_attr]`.
Is there any work being done today to use SIMD? I see we are at 1/9 and there is not much activity at portable_simd.
Does this part of Rust require help?
That first task was most of the work--with the nightly compiler you can use `std::simd` today. Most of the improvements being worked on now are relatively minor in comparison, but we are always open to contributions.
At this point we are mostly focusing on usability rather than features, which really means two things: ease of use of the API, and quality of the code generation. Once a base set of features is in a good state, we can begin the RFC process.
Thanks for info and the kind reply!
@agausmann

> To enable the experimental feature flag on nightly,
> `#![rustversion::attr(nightly, feature(portable_simd))]`

Unfortunately, this particular code doesn't work
Do you think this has a chance to get stabilized? It seems like activity has been very low recently, despite being a cool feature.
> Do you think this has a chance to get stabilized? It seems like activity has been very low recently, despite being a cool feature

I think it will be stabilized, but not right away; AFAICT it still needs an RFC with the full detailed design.
@Inspirateur I think there's two big things needed: `__m256` and using intrinsics, that's fine. We could do, say, aggregations and masks in a v2, shuffles and swizzles in a v3, or something.)

@safinaskar
> @agausmann
> To enable the experimental feature flag on nightly,
> `#![rustversion::attr(nightly, feature(portable_simd))]`
>
> Unfortunately, this particular code doesn't work
This worked for me:

```rust
#![cfg_attr(feature = "from_slice", feature(portable_simd))]
```

where `"from_slice"` is the name of my the-other-kind-of-feature (a cargo feature, defined in Cargo.toml) that uses `portable_simd`:

```toml
[features]
from_slice = []
```

So I run tests, for example, via `cargo test --features=from_slice`.
Is this on the 2024 edition roadmap, or will it come only after that? I know it's not directly related, but it gives me a timeline range.
I don't think anyone has a specific timeline, but we still need to draft a new RFC and go through the approval process, which can take some time.
Is there a particular reason that `Simd` does not implement `Deref` and `DerefMut`? I don't see any reason the impls would restrict the ability to do anything.
Like deref into a slice? Usually that's not done because it's a huge performance footgun.
It may be good to document what that footgun is and why the choice was made because people will ask this again in future.
So, to add more detail: the problem is that (depending on SIMD used) you can't in general index to a particular lane of a SIMD register. So if you view the SIMD data as a slice and operate on an element of the slice, what the hardware must do is have the CPU stop the current SIMD processing, write the register to the stack, work on the stack value (however the slice is adjusted), and then load that back into a SIMD register. This is, in general, a performance disaster. As usual, the optimizer might be able to cut out this stall in the pipeline, in some cases, depending on circumstances, etc etc. But you should expect that the SIMD handling is totally stalled when trying to treat the data as a slice.
I figured there was a reason, but I'm not familiar with how SIMD works under the hood. Given that indexing is the problem, why implement `Index` and `IndexMut`, then?
Oh, uh, well I haven't looked in a while! I guess I'm out of the loop on the current API details.
I'm surprised that Index is in if Deref is out. Either both should be in or both should be out, would be my expectation.
The basic idea is that we want a clear marker of the boundary between SIMD and non-SIMD operations. When using Index (`vector[i]`) there is an obvious sign that you are no longer using SIMD operations. Likewise with arrays and slices: we implement `AsRef` and the `to_array` function because these are explicit. The concern with `Deref` is that the automatic inclusion of all slice functions makes it harder to tell which operations are SIMD. For example, you may expect `is_ascii` to be vectorized, but instead it is simply a scalar implementation inherited from slices.
`vector[i]` isn't particularly more obvious, I would say.
Maybe we should just always make people convert to an array to index elements?
A while ago we didn't implement Index and we got requests for it, but this is the first time Deref has come up, so I think it's a good compromise. Maybe it's not particularly obvious that Index is the boundary, but Deref is completely invisible without consulting the docs.
There are certain types of instructions where the output data type is different from the input data type, like `_mm256_maddubs_epi16`. I don't think there is a way to do that in portable SIMD without casting first, which is slower? Are there any plans to support these instructions? Similar instructions also exist on other architectures, e.g. `vdotq_s32`.
Hi, I was wondering if there has been any discussion or consideration of making a dynamically sized API for vector operations. The current API seems to be analogous to arrays, but perhaps a more elegant and convenient solution would be analogous to slices.
I learned about this idea when researching RISC-V's vector extension. Both this article and this one (fully rendered here) are good references on the motivation, from the perspective of an ISA.
While the current API is already much better than traditional SIMD instructions, it seems to me that the logical conclusion is a runtime-sized type; maybe a wrapper around `&mut [T]`, or a type like `Vec<T>`, or perhaps a modification to `Vec<T>` that guarantees SIMD optimization if `T` is a numeric primitive.
Hopefully this can spark a useful discussion on the best design of simd/vector types and operations. Thank you for your consideration.
That could be some additional API that lives alongside the fixed-size SIMD types, but for the main CPU architectures a fixed-size SIMD type is what generally works best with optimizations.
Curious how ARM SVE and RISC-V V are meant to be used in Rust. The fixed-length abstraction is a nice one, and it's what .NET is going with in .NET 9 (`Vector<T>` for SVE is 128-bit, at least for now), but variable-length vectors are here to stay.
> Curious how ARM SVE and RISC-V V are meant to be used in Rust. The fixed-length abstraction is a nice one, and it's what .NET is going with in .NET 9 (`Vector<T>` for SVE is 128-bit, at least for now), but variable-length vectors are here to stay.
RISC-V offers extensions like `Zvl128b` that provide hard guarantees on minimum vector size. It should be possible to leverage this in the interim while RISC-V figures out their P extension (which isn't very far along).
Edit: fix extension name
Would it make sense to add a family of functions like `load_base_*` that take a slice and an `isize` index? It would account for buffer underflow as well as overflow. With this you can write things such as convolution with nice clean loops that don't have to account for edge cases.

```rust
// Sketch of the proposed (non-existent) API:
for i in 0..image.len() {
    let mut result = 0.;
    for j in 1..kernel.radius() / N {
        let left = Simd::<f32, N>::load_base_or(image, i - j * N, Simd::splat(image[0]));
        // ...
    }
    for j in 0..kernel.radius() / N {
        let right = Simd::<f32, N>::load_base_or(image, i + j * N, Simd::splat(*image.last().unwrap()));
        // ...
    }
    image[i] = result;
}
```
Is it possible to move `Mask` inherent methods into a trait like `SimdMask` and add this trait as a bound on the associated type `Mask` of other traits, e.g. `SimdPartialEq`? This would help to write generic code that works for different primitive types.
I got this idea while writing a fixed index map data structure that is expected to work with unsigned integer keys regardless of their width. Without the trait bound on the `Mask` associated type, I have to wrap my implementation in macros and explicitly apply it to u8, u16, u32, u64 and usize.
I think you should probably be able to do what you want:

```rust
// Nightly-only; requires #![feature(portable_simd)] at the crate root.
use std::simd::{Mask, Simd, SimdElement, SimdPartialEq};

fn generic<T>(v: Simd<T, 4>, m: Mask<T::Mask, 4>) -> bool
where
    T: SimdElement + Default,
    Simd<T, 4>: SimdPartialEq<Mask = Mask<T::Mask, 4>>,
{
    (v.simd_eq(Simd::splat(Default::default())) ^ m).all()
}
```
However, it would be nice if there were an easier way to do this without requiring that extra bound.
@calebzulawski, thank you!
It worked along with a couple of bounds from the num-traits crate.
Maybe an example with generic code would be a useful demo of bounds usage.
Feature gate: `#![feature(portable_simd)]`

This is a tracking issue for the future feature chartered in RFC 2977, with the intent of creating something akin to the design in RFC 2948 (rust-lang/rfcs#2948): a portable SIMD library (`std::simd`).

Portable SIMD project group: https://github.com/rust-lang/project-portable-simd
Implementation: https://github.com/rust-lang/portable-simd
More discussion can be found in the #project-portable-simd zulip stream.
Steps
Unresolved Questions
Implementation History