Open EdorianDark opened 4 years ago
The current plan is for only fixed length vectors in the initial wave.
The reasoning is:
I suspect that "some day" we would like to build support for variable sized vectors, but they'd have to be stable and well understood in terms of the vendors, and also Rust itself might need better dynamically sized type support.
Even then, it might be more appropriate to limit those APIs to `core::arch` if they're only practical on one or two arches.
It is slightly confusing to call these vectors "variable length", as they are variable mostly in terms of the specification and the instruction set. Obviously, even that amount of variation has implications for programming against them. However, vector registers actually have to be of a given size, having material existence and all that, so the "variable" is literally wired into the architecture. Thus the terms "scalable" and "length agnostic" in most of Arm's material, rather than "variable", as it does not truly vary. The vector is, in the case of Arm's SVE, a 128-bit vector or a vector of the implemented size, e.g. the Fugaku supercomputer implemented 512-bit vectors so it can either execute Neon instructions on 128-bit sub-vectors or use the full SVE vector register (a situation somewhat like e.g. how AVX512 and SSE actually share register space). Likewise in the RISC-V V extension:
Each vector register has a fixed VLEN bits of state.
The instruction sets permit acting on ambiguous vector sizes, and to the extent that support is implemented in LLVM, they are already supported. To the extent they are not supported in LLVM, it is rather difficult to add more.
As mentioned on Zulip, Libre-SOC's SimpleV supports both fixed-length and dynamically variable-length vectors with lengths from 0 to 64 elements (includes non-powers of 2). Libre-SOC is based on OpenPower (PowerPC), though we are still in-progress converting SimpleV's documentation from RISC-V, which we used previously.
Libre-SOC is a project to build a Libre-licensed hybrid CPU/GPU with drivers for modern 3D graphics APIs that is currently mostly funded by NLNet. Everyone's welcome to help out!
Note that RISC-V and ARM are far from the first ISAs to have dynamically variable-length vectors: the Cray family of supercomputers are some of the first, 45 years ago.
RISC-V V extension (variable length vectors) has hit 1.0 frozen: https://github.com/riscv/riscv-v-spec/releases/tag/v1.0
To elaborate somewhat:
Our `Simd` type is `Sized`. This means that it cannot be `?Sized`. Functionally, we can have such a type (`SimdVec`? `DynSimd`?) but it would have to be a new extension to the library. `std::simd` is still capable of compiling to targets such as SVE and RVV if Rust (and therefore LLVM) is given a minimum hardware vector width it can assume (this is obvious in the case of Arm SVE: 128 bits at minimum).
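To make the `Sized` distinction concrete, here is a minimal sketch in stable Rust. `FixedVec`, `add_dyn`, `fixed_size_bytes`, and `dyn_size_bytes` are hypothetical names made up for illustration; they are not part of `std::simd`, and the slice is only a stand-in for what a length-agnostic type might look like.

```rust
use std::mem::{size_of, size_of_val};

// Hypothetical stand-in for a fixed-length vector. Its lane count is a
// compile-time constant, so the type is `Sized`.
#[derive(Clone, Copy, Debug, PartialEq)]
struct FixedVec<const N: usize>([f32; N]);

impl<const N: usize> FixedVec<N> {
    // Lane-wise addition; values can be passed and returned by value.
    fn add(self, rhs: Self) -> Self {
        let mut out = self.0;
        for i in 0..N {
            out[i] += rhs.0[i];
        }
        FixedVec(out)
    }
}

// The size of a fixed-length vector is known without having a value.
fn fixed_size_bytes() -> usize {
    size_of::<FixedVec<4>>()
}

// A slice `[f32]` is `?Sized`: it only exists behind a pointer, and its
// size can only be queried from a value at run time.
fn dyn_size_bytes(v: &[f32]) -> usize {
    size_of_val(v)
}

// Lane-wise addition for the unsized stand-in. The result cannot be
// returned by value, so the caller has to supply the output buffer.
fn add_dyn(a: &[f32], b: &[f32], out: &mut [f32]) {
    assert!(a.len() == b.len() && a.len() == out.len());
    for ((o, x), y) in out.iter_mut().zip(a).zip(b) {
        *o = x + y;
    }
}
```

The by-value `add` is the kind of API shape that relies on the type being `Sized`, which is one reason a dynamically sized vector would need its own, separate API surface.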
Portable simd could also internally perform the simd operations in a loop if the minimum hardware vector width is very low. Loop fusion may then be able to combine the loops to reduce register spilling, and if there is already an outer loop in the user code, the inner and outer loops may be fusable too.
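As a rough illustration of that idea, the sketch below splits one wide fixed-length addition into steps of an assumed hardware width. `HW_LANES` and `add_wide` are hypothetical names, the scalar inner loop stands in for a single hardware vector instruction, and none of this reflects how the library or LLVM actually lowers such operations.

```rust
// Assumed minimum hardware width: 128 bits, i.e. four f32 lanes.
const HW_LANES: usize = 4;

// Adds two N-lane vectors by looping over HW_LANES-wide pieces.
fn add_wide<const N: usize>(a: &[f32; N], b: &[f32; N]) -> [f32; N] {
    let mut out = [0.0f32; N];
    let mut i = 0;
    // Full-width steps: each pass models one hardware vector add.
    while i + HW_LANES <= N {
        for lane in 0..HW_LANES {
            out[i + lane] = a[i + lane] + b[i + lane];
        }
        i += HW_LANES;
    }
    // Leftover lanes when N is not a multiple of HW_LANES.
    while i < N {
        out[i] = a[i] + b[i];
        i += 1;
    }
    out
}
```

Because `N` is a compile-time constant, the trip count of the loop is known statically, which is what would give an optimizer the chance to unroll it or fuse it with neighbouring loops.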
Yes, it is my belief that SVE and RVV are extensions designed for compilers to make it very easy to trivially fuse loops, whereas previous extensions required a lot of highly "manual" work (automated by the compiler, but it required careful programming to teach the compiler to correctly stripmine for a given hardware vector, maybe involving scalar branches... here it's a fairly consistent set of instructions even into the "tail").
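A scalar model of the length-agnostic loop shape may help here. In this sketch `vlmax` is a hypothetical parameter standing in for the implemented vector length, and `vl = min(remaining, vlmax)` is a simplification of what instructions such as RVV's `vsetvli` report; the real rules allow more latitude. `add_length_agnostic` is an illustrative name, not an existing API.

```rust
// Adds `a` and `b` into `out` and returns how many vector steps were
// taken. The same loop body handles full steps and the tail.
fn add_length_agnostic(a: &[f32], b: &[f32], out: &mut [f32], vlmax: usize) -> usize {
    assert!(vlmax > 0);
    assert!(a.len() == b.len() && a.len() == out.len());
    let mut done = 0;
    let mut steps = 0;
    while done < a.len() {
        // Ask for as many elements as remain, capped by the hardware
        // length (simplified model, see the note above).
        let vl = (a.len() - done).min(vlmax);
        // One pass over `vl` elements models one vector instruction.
        for i in done..done + vl {
            out[i] = a[i] + b[i];
        }
        done += vl;
        steps += 1;
    }
    steps
}
```

With 10 elements and `vlmax = 4` this takes three steps, the last covering only two elements; with `vlmax = 16` it takes one. No separate scalar tail loop is needed, which is the consistency described above.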
For those of you who might be wondering about the minimums for RVV, see: https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#181-zvl-minimum-vector-length-standard-extensions
Is there anywhere else I can look to follow this issue, and/or anything I can do to help? While I haven't worked on compilers or anything like this, I could still gather useful information.
There are some variable-length SIMD designs upcoming from RISC-V and Arm. It would be a good idea to extend stdsimd to these features, or to document why this is not possible.