Currently, only a low-level, register-based interface is exposed. Even the 16-lane sort is small enough, and uses few enough registers on e.g. AVX2, that it makes sense to inline it, passing the input and collecting the output in SIMD registers without going through memory.
A higher-level, memory-based interface may be exposed in the future.
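
To illustrate how the two layers could relate, here is a minimal, portable sketch. It is not this library's API: `Lanes4`, `sort4`, and `sort4_in_memory` are hypothetical names, a 4-lane sorting network stands in for the 16-lane sort, and a plain struct passed by value stands in for a SIMD register (a real implementation would presumably use an intrinsic vector type instead).

```cpp
#include <cstddef>

// Hypothetical stand-in for a SIMD register holding four 32-bit lanes.
struct Lanes4 {
    int lane[4];
};

// Compare-exchange step: afterwards v.lane[i] <= v.lane[j].
constexpr void compare_exchange(Lanes4& v, std::size_t i, std::size_t j) {
    if (v.lane[j] < v.lane[i]) {
        const int tmp = v.lane[i];
        v.lane[i] = v.lane[j];
        v.lane[j] = tmp;
    }
}

// Register-style interface (assumed shape): input by value, output by value.
// Small enough to inline, so no load or store is required at the call site.
constexpr Lanes4 sort4(Lanes4 v) {
    // Standard 5-comparator sorting network for four inputs.
    compare_exchange(v, 0, 1);
    compare_exchange(v, 2, 3);
    compare_exchange(v, 0, 2);
    compare_exchange(v, 1, 3);
    compare_exchange(v, 1, 2);
    return v;
}

// Memory-based wrapper (assumed shape): load, call the register-style
// function, store. `data` must point to at least four ints.
constexpr void sort4_in_memory(int* data) {
    Lanes4 v{{data[0], data[1], data[2], data[3]}};
    v = sort4(v);
    for (std::size_t i = 0; i < 4; ++i) {
        data[i] = v.lane[i];
    }
}
```

In this arrangement the memory-based function is only a thin wrapper, so callers that already hold their data in registers can call the low-level function directly and skip the load and store.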