Open DenisYaroshevskiy opened 4 months ago
Probably somewhere after this function: shuffle_v2_driver_multiple_registers
The tests are split into two files: test/unit/api/regular/shuffle_v2/shuffle_v2_driver.cpp and test/unit/api/regular/shuffle_v2/shuffle_v2_driver_intergration.cpp
I'm not sure which one to add to at the moment.
You will also need to clean up some uses of P::has_zeroes in include/eve/detail/shuffle_v2/simd/x86/shuffle_l2.hpp and include/eve/detail/shuffle_v2/simd/arm/sve/shuffle_l2.hpp
Seems like both clang and gcc can do it, at least in some cases. https://godbolt.org/z/h68WxonaT
At the moment the effort to support maskz versions of operations is:
a) duplicated: https://github.com/jfalcou/eve/blob/6f2421bda8384679c1f71bb8a49ac219e9377846/include/eve/detail/shuffle_v2/simd/x86/shuffle_l2.hpp#L518-L557
b) untested (I only concerned myself with explicit names)
c) incomplete: mask with register cases are not addressed at all.
=====================
I suspect the compiler can merge a non-masked operation + blend into a single masked operation.
So this needs to be checked for SVE and AVX-512, with bugs filed if it cannot.
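For reference, a rough sketch of the two forms being compared on AVX-512. The function names are made up for illustration and none of this is taken from eve. The intrinsic versions only compile when AVX-512F is enabled; the scalar model below them is an assumption about lane semantics (bit i of the mask controls lane i, zero where the bit is clear) that can be run anywhere to confirm both forms produce the same lanes. Whether a given compiler actually folds the first form into the second still has to be checked in the generated assembly.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

#if defined(__AVX512F__)
#include <immintrin.h>

// Form 1: non-masked permute followed by a blend against zero.
inline __m512i permute_then_blend_avx512(__m512i x, __m512i idx, __mmask16 keep) {
  __m512i shuffled = _mm512_permutexvar_epi32(idx, x);
  // mask_blend picks the second operand where the mask bit is set.
  return _mm512_mask_blend_epi32(keep, _mm512_setzero_si512(), shuffled);
}

// Form 2: the single zero-masking permute we hope form 1 is merged into.
inline __m512i maskz_permute_avx512(__m512i x, __m512i idx, __mmask16 keep) {
  return _mm512_maskz_permutexvar_epi32(keep, idx, x);
}
#endif

// Portable scalar model of the same two forms (illustrative only).
using lanes = std::array<std::int32_t, 16>;

inline lanes permute_then_blend_model(const lanes& x, const lanes& idx, std::uint16_t keep) {
  lanes shuffled{};
  for (std::size_t i = 0; i < shuffled.size(); ++i)
    shuffled[i] = x[static_cast<std::size_t>(idx[i] & 15)];  // plain permute
  lanes r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = ((keep >> i) & 1) ? shuffled[i] : 0;              // blend with zero
  return r;
}

inline lanes maskz_permute_model(const lanes& x, const lanes& idx, std::uint16_t keep) {
  lanes r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    if ((keep >> i) & 1) r[i] = x[static_cast<std::size_t>(idx[i] & 15)];
  return r;  // lanes with a clear mask bit stay zero
}
```

If the merge happens reliably, the level-2 shuffles only need the non-masked form and the driver can apply the blend.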
mask(z) logic moved into shuffle_driver. All the zero handling removed.
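A minimal sketch of what handling zeroes in the driver could look like, assuming a pattern is a list of lane indices with a marker value for "zero this lane". Everything here (`zero_lane`, `split_pattern`, `split_zeroes`, `driver_shuffle`) is a hypothetical name for illustration, not the actual eve API, and the scalar loops stand in for the real per-architecture shuffles.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical marker meaning "this output lane must be zero".
constexpr int zero_lane = -1;

// A pattern with the zero markers stripped, plus a mask of lanes to keep.
template <std::size_t N>
struct split_pattern {
  std::array<int, N> idxs;  // pure shuffle indices, no zero markers left
  std::uint64_t keep;       // bit i set: keep lane i, clear: zero it
};

// Separate the zeroing request from the shuffle itself.
template <std::size_t N>
split_pattern<N> split_zeroes(std::array<int, N> pattern) {
  static_assert(N <= 64, "mask is modelled as a 64-bit integer");
  split_pattern<N> r{pattern, 0};
  for (std::size_t i = 0; i < N; ++i) {
    if (pattern[i] == zero_lane) r.idxs[i] = 0;  // any valid index works here
    else r.keep |= std::uint64_t{1} << i;
  }
  return r;
}

// Driver: run the zero-free shuffle, then apply the mask in one place.
template <typename T, std::size_t N>
std::array<T, N> driver_shuffle(const std::array<T, N>& x, std::array<int, N> pattern) {
  const split_pattern<N> s = split_zeroes(pattern);
  std::array<T, N> r{};
  for (std::size_t i = 0; i < N; ++i)
    r[i] = x[static_cast<std::size_t>(s.idxs[i])];  // stands in for the level-N shuffle
  for (std::size_t i = 0; i < N; ++i)
    if (!((s.keep >> i) & 1)) r[i] = T{};           // stands in for the blend with zero
  return r;
}
```

With this split, the per-architecture code never sees a zero marker, which is what would let the P::has_zeroes branches go away.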