jfalcou / eve

Expressive Vector Engine - SIMD in C++ Goes Brrrr
https://jfalcou.github.io/eve/
Boost Software License 1.0

[FEATURE] shuffle_v2 - see if we can rely on the compiler for `mask`, `maskz` #1890

Open DenisYaroshevskiy opened 4 months ago

DenisYaroshevskiy commented 4 months ago

At the moment, the effort to support maskz versions of operations is:

a) duplicated: https://github.com/jfalcou/eve/blob/6f2421bda8384679c1f71bb8a49ac219e9377846/include/eve/detail/shuffle_v2/simd/x86/shuffle_l2.hpp#L518-L557

b) untested (I only concerned myself with explicit names)

c) incomplete: the mask-with-register cases are not addressed at all.

=====================

I suspect the compiler can merge a non-masked operation followed by a blend into a single masked operation.
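To make the suspected merge concrete, here is a minimal scalar model of the identity it relies on: an unmasked shuffle followed by a blend with zero gives the same result as a zeroing-masked shuffle. This is only a sketch, not EVE code. The names `reg`, `idxs`, `shuffle`, `blend_zero`, `maskz_shuffle` and `check_all_masks` are made up for illustration. On AVX-512 the two forms would roughly correspond to `_mm512_permutexvar_epi32` followed by `_mm512_maskz_mov_epi32`, versus a single `_mm512_maskz_permutexvar_epi32`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar model of a 4-lane register. All names here are illustrative only.
using reg  = std::array<std::int32_t, 4>;
using idxs = std::array<std::size_t, 4>;

// Unmasked shuffle: out[i] = x[idx[i]].
reg shuffle(reg x, idxs idx) {
  reg out{};
  for (std::size_t i = 0; i < out.size(); ++i) out[i] = x[idx[i]];
  return out;
}

// Blend with zero: keep lane i when bit i of mask is set, otherwise write 0.
reg blend_zero(reg x, unsigned mask) {
  reg out{};
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = ((mask >> i) & 1u) ? x[i] : 0;
  return out;
}

// Fused form: what a single zeroing-masked shuffle would compute.
reg maskz_shuffle(reg x, idxs idx, unsigned mask) {
  reg out{};
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = ((mask >> i) & 1u) ? x[idx[i]] : 0;
  return out;
}

// Checks the identity for every possible 4-bit mask.
void check_all_masks(reg x, idxs idx) {
  for (unsigned mask = 0; mask < 16u; ++mask)
    assert(blend_zero(shuffle(x, idx), mask) == maskz_shuffle(x, idx, mask));
}
```

If the compiler performs this rewrite on real intrinsics, the library only has to emit the two-step form and can leave the fusion to the optimizer.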

So this needs to be checked for SVE and AVX-512, and compiler bugs filed where the merge does not happen.

If that works, the mask(z) logic can be moved into shuffle_driver and all the zero handling removed from the backends.
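As a rough sketch of what moving the zero handling into the driver could look like: the driver strips the zero lanes out of the pattern, hands a zero-free pattern to the backend, and applies the zeroing itself afterwards. Everything here is hypothetical: the `zero_lane` sentinel, `strip_zeroes`, `backend_shuffle` and `driver_shuffle` are illustrative names in a scalar model, not the actual shuffle_v2 driver interface.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical sentinel meaning "this output lane must be zero".
constexpr std::ptrdiff_t zero_lane = -1;

using reg     = std::array<std::int32_t, 4>;
using pattern = std::array<std::ptrdiff_t, 4>;

struct split_pattern {
  pattern  idxs;       // zero-free pattern that the backend sees
  unsigned keep_mask;  // bit i set: keep lane i; bit i clear: zero lane i
};

// Replaces every zero lane with a valid index and records it in the mask.
split_pattern strip_zeroes(pattern p) {
  split_pattern out;
  out.idxs = p;
  out.keep_mask = 0u;
  for (std::size_t i = 0; i < p.size(); ++i) {
    if (p[i] == zero_lane) out.idxs[i] = 0;  // any valid index; lane is zeroed later
    else out.keep_mask |= 1u << i;
  }
  return out;
}

// Stand-in for a backend that only understands zero-free patterns.
reg backend_shuffle(reg x, pattern p) {
  reg out{};
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = x[static_cast<std::size_t>(p[i])];
  return out;
}

// Driver: unmasked shuffle first, then the zeroing blend in one place.
reg driver_shuffle(reg x, pattern p) {
  split_pattern s = strip_zeroes(p);
  reg out = backend_shuffle(x, s.idxs);
  for (std::size_t i = 0; i < out.size(); ++i)
    if (!((s.keep_mask >> i) & 1u)) out[i] = 0;
  return out;
}
```

With this split, the backends never see a pattern containing zeroes, which is what would make the per-architecture zero handling removable.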

DenisYaroshevskiy commented 4 months ago

The mask logic probably belongs somewhere after this function: shuffle_v2_driver_multiple_registers

https://github.com/jfalcou/eve/blob/6f2421bda8384679c1f71bb8a49ac219e9377846/include/eve/detail/shuffle_v2/simd/common/shuffle_v2_driver.hpp#L232

The tests are split into two files: test/unit/api/regular/shuffle_v2/shuffle_v2_driver.cpp and test/unit/api/regular/shuffle_v2/shuffle_v2_driver_intergration.cpp

I'm not sure which one to add to at the moment.

You will also need to clean up some P::has_zeroes uses in include/eve/detail/shuffle_v2/simd/x86/shuffle_l2.hpp and include/eve/detail/shuffle_v2/simd/arm/sve/shuffle_l2.hpp

DenisYaroshevskiy commented 3 months ago

It seems that both clang and gcc can do it, at least in some cases: https://godbolt.org/z/h68WxonaT