ispc / ispc

Intel® Implicit SPMD Program Compiler

Improved short vector types support #2707

Open ColinChargyBentley opened 1 year ago

ColinChargyBentley commented 1 year ago

Hi,

  1. The std lib does not seem to offer much support for manipulating short vector types. For example, in my case, I need the element-wise max of double<4> short vectors, and I have to define it by hand. See https://ispc.godbolt.org/z/P8jcx6hhh. It would be nice if the std lib made short vector types easier to work with.
  2. More importantly, when I (wrongly?) define these functions by hand, suboptimal code is generated. In my case, on AVX or better targets, I was hoping that the VMAXPD/VMINPD instructions would be used, but that does not seem to be the case. It would be nice if ISPC could optimize the naive min or max code into the single correct instruction, or if the std lib provided these functions (cf. 1) and generated the correct single instruction for them.

In the meantime, is there a workaround to tell ISPC which instruction to use to compute a double<4> max (in plain C++, I could use _mm256_max_pd)?
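
For reference, the plain C++ route mentioned above could look roughly like the sketch below. It uses `_mm256_max_pd`, the double-precision intrinsic that operates on four doubles (the `_ps` variant operates on eight floats). The `Double4` alias and the `max4` name are hypothetical and chosen only for illustration. The scalar fallback is an assumption, added so the sketch still builds when AVX is not enabled at compile time.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#if defined(__AVX__)
#include <immintrin.h>
#endif

// Hypothetical stand-in for ISPC's double<4> short vector.
using Double4 = std::array<double, 4>;

// Element-wise maximum of two 4-wide double vectors (illustrative helper).
inline Double4 max4(const Double4& a, const Double4& b) {
    Double4 r;
#if defined(__AVX__)
    // With AVX enabled, the max itself is expected to compile to VMAXPD.
    const __m256d va = _mm256_loadu_pd(a.data());
    const __m256d vb = _mm256_loadu_pd(b.data());
    _mm256_storeu_pd(r.data(), _mm256_max_pd(va, vb));
#else
    // Portable fallback when AVX is not enabled at compile time.
    for (std::size_t i = 0; i < r.size(); ++i) {
        r[i] = std::max(a[i], b[i]);
    }
#endif
    return r;
}
```

Swapping `_mm256_max_pd` for `_mm256_min_pd` (and `std::max` for `std::min`) would give the corresponding min. Note that the intrinsic and `std::max` can differ when an input is NaN, so the two paths are only interchangeable for ordinary finite values.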

Best regards, Colin Chargy

nurmukhametov commented 1 year ago

Hi, @ColinChargyBentley. You are right, stdlib is missing max/min functions for short vectors at the moment.

One possible approach to achieve optimal code is to use "fake varying" for implementing max/min (like @JeffRous here https://github.com/ispc/ispc/issues/2460).

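// Element-wise max of two uniform double<4> short vectors.
uniform double<4> max(uniform const double<4> a, uniform const double<4> b)
{
    uniform double<4> r;
    // foreach makes i varying, so max() below is the varying stdlib
    // overload, applied across program instances ("fake varying").
    foreach (i = 0 ... 4) {
        r[i] = max(a[i], b[i]);
    }
    return r;
}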

See https://ispc.godbolt.org/z/ajfTq69G4