jfalcou / eve

Expressive Vector Engine - SIMD in C++ Goes Brrrr
https://jfalcou.github.io/eve/
Boost Software License 1.0
969 stars 58 forks source link

[BUG] Incorrect Results Using __mmask16 Mask in if_else Function with AVX-512 #2026

Open Panhaolin2001 opened 1 week ago

Panhaolin2001 commented 1 week ago

Here is a example: https://godbolt.org/z/nd9Kc3cqq

Panhaolin2001 commented 1 week ago

The output should not be: 4,6,8,10,12,14,16,18,20,23,29,31,37,41,43,47,53,59,61,67,71,73,79,83,89,97,101,103,107,109,113,117 ?

jfalcou commented 1 week ago

Two things happens :

DenisYaroshevskiy commented 1 week ago

Do you think you want to implicitly cast __mask16 to logical? I think that'd make sense but I also know that it doesn't work. I appreciate that we like the implicit conversions and in other places you can pass a register directly as a parameter.

What are you thinking, @jfalcou ?

jfalcou commented 1 week ago

Maybe we need to make the wrapper type over the mmask type public

DenisYaroshevskiy commented 1 week ago

I think we'd need to put up a pr and then see what can we do. I don't think I understand your comment like this.

@Panhaolin2001 - you probably know this, but here is how to write what you meant today: https://godbolt.org/z/off5xv6nG

UPD: I would also recommend looking into our formula generation constructor, since this seems like what you want (from your example): https://godbolt.org/z/PhjhPxohP

Panhaolin2001 commented 1 week ago

I'm sorry I may not have expressed myself too well. I currently made a type Mask<T> wrapped around __mmask8 __mmask16 __mmask32 and __mmask64 on avx512, and I want to make it compatible with the type eve::logical<eve::wide<T>>. One question I have is that when I do the auto mask = a > b; operation, the mask type is of type eve::logical<eve::wide<T>> and the internal storage_type is of type __mmask family, so why is it that when constructing the mask by using the auto mask = a > b if_else function returns the correct value, but not when __mmask16{0b0000000001111111} is specified manually?

jfalcou commented 1 week ago

We already have such a type. We may just need to make it public api and amend logical functions to handle it.

As for why __mask dont work, it is because they are just plain intégral so the constructor used to convert them to logical think you are doing logical(some.number) which is eqv to using logical(true).

Panhaolin2001 commented 1 week ago

Thank you for your response! Looking forward to you exposing it as a public API and improving the handling of logic functions.

jfalcou commented 1 week ago

@DenisYaroshevskiy In fact, maybe we can just have an API that let us build logical from a bit set, using maybe a tagged constructor, and optimize it on AVX512 to not do anything and, for the other arch, to do the bit to mask conversion ?

logical<wide<float>> l( as_mask{0b01101101} );

?

DenisYaroshevskiy commented 1 week ago

I don't like this bits constructor. I think the formula/initializer list constructor do this already.

This is very length specific

DenisYaroshevskiy commented 1 week ago

@Panhaolin2001

I don't know how much I understood.

If you want to write a convrersion from your type to logical, provide a conversion operator

struct my {
  __mmask16 m;

  using eve_logical_t = eve::logical<eve::wide<float, eve::fixed<16>>>;

  operator eve_logical_t () {
    return eve_logical_t{m};
  }
};
jfalcou commented 1 week ago

It has to be something like

struct my {
  __mmask16 m;

  using eve_logical_t = eve::logical<eve::wide<float, eve::fixed<16>>>;

  operator eve_logical_t () {
    eve_logical_t that;
    that.storage().value = m;
    return that;
  }
};

or else the mask is seen as an integer then turned to true/false.

DenisYaroshevskiy commented 1 week ago

You are right, I thought what I wrote works. This is horrible. It works on any other platform. We need to at least stop it from compiling

jfalcou commented 1 week ago

It is complicated cause for most compiler the __mask* stuff are typedefs to integral :/

DenisYaroshevskiy commented 1 week ago

are typedefs to integral

we should only convert from bool and not from integrals

jfalcou commented 6 days ago

Ok thus

https://github.com/jfalcou/eve/blob/c29cd38d2e2c1acf0d74d50f2b58441ea1f2d2cb/include/eve/arch/cpu/logical_wide.hpp#L126

Should go and the bool one after it should.use std::same_as to prevent conversion. Will make a PR

Panhaolin2001 commented 4 days ago

Is there any plan to convert from clang builtin vector extension to logical type?

typedef float vec_type __attribute__((vector_size(32)));
vec_type a{1,2,3,4,5,6,7,8}, b{2,3,1,4,6,7,9,4};
auto mask = a < b; 
auto eve_mask = eve::logical<eve::wide<T, eve::fixed<8>>>{mask};
DenisYaroshevskiy commented 4 days ago

Is there any plan to convert from clang builtin vector extension to logical type?

No plans.

I looked into it. Clang always uses ints for mask type, even on avx512.

Here is how I'd do it:

  typedef float vec_type __attribute__((vector_size(32)));
  using eve_vec_type = eve::wide<float, eve::fixed<8>>;
  using eve_int_type = eve::wide<std::uint32_t, eve::fixed<8>>;

  vec_type a{1, 2, 3, 4, 5, 6, 7, 8}, b{2, 3, 1, 4, 6, 7, 9, 4};

  auto mask = a < b;
  auto clang_mask = eve::bit_cast(mask, eve::as<eve_int_type>{});
  auto eve_mask = eve::bit_cast(clang_mask != 0, eve::as<eve::logical<eve_vec_type>>{});

https://godbolt.org/z/jx4839jEh