kokkos / stdBLAS

Reference Implementation for stdBLAS

Matrix product always falls back to native implementation #248

Open srinivasyadav18 opened 1 year ago

srinivasyadav18 commented 1 year ago

I am trying to implement an HPX backend for matrix_product, and I am not entirely clear about the section of code below. It always evaluates to false, right? Both the qualified and the unqualified lookup of matrix_product return void, so std::is_same_v is true, its negation (the second part of the expression) is always false, and hence the whole std::enable_if_t condition is false.

template <class Exec, class A_t, class B_t, class C_t>
struct is_custom_matrix_product_avail<
  Exec, A_t, B_t, C_t,
  std::enable_if_t<
    std::is_void_v<
      decltype(
        matrix_product(
          std::declval<Exec>(),
          std::declval<A_t>(),
          std::declval<B_t>(),
          std::declval<C_t>()))
      >
    && !std::is_same_v< // see #218
      decltype(
        std::experimental::linalg::matrix_product(
          std::declval<Exec>(),
          std::declval<A_t>(),
          std::declval<B_t>(),
          std::declval<C_t>())),
      decltype(
        matrix_product(
          std::declval<Exec>(),
          std::declval<A_t>(),
          std::declval<B_t>(),
          std::declval<C_t>()))
      >
    && !linalg::impl::is_inline_exec_v<Exec>
    >
  >
  : std::true_type{};

This makes the check at https://github.com/kokkos/stdBLAS/blob/main/include/experimental/__p1673_bits/blas3_matrix_product.hpp#L757 always false, so the call always falls back to the native sequential implementation.

fnrizzi commented 1 year ago

Hi @srinivasyadav18, pinging @mzuzek since he is the one who posted #218.

mzuzek commented 1 year ago

@srinivasyadav18 @fnrizzi Hi! Thanks for raising this. I've just created #249, which reverts my incorrect fix #222 for #218, essentially removing the second condition. For more details and background, please see the discussion at https://github.com/kokkos/stdBLAS/issues/218#issuecomment-1460039688.
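
To illustrate why the second condition can never hold, here is a minimal C++17 sketch, not the actual stdBLAS code. The namespaces `lib` and `backend`, the aliases `qualified_t` and `unqualified_t`, and the traits `with_check` and `without_check` are all hypothetical names made up for this example. `lib` stands in for std::experimental::linalg, and the `is_inline_exec_v` condition is left out. Because both the generic overload and the custom overload return void, the two decltype expressions name the same type, so `!std::is_same_v<...>` is false for every backend.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical custom backend; its overload is found by ADL on exec_t/matrix.
namespace backend {
struct exec_t {};
struct matrix {};
void matrix_product(exec_t, matrix, matrix, matrix) {}
}  // namespace backend

// Hypothetical stand-in for std::experimental::linalg.
namespace lib {

// Generic overload; a qualified call can only resolve to this one.
template <class Exec, class A, class B, class C>
void matrix_product(Exec, A, B, C) {}

// Return type of the unqualified call (ordinary lookup plus ADL).
template <class Exec, class A, class B, class C>
using unqualified_t = decltype(matrix_product(
    std::declval<Exec>(), std::declval<A>(),
    std::declval<B>(), std::declval<C>()));

// Return type of the qualified call (no ADL).
template <class Exec, class A, class B, class C>
using qualified_t = decltype(lib::matrix_product(
    std::declval<Exec>(), std::declval<A>(),
    std::declval<B>(), std::declval<C>()));

// Trait that includes the extra is_same condition.
template <class Exec, class A, class B, class C, class = void>
struct with_check : std::false_type {};

template <class Exec, class A, class B, class C>
struct with_check<
    Exec, A, B, C,
    std::enable_if_t<
        std::is_void_v<unqualified_t<Exec, A, B, C>> &&
        !std::is_same_v<qualified_t<Exec, A, B, C>,
                        unqualified_t<Exec, A, B, C>>>>
    : std::true_type {};

// Same trait with the is_same condition removed.
template <class Exec, class A, class B, class C, class = void>
struct without_check : std::false_type {};

template <class Exec, class A, class B, class C>
struct without_check<
    Exec, A, B, C,
    std::enable_if_t<std::is_void_v<unqualified_t<Exec, A, B, C>>>>
    : std::true_type {};

}  // namespace lib

using E = backend::exec_t;
using M = backend::matrix;

// Both calls return void, so the two types are identical...
static_assert(std::is_same_v<lib::qualified_t<E, M, M, M>,
                             lib::unqualified_t<E, M, M, M>>);
// ...which makes the trait with the extra condition false,
static_assert(!lib::with_check<E, M, M, M>::value);
// while the trait without it is true.
static_assert(lib::without_check<E, M, M, M>::value);
```

Note that this sketch only compares return types, as the quoted condition does. It does not try to tell a custom overload apart from the generic one, so `without_check` here would also be true for types that have no custom overload.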

srinivasyadav18 commented 1 year ago

@fnrizzi @mzuzek

Thank you.

#249 works for me as expected. The call is now dispatched to the correct overload.