intel / llvm

Intel staging area for llvm.org contribution. Home for Intel LLVM-based projects.
Other
1.25k stars 738 forks source link

Group algorithms do not support marray arguments #14391

Open Pennycook opened 4 months ago

Pennycook commented 4 months ago

Describe the bug

Section 4.17.3 of SYCL 2020 says:

All algorithms are supported for the fundamental scalar types supported by SYCL (see Table 179) and instances of the SYCL vec and marray classes.

...but DPC++ does not support the marray variants.

To reproduce

#include <sycl/sycl.hpp>

int foo(sycl::queue q) {
  q.parallel_for(sycl::nd_range<1>(1, 1), [=](sycl::nd_item<1> it) {
    sycl::marray<float, 32> block;
    sycl::reduce_over_group(it.get_sub_group(), block, sycl::maximum<>());
  });
}

Compilation gives the following error:

reproducer.cpp:6:5: error: no matching function for call to 'reduce_over_group'
    6 |     sycl::reduce_over_group(it.get_sub_group(), block, sycl::maximum<>());
      |     ^~~~~~~~~~~~~~~~~~~~~~~
/bin/compiler/../../include/sycl/group_algorithm.hpp:218:1: note: candidate template ignored: requirement 'detail::is_scalar_arithmetic<sycl::marray<float, 32>>::value || (detail::is_complex<sycl::marray<float, 32>, void>::value && detail::integral_constant<bool, false>::value)' was not satisfied [with Group = sub_group, T = sycl::marray<float, 32>, BinaryOperation = sycl::maximum<>]
  218 | reduce_over_group(Group g, T x, BinaryOperation binary_op) {
      | ^
/bin/compiler/../../include/sycl/group_algorithm.hpp:251:1: note: candidate template ignored: requirement 'detail::is_complex<sycl::marray<float, 32>, void>::value' was not satisfied [with Group = sub_group, T = sycl::marray<float, 32>, BinaryOperation = sycl::maximum<>]
  251 | reduce_over_group(Group g, T x, BinaryOperation) {
      | ^
/bin/compiler/../../include/sycl/group_algorithm.hpp:270:1: note: candidate template ignored: requirement 'detail::is_vector_arithmetic_or_complex<sycl::marray<float, 32>>::value' was not satisfied [with Group = sub_group, T = sycl::marray<float, 32>, BinaryOperation = sycl::maximum<>]
  270 | reduce_over_group(Group g, T x, BinaryOperation binary_op) {
      | ^
/bin/compiler/../../include/sycl/group_algorithm.hpp:294:1: note: candidate function template not viable: requires 4 arguments, but 3 were provided
  294 | reduce_over_group(Group g, V x, T init, BinaryOperation binary_op) {
      | ^                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
//bin/compiler/../../include/sycl/group_algorithm.hpp:314:1: note: candidate function template not viable: requires 4 arguments, but 3 were provided
  314 | reduce_over_group(Group g, V x, T init, BinaryOperation binary_op) {
      | ^                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.

marray does not satisfy is_vector_arithmetic and there is no alternative is_marray check.

Environment

Additional context

There doesn't appear to be a corresponding test for group algorithms with marray arguments in the SYCL-CTS.

AlexeySachkov commented 4 months ago

KhronosGroup/SYCL-Docs#461 is related here. I remember that we put on hold any work on marray support for group algorithms because of it, but perhaps our analysis was wrong and we should have added support for at least some of the overloads/algorithms. Tagging @jzc for awareness

rolandschulz commented 4 months ago

Before we add marray support we should decide whether we extend SPIR-V to support OpTypeArray to avoid duplicate effort. If we don't extend SPIR-V we should not forward to the SPIR-V group algorithm but instead implement the algorithm in the SYCL headers. Because it is inefficient to run the algorithm multiple times with smaller sizes instead of once for the whole array.

AlexeySachkov commented 2 months ago

Note: this was partially implemented in #14364 by @steffenlarsen