oneapi-src / oneDPL

oneAPI DPC++ Library (oneDPL) https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/dpc-library.html
Apache License 2.0

[KT] Implement reduce then scan optimization in ESIMD scan prototype #1607

Closed mmichel11 closed 2 months ago

mmichel11 commented 2 months ago

This PR is targeting the ESIMD scan development branch.

This PR implements the reduce-then-scan approach in the ESIMD scan implementation, reducing global memory accesses from ~4N to ~3N at the cost of extra compute in the second kernel.
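For readers unfamiliar with the technique, here is a minimal host-side sketch of reduce-then-scan for an inclusive sum. It is not the ESIMD kernel code from this PR: the function name `reduce_then_scan`, the `tile_size` parameter, and the sequential loops standing in for the two kernels are all assumptions made for illustration. The first pass only reads the input (~N reads) and writes one partial sum per tile. The second pass re-reads the input (~N reads), recomputes the local scan seeded with the tile's carry-in, and writes the result (~N writes), which gives ~3N global accesses in total.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side model of reduce-then-scan (inclusive sum).
// Each loop over tiles stands in for one kernel launch; on a device the
// tiles would be processed in parallel by separate work-groups.
std::vector<int> reduce_then_scan(const std::vector<int>& in, std::size_t tile_size)
{
    assert(tile_size > 0);
    const std::size_t n = in.size();
    const std::size_t num_tiles = (n + tile_size - 1) / tile_size;

    // "Kernel 1": read every element once, write only one sum per tile.
    std::vector<int> tile_sums(num_tiles, 0);
    for (std::size_t t = 0; t < num_tiles; ++t) {
        const std::size_t begin = t * tile_size;
        const std::size_t end = std::min(begin + tile_size, n);
        for (std::size_t i = begin; i < end; ++i)
            tile_sums[t] += in[i];
    }

    // Exclusive scan over the (small) array of tile sums: carry[t] is the
    // sum of all elements that precede tile t.
    std::vector<int> carry(num_tiles, 0);
    int prefix = 0;
    for (std::size_t t = 0; t < num_tiles; ++t) {
        carry[t] = prefix;
        prefix += tile_sums[t];
    }

    // "Kernel 2": re-read the input, redo the local scan starting from the
    // carry-in, and write the final output. The repeated additions are the
    // extra compute traded for not writing and re-reading partial results.
    std::vector<int> out(n);
    for (std::size_t t = 0; t < num_tiles; ++t) {
        const std::size_t begin = t * tile_size;
        const std::size_t end = std::min(begin + tile_size, n);
        int running = carry[t];
        for (std::size_t i = begin; i < end; ++i) {
            running += in[i];
            out[i] = running;
        }
    }
    return out;
}
```

One common way to arrive at ~4N is a scan-then-propagate scheme, where the first kernel writes per-tile scan results and the second reads them back to add the carry. Whether that matches the previous ESIMD prototype is an assumption, not something stated in this PR.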

The following changes have been made:

Testing has been performed on input sizes that are powers of 2, from 2^17 to 2^28.

mmichel11 commented 2 months ago

This PR will serve as the baseline for prototyping a SYCL version of the same algorithm.