oneapi-src / oneDPL

oneAPI DPC++ Library (oneDPL) https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/dpc-library.html
Apache License 2.0

Limit work-group size to fix FPGA slowdowns in find_or (#1653) #1667

Closed julianmi closed 1 week ago

julianmi commented 1 week ago

#1653 saw significant slowdowns in FPGA emulation compared to mainline. These are due to very large work-group sizes (65536 on my test system). This PR limits the work-group size to 2048, which has been shown to provide a good balance between throughput and the synchronization required in group reductions.
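For illustration, the clamp described above can be sketched in plain C++ as follows. This is a minimal sketch, not the actual change in this PR: the names `kMaxWorkGroupSize` and `limit_work_group_size` are hypothetical, and only the value 2048 comes from the description. In SYCL code the device maximum would typically be queried with `device.get_info<sycl::info::device::max_work_group_size>()`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical constant: the 2048 cap mentioned in the PR description.
// The real oneDPL source may name or place this differently.
constexpr std::size_t kMaxWorkGroupSize = 2048;

// Hypothetical helper: returns the work-group size to launch with,
// given the maximum the device reports (e.g. 65536 on an FPGA emulator).
inline std::size_t limit_work_group_size(std::size_t device_max,
                                         std::size_t cap = kMaxWorkGroupSize)
{
    assert(cap > 0 && "cap must be positive");
    // Never exceed the device limit, and never exceed the cap.
    return std::min(device_max, cap);
}
```

With this helper, a device reporting 65536 would launch with 2048 work-items per group, while a device whose maximum is already below the cap keeps its own limit.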

SergeyKopienko commented 1 week ago

@julianmi Thanks a lot for this fix!