alpaka-group / vikunja

Vikunja is a performance-portable algorithm library that defines functions operating on ranges of elements for a variety of purposes. It supports execution on multi-core CPUs and various GPUs. Vikunja uses alpaka to implement platform-independent primitives such as reduce or transform.
https://vikunja.readthedocs.io/en/latest/
Mozilla Public License 2.0

Find a solution to compile-time block size parameter. #3

Open DerWaldschrat opened 5 years ago

DerWaldschrat commented 5 years ago

Currently, the block size is a template parameter of the reduce kernel, which allows for dead code elimination and loop unrolling. Of course, this requires that it is known at compile time. This is problematic for accelerators whose block size is device-dependent, such as CpuThreads. In theory (in practice, this needs to be fixed anyway, see #1), the number of cores would make a good block size in this case, but that depends on the platform. Several solutions come to mind:

So far, I have not found a satisfying solution. However, this can be delayed until #1 is resolved, because the thread-based CPU accelerators don't work as expected at the moment anyway. Still, I guess this requires discussion, @tdd11235813
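
To make the trade-off concrete, here is a minimal sketch in plain C++. It is not the actual vikunja kernel and uses no alpaka calls; `blockReduceSum` and `TBlockSize` are hypothetical names chosen for illustration, and the "threads" of a block are simulated sequentially. It only shows why a compile-time block size helps the compiler and why a device-dependent value cannot be plugged in directly.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a block-level reduce; NOT the vikunja kernel.
// TBlockSize is a compile-time constant, so the trip counts of the loops
// over the block are known to the compiler, which may unroll them and
// eliminate branches that can never be taken.
template <std::size_t TBlockSize>
int blockReduceSum(std::vector<int> const& data)
{
    static_assert(TBlockSize > 0 && (TBlockSize & (TBlockSize - 1)) == 0,
                  "this sketch assumes a power-of-two block size");

    // One partial sum per simulated thread of the block.
    std::array<int, TBlockSize> partial{};
    for (std::size_t i = 0; i < data.size(); ++i)
        partial[i % TBlockSize] += data[i];

    // Tree reduction over the partial sums; the bounds depend only on
    // TBlockSize and are therefore fixed at compile time.
    for (std::size_t stride = TBlockSize / 2; stride > 0; stride /= 2)
        for (std::size_t t = 0; t < stride; ++t)
            partial[t] += partial[t + stride];

    return partial[0];
}
```

A call such as `blockReduceSum<64>(values)` compiles, while passing a value obtained at run time, for example from `std::thread::hardware_concurrency()`, as the template argument does not, because it is not a constant expression. That is exactly the conflict with device-dependent block sizes described above.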

ax3l commented 5 years ago

If one can find a good, configurable, and abstract way to express the third option, it is not too bad imho.

DerWaldschrat commented 5 years ago

As discussed offline with @tdd11235813, this will be left open on purpose. Possible solutions are mentioned above, but as the design of the primitives is not finalized anyway, it is still not clear which solution is best.