alpaka-group / vikunja

Vikunja is a performance-portable algorithm library that defines functions operating on ranges of elements for a variety of purposes. It supports execution on multi-core CPUs and various GPUs. Vikunja uses alpaka to implement platform-independent primitives such as reduce or transform.
https://vikunja.readthedocs.io/en/latest/
Mozilla Public License 2.0

Verify manual loop unrolling in BlockThreadReduceKernel and rework it #43

Open SimeonEhrig opened 2 years ago

SimeonEhrig commented 2 years ago

There is a comment in BlockThreadReduceKernel stating that the manual unrolling is not tested.

https://github.com/alpaka-group/vikunja/blob/b3e3dc36391010c76bb4f8f9181a604098136608/include/vikunja/reduce/detail/BlockThreadReduceKernel.hpp#L114

This should be tested. In PR #40, @bernhardmgruber also mentioned that some parts could be illegal and suggested some improvements, including for readability.

Related to the topic: https://github.com/alpaka-group/vikunja/pull/40#discussion_r737289898
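
One way to approach the verification is to compare a manually unrolled loop against a plain reference loop over many input sizes, including sizes that are not a multiple of the unroll factor. The following is only a minimal host-side sketch of that idea. It is not the code from `BlockThreadReduceKernel.hpp`, it does not use alpaka, and the names `reduceSimple`, `reduceUnrolled4`, and `unrolledMatchesReference` are hypothetical. It also assumes an unroll factor of 4, which may differ from the kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reference implementation: plain left-to-right reduction.
template<typename T, typename Op>
T reduceSimple(std::vector<T> const& data, T init, Op op)
{
    T acc = init;
    for(std::size_t i = 0; i < data.size(); ++i)
        acc = op(acc, data[i]);
    return acc;
}

// Hypothetical manually unrolled variant (factor 4, an assumption).
// The main loop handles full groups of four elements; the second loop
// handles the remaining 0 to 3 elements so no element is skipped or
// read out of bounds.
template<typename T, typename Op>
T reduceUnrolled4(std::vector<T> const& data, T init, Op op)
{
    T acc = init;
    std::size_t const n = data.size();
    std::size_t i = 0;
    for(; i + 4 <= n; i += 4)
    {
        acc = op(acc, data[i]);
        acc = op(acc, data[i + 1]);
        acc = op(acc, data[i + 2]);
        acc = op(acc, data[i + 3]);
    }
    for(; i < n; ++i)
        acc = op(acc, data[i]);
    return acc;
}

// Compares both variants for every size from 0 to maxSize. The operator is
// deliberately neither commutative nor associative, so a wrong element order
// or a dropped element changes the result.
bool unrolledMatchesReference(std::size_t maxSize)
{
    auto op = [](std::uint64_t a, std::uint64_t b) { return a * 3 + b; };
    for(std::size_t n = 0; n <= maxSize; ++n)
    {
        std::vector<std::uint64_t> data(n);
        for(std::size_t i = 0; i < n; ++i)
            data[i] = i + 1;
        if(reduceSimple(data, std::uint64_t{0}, op) != reduceUnrolled4(data, std::uint64_t{0}, op))
            return false;
    }
    return true;
}
```

A test for the real kernel would presumably need to run on each alpaka accelerator and cover block sizes around the unroll boundaries; the sketch only illustrates the comparison pattern on the host.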