alpaka-group / vikunja

Vikunja is a performance-portable algorithm library that defines functions operating on ranges of elements for a variety of purposes. It supports execution on multi-core CPUs and various GPUs. Vikunja uses alpaka to implement platform-independent primitives such as reduce or transform.
https://vikunja.readthedocs.io/en/latest/
Mozilla Public License 2.0

Avoid repeated calls to config retrieval in workdiv calculation. #8

Closed by DerWaldschrat 5 years ago

DerWaldschrat commented 5 years ago

Currently, both the OpenMP and the CUDA workdiv compute one of their workdiv parameters at runtime. For CUDA, this requires a call to cudaGetDeviceProperties on every workdiv calculation, which turns out to be very costly in applications where all the data is already available on the GPU. For OpenMP, std::thread::hardware_concurrency is used. Both values should be cached somehow. This is not trivial, because the workdiv is defined by a policy with static methods, so there is no policy instance in which to store the value.
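One possible way to cache such a value despite the static-only policy is a function-local static, which C++11 initializes exactly once and in a thread-safe manner. The following is a minimal sketch for the OpenMP/CPU case only, not the code from the actual fix: the names `CachedCpuWorkDivPolicy`, `gridSize`, `queryHardwareConcurrency` and `queryCount` are hypothetical and do not come from vikunja.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>

// Hypothetical counter, only here to show how often the query really runs.
static std::atomic<int> queryCount{0};

// Stand-in for the runtime config retrieval that should not be repeated.
static std::size_t queryHardwareConcurrency()
{
    ++queryCount;
    unsigned const n = std::thread::hardware_concurrency();
    // hardware_concurrency() may return 0 if the value cannot be determined.
    return n == 0u ? 1u : n;
}

// Hypothetical policy with static methods only (assumed shape, not vikunja's API).
struct CachedCpuWorkDivPolicy
{
    static std::size_t gridSize()
    {
        // Initialized on the first call; later calls return the stored value.
        static std::size_t const cached = queryHardwareConcurrency();
        return cached;
    }
};
```

Calling `CachedCpuWorkDivPolicy::gridSize()` repeatedly runs the query only once. For CUDA the same idea would presumably need one cached entry per device rather than a single value, since the result of cudaGetDeviceProperties depends on the device that is queried.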

DerWaldschrat commented 5 years ago

Fixed in 044ce3008ad86136443b84efc53674cb7b760b05