The test below is already in WorkDivHelpersTest.cpp. It shows that alpaka::subDivideGridElem produces a workdiv in which the number of threads per block exceeds the device limits (hard device properties that are independent of the kernel).
For example, NVIDIA devices cannot have more than 1024 threads per block, but the function allows 300x300 threads per block (90000 threads in total).
These definitions in the test fixture are wrong and should be changed: props.m_blockThreadExtentMax = Vec{256, 128};
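To make the violated constraint concrete, here is a minimal sketch of the check I would expect a valid workdiv to pass. It is plain C++ and does not use alpaka: the helper name `blockThreadCountIsValid` and its parameters are hypothetical, and the limit of 1024 is simply the NVIDIA figure quoted above.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper, not part of alpaka: returns true if the total number
// of threads in one block (the product of the per-dimension extents) stays
// within the device-wide per-block limit.
template<std::size_t Dim>
bool blockThreadCountIsValid(
    std::array<std::size_t, Dim> const& blockThreadExtent,
    std::size_t blockThreadCountMax)
{
    std::size_t count = 1;
    for(auto const extent : blockThreadExtent)
        count *= extent; // accumulate threads over all dimensions
    return count <= blockThreadCountMax;
}
```

With a limit of 1024, `blockThreadCountIsValid<2>({300, 300}, 1024)` is false because 300 * 300 = 90000, while `blockThreadCountIsValid<2>({32, 32}, 1024)` is true because 32 * 32 = 1024. Per-dimension maxima alone do not guarantee this, so the total count has to be checked separately.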