alpaka-group / alpaka

Abstraction Library for Parallel Kernel Acceleration :llama:
https://alpaka.readthedocs.io
Mozilla Public License 2.0

subDivideGridElems test uses and generates invalid workdivs #2260

Open mehmetyusufoglu opened 2 months ago

mehmetyusufoglu commented 2 months ago

The test below is already in WorkDivHelpersTest.cpp. It shows that alpaka::subDivideGridElems produces a workdiv in which the number of threads per block exceeds the device limits (hard device properties that are independent of the kernel).

For example, Nvidia devices cannot have more than 1024 threads per block, but the function allows 300x300 (90,000) threads per block.

    CHECK(
        alpaka::subDivideGridElems(
            Vec{300, 600},
            Vec{1, 1},
            props,
            static_cast<Idx>(0u),
            true, // blockThreadMustDivideGridThreadExtent
            alpaka::GridBlockExtentSubDivRestrictions::EqualExtent)
        == WorkDiv{Vec{1, 2}, Vec{300, 300}, Vec{1, 1}}); // 300x300 threads per block exceeds the device limits!

The fixture definitions for this test are wrong and should be changed: props.m_blockThreadExtentMax = Vec{256, 128};
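
To make the violated constraint concrete, here is a minimal standalone sketch of the kind of limit check the 300x300 block extent fails. It does not use alpaka: `DeviceLimits` and `blockExtentFits` are hypothetical names introduced only for illustration, and the limits used with it (1024 per dimension, 1024 threads per block in total) are assumed values in the spirit of the Nvidia example above.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for the device properties; NOT an alpaka type.
struct DeviceLimits
{
    std::array<std::size_t, 2> blockThreadExtentMax; // per-dimension limit
    std::size_t blockThreadCountMax; // limit on the product over all dimensions
};

// Hypothetical helper: returns true only if a 2D block extent respects both
// the per-dimension limits and the total threads-per-block limit.
inline bool blockExtentFits(std::array<std::size_t, 2> const& blockThreadExtent, DeviceLimits const& limits)
{
    std::size_t count = 1;
    for(std::size_t d = 0; d < 2; ++d)
    {
        // An extent of zero or one above the per-dimension maximum is invalid.
        if(blockThreadExtent[d] == 0 || blockThreadExtent[d] > limits.blockThreadExtentMax[d])
            return false;
        count *= blockThreadExtent[d];
    }
    // The product must also fit, even if each dimension fits on its own.
    return count <= limits.blockThreadCountMax;
}
```

With assumed limits of `DeviceLimits{{1024, 1024}, 1024}`, an extent of `{300, 300}` passes the per-dimension test but fails the total count, since 300 * 300 = 90000 threads, whereas `{32, 32}` fits exactly.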

mehmetyusufoglu commented 2 months ago

I will fix this after the following pull request is merged:

https://github.com/alpaka-group/alpaka/pull/2251