xiangyu / aparapi

Automatically exported from code.google.com/p/aparapi

Aparapi local global dims #125

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
A small thing I found: in #enqueueKernel in aparapi.cpp, there is a fix that 
ensures the local dim does not exceed the maximum the graphics card supports. 
However, the implementation does not check that globalDim is divisible by the 
local dim size. If it is not, the execution will fail.

To resolve the issue, replace

range.localDims[0] = std::min((cl_uint)range.localDims[0], max_group_size[0]);

by

range.localDims[0] = std::min((cl_uint)range.localDims[0], max_group_size[0]);
if (range.globalDims[0] % range.localDims[0] != 0) {
   int groupCount = (range.globalDims[0] / range.localDims[0]) + 1;
   range.globalDims[0] = range.localDims[0] * groupCount;
}
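
For readers who want to try the rounding outside of aparapi.cpp, here is a minimal standalone sketch of the same idea for a single dimension. The `Dim` struct and the `fitDims` helper are hypothetical names made up for illustration; they are not part of Aparapi, and the real code works on `range.localDims` and `range.globalDims` directly. It assumes both the local size and the device maximum are non-zero.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for one dimension of a range (not Aparapi's type).
struct Dim {
    std::size_t global;  // total number of work items requested
    std::size_t local;   // work-group size requested
};

// Clamp the local size to the device maximum, then round the global size
// up to the next multiple of the (possibly clamped) local size.
inline Dim fitDims(Dim d, std::size_t maxGroupSize) {
    // Assumption: callers never pass a zero local size or a zero maximum.
    assert(d.local > 0 && maxGroupSize > 0);

    d.local = std::min(d.local, maxGroupSize);

    const std::size_t remainder = d.global % d.local;
    if (remainder != 0) {
        // Pad up to the next full work group.
        d.global += d.local - remainder;
    }
    return d;
}
```

With a request of 1000 global items and a local size of 256, this would pad the global size to 1024. Note that padding means the kernel also runs for ids beyond the originally requested size, so kernel code would typically need a bounds check on the global id.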

That's it!

Matthias

Original issue reported on code.google.com by matthias.klass@gmail.com on 23 Jul 2013 at 8:01

GoogleCodeExporter commented 9 years ago
(In a more readable form:
https://github.com/klassm/aparapi-clone/commit/e602715e2fb497c93f68ab3e71601daa941fc155
)

Original comment by matthias.klass@gmail.com on 23 Jul 2013 at 8:02

GoogleCodeExporter commented 9 years ago
Nice catch. I'm working on checking in some additional example code shortly, so 
I'll take a look at this as well.

Gary, any feedback before I implement the proposed fix?

Original comment by pnnl.edg...@gmail.com on 24 Jul 2013 at 4:57

GoogleCodeExporter commented 9 years ago
No, this looks good.

This must have been a 'copy error'. Matthias, thanks for the patch.

Ryan, thanks for following up.

Gary

Original comment by frost.g...@gmail.com on 24 Jul 2013 at 7:31

GoogleCodeExporter commented 9 years ago

Original comment by frost.g...@gmail.com on 7 Aug 2013 at 1:56