HPCE / hpce-2016-cw5

GPU Local size design #27

Closed DominicCYK closed 7 years ago

DominicCYK commented 7 years ago

I am trying to use local memory to get better memory access patterns in the kernel. However, I am wondering whether there are design constraints on the choice of local size, because some values of local size cause an enqueueNDRange error. Can someone please provide some rules of thumb for choosing the local size? Thanks

m8pple commented 7 years ago

Generally speaking, the local memory size is of the order of 128 KB, though it can vary a fair bit between vendors, platforms, and chip families. For example, software OpenCL providers (those that run kernels on the CPU) can offer larger local memories, as these are effectively just held in main RAM anyway.

The limits for any platform can be found by querying the device configuration with clGetDeviceInfo (https://www.khronos.org/registry/cl/sdk/1.2/docs/man/xhtml/clGetDeviceInfo.html) and asking for CL_DEVICE_LOCAL_MEM_SIZE. Note that the absolute minimum that all platforms must support is 32 KB (in OpenCL 1.2 this applies to every device that is not of type CL_DEVICE_TYPE_CUSTOM).
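As a rough illustration of how that limit might be used, the sketch below checks whether a kernel's local buffers fit inside a given budget. The helper name `fitsInLocalMemory` is hypothetical. The budget is passed in as a plain number so that the example does not need an OpenCL runtime; in a real host program it would be the value reported for CL_DEVICE_LOCAL_MEM_SIZE. The default is the 32 KB minimum mentioned above.

```cpp
#include <cassert>
#include <cstdint>

// Portable floor: the 32 KB minimum local memory size.
constexpr std::uint64_t kMinLocalMemBytes = 32 * 1024;

// Hypothetical helper (not part of OpenCL): returns true if `elements`
// items of `elementBytes` bytes each fit in `budgetBytes` of local memory.
// In a real program, budgetBytes would come from CL_DEVICE_LOCAL_MEM_SIZE;
// here it defaults to the 32 KB floor.
bool fitsInLocalMemory(std::uint64_t elements,
                       std::uint64_t elementBytes,
                       std::uint64_t budgetBytes = kMinLocalMemBytes)
{
    assert(elementBytes > 0);
    // Divide rather than multiply so the comparison cannot overflow.
    return elements <= budgetBytes / elementBytes;
}
```

For example, `fitsInLocalMemory(8192, sizeof(float))` asks whether 8192 floats fit in the portable 32 KB floor, which they do exactly when a float is 4 bytes.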

You could hack the test_opencl program from CW3 to read that value and print it out.

You have the advantage in this coursework of knowing exactly which platform you are using, so you could work out how much local memory there is on a K520, then optimise directly for that.
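Once the local memory size is known, sizing a tile is simple arithmetic. The sketch below is only an illustration: `largestTileEdge` is a hypothetical helper, the `halo` border parameter is an assumption about how a kernel might pad its tile, and the 48 KB used in the usage note is an assumed figure for a K520-class device that should be confirmed by querying CL_DEVICE_LOCAL_MEM_SIZE.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: largest power-of-two edge T such that a square tile
// of (T + 2*halo) x (T + 2*halo) elements fits in budgetBytes of local
// memory. Returns 0 if not even a 1x1 tile (plus halo) fits.
std::size_t largestTileEdge(std::size_t budgetBytes,
                            std::size_t elementBytes,
                            std::size_t halo)
{
    assert(elementBytes > 0);
    const std::size_t maxElements = budgetBytes / elementBytes;
    std::size_t best = 0;
    for (std::size_t edge = 1; ; edge *= 2) {
        const std::size_t padded = edge + 2 * halo;
        // Equivalent to padded * padded > maxElements, without overflow.
        if (padded > maxElements / padded) break;
        best = edge;
    }
    return best;
}
```

Under the assumed 48 KB budget, `largestTileEdge(48 * 1024, sizeof(float), 0)` gives 64 for 4-byte floats. This bounds local memory use only: the work-group size has its own limit (CL_DEVICE_MAX_WORK_GROUP_SIZE), so a 64x64 tile would usually be filled by a smaller work-group looping over it.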