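The limits reported for GL_MAX_COMPUTE_WORK_GROUP_SIZE (a per-axis limit, queried with glGetIntegeri_v for indices 0 to 2) and GL_MAX_COMPUTE_WORK_GROUP_INVOCATIONS (the limit on local_size_x * local_size_y * local_size_z, queried with glGetIntegerv) can be used to calculate appropriate local_size_* values for each shader, so that a work group never exceeds the allowed number of invocations.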
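As a rough illustration of that calculation, here is a minimal Python sketch of the host-side arithmetic. It assumes the two limits have already been queried and are passed in as plain numbers; the function name `pick_local_size` and the "halve the larger axis" strategy are my own choices for illustration, not something the GL API prescribes.

```python
def pick_local_size(max_size, max_invocations, width, height):
    """Pick a 2D local size that respects both work-group limits.

    max_size        -- (x, y, z) limits from GL_MAX_COMPUTE_WORK_GROUP_SIZE
    max_invocations -- limit from GL_MAX_COMPUTE_WORK_GROUP_INVOCATIONS
    width, height   -- image dimensions in pixels
    """
    # Start as large as the per-axis limits and the image allow.
    x = min(width, max_size[0])
    y = min(height, max_size[1])
    # Shrink the larger axis until the total invocation count fits.
    while x * y > max_invocations:
        if x >= y:
            x = (x + 1) // 2
        else:
            y = (y + 1) // 2
    return x, y, 1


if __name__ == "__main__":
    print(pick_local_size((1024, 1024, 64), 1024, 64, 64))
```

For a 64x64 image with a 1024-invocation limit this yields (32, 32, 1), which matches the example further down.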
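Before compiling each shader source, replace a custom comment with a string containing #define macros to "dynamically" set the local x, y, and z values. Below that comment the shader should have #ifndef fallbacks, so the unmodified source is still valid GLSL and syntax highlighting stays happy. Note that the preprocessor directives are case-sensitive, so they must be written in lowercase.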
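The substitution step could look something like the sketch below. The marker comment `// @LOCAL_SIZE_DEFINES`, the macro names `LOCAL_SIZE_X/Y/Z`, and the helper `inject_local_size` are all hypothetical names I made up for the example; the only real constraint is that the marker sits after the `#version` line, since `#version` has to come first in a GLSL source.

```python
MARKER = "// @LOCAL_SIZE_DEFINES"  # hypothetical placeholder comment

# Example shader source: the #ifndef block keeps it valid when nothing is injected.
SHADER_TEMPLATE = """#version 430
// @LOCAL_SIZE_DEFINES
#ifndef LOCAL_SIZE_X
#define LOCAL_SIZE_X 1
#define LOCAL_SIZE_Y 1
#define LOCAL_SIZE_Z 1
#endif
layout(local_size_x = LOCAL_SIZE_X, local_size_y = LOCAL_SIZE_Y, local_size_z = LOCAL_SIZE_Z) in;
void main() {}
"""


def inject_local_size(source, x, y, z):
    """Replace the marker comment with #define lines for the local size."""
    if MARKER not in source:
        raise ValueError("shader source has no local-size marker comment")
    defines = "\n".join([
        f"#define LOCAL_SIZE_X {x}",
        f"#define LOCAL_SIZE_Y {y}",
        f"#define LOCAL_SIZE_Z {z}",
    ])
    # Only the first marker is replaced; the fallback block below it is then skipped.
    return source.replace(MARKER, defines, 1)


if __name__ == "__main__":
    print(inject_local_size(SHADER_TEMPLATE, 32, 32, 1))
```

The resulting string is what would be handed to glShaderSource. Because LOCAL_SIZE_X is already defined by the injected lines, the #ifndef fallback is ignored at compile time.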
Each shader will then need loops in place to ensure all of the work gets executed.
For example, for a 64x64 image with a maximum of 1024 invocations, use local_size_x=32, local_size_y=32, local_size_z=1 and define the following uniforms: uint xRepeat = 2, xRepeatOffset = 32, yRepeat = 2, yRepeatOffset = 32
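The uniform values follow directly from the image size and the chosen local size. A small Python sketch of that arithmetic, per axis, is below; `repeat_params` is a hypothetical helper name, and the assumption is that the offset is simply the local size along that axis.

```python
def repeat_params(image_extent, local_size):
    """Return (repeat, repeat_offset) for one axis.

    repeat        -- how many passes one invocation makes along the axis
    repeat_offset -- distance in pixels between passes (the local size)
    """
    repeat = -(-image_extent // local_size)  # ceiling division
    return repeat, local_size


if __name__ == "__main__":
    print(repeat_params(64, 32))
```

For the 64x64 example both axes give a repeat of 2 and an offset of 32. If the image extent is not a multiple of the local size (say 100 pixels with a local size of 32), the last pass runs past the edge of the image, so the shader would also need a bounds check before touching the pixel.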
The entire shader body is then wrapped in loops to handle the extra work:
// Stride across the image so each invocation handles several pixels.
// The loop counters are uint to match the uint uniforms.
for (uint xOffset = 0u; xOffset < xRepeat * xRepeatOffset; xOffset += xRepeatOffset) {
    for (uint yOffset = 0u; yOffset < yRepeat * yRepeatOffset; yOffset += yRepeatOffset) {
        ivec2 coords = ivec2(gl_LocalInvocationID.x + xOffset, gl_LocalInvocationID.y + yOffset);
        /** Now process the pixel **/
    }
}
This should ensure that gl_LocalInvocationID (0,0,0) will process pixels (0,0), (0,32), (32,0), and (32,32).
gl_LocalInvocationID (31,31,0) will process pixels (31,31), (31,63), (63,31), and finally (63,63).
All of the invocations in between will process the pixels in between.
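If you want to convince yourself that every pixel is visited exactly once, the loops are easy to replay on the CPU. This is only a sanity-check sketch in Python that mirrors the shader's loop structure for a single work group; the function names are mine.

```python
def pixels_for_invocation(local_id, x_repeat, x_offset, y_repeat, y_offset):
    """Mirror the shader's nested loops for one gl_LocalInvocationID.xy."""
    pixels = []
    for x_off in range(0, x_repeat * x_offset, x_offset):
        for y_off in range(0, y_repeat * y_offset, y_offset):
            pixels.append((local_id[0] + x_off, local_id[1] + y_off))
    return pixels


def covered_pixels(local_size, x_repeat, x_offset, y_repeat, y_offset):
    """Collect every pixel visited by every invocation of one work group."""
    visited = []
    for lx in range(local_size[0]):
        for ly in range(local_size[1]):
            visited.extend(
                pixels_for_invocation((lx, ly), x_repeat, x_offset, y_repeat, y_offset)
            )
    return visited


if __name__ == "__main__":
    print(pixels_for_invocation((31, 31), 2, 32, 2, 32))
    print(len(covered_pixels((32, 32), 2, 32, 2, 32)))
```

With the 64x64 example values, invocation (31,31) visits (31,31), (31,63), (63,31), and (63,63), and the whole 32x32 group visits 4096 pixels with no repeats.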