I have a question about running 2D kernels with Level Zero and choosing the number of threads per block (the group size) in each dimension.
In Level Zero, I am using this set of calls to set up the number of threads:
I noticed that, after the Level Zero suggestion (zeKernelSuggestGroupSize), the group size for the Y dimension (the groupSizeY variable) is set to 1.
On my GPU, I see that I can actually run 256x256:
If I try to force it, I get a ZE_RESULT_ERROR_INVALID_GROUP_SIZE_DIMENSION error. Is there any way to run with a block size of more than 1 thread in the Y dimension? Am I missing something? If needed, I can prepare a test case.
Note that, when using OpenCL, I can run on the same Intel HD Graphics with a 256x256 block as the local work-group size.
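For reference, here is a minimal sketch of the kind of check that might be behind the error. It assumes (this is not confirmed for this driver) that a group size has to respect both the per-dimension limits and the total limit that ze_device_compute_properties_t reports (maxGroupSizeX, maxGroupSizeY, maxGroupSizeZ, maxTotalGroupSize). The group_limits_t struct and the group_size_fits helper are hypothetical names used only for illustration; they are not part of the Level Zero API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of the limits in ze_device_compute_properties_t
 * (maxTotalGroupSize, maxGroupSizeX, maxGroupSizeY, maxGroupSizeZ).
 * In a real program these would be filled from zeDeviceGetComputeProperties. */
typedef struct {
    uint32_t max_total_group_size;
    uint32_t max_group_size_x;
    uint32_t max_group_size_y;
    uint32_t max_group_size_z;
} group_limits_t;

/* Assumed rule: every dimension must be non-zero and within its own limit,
 * and the product of the three must not exceed the total limit. */
static bool group_size_fits(const group_limits_t *lim,
                            uint32_t x, uint32_t y, uint32_t z)
{
    if (x == 0 || y == 0 || z == 0) {
        return false;
    }
    if (x > lim->max_group_size_x ||
        y > lim->max_group_size_y ||
        z > lim->max_group_size_z) {
        return false;
    }
    /* Use 64-bit arithmetic so the product cannot overflow. */
    uint64_t total = (uint64_t)x * y * z;
    return total <= lim->max_total_group_size;
}
```

With illustrative limits of 256 per dimension and 256 in total (made-up values, not measured on my GPU), this check accepts 16x16x1 and 256x1x1 but rejects 256x256x1, because each dimension is within range while the product is not.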
Any pointers will be appreciated.
Hardware & Drivers: