What if maxWorkGroupSize < 64

There are a few places in the getDefaultSterGranulation() function that assume the device supports a work group size of at least 64 = 8*8. These places set wgX and wgY to their minimal values:

lines 1567-1568 for TRMV and HEMV,
lines 1637-1638 for GEMM tail,
lines 1645-1646 for TRSV and TRSV_GEMV.

You can check for the (maxWorkGroupSize < 64) case and set

minSuppWgX = floor(sqrt(maxWorkGroupSize));
minSuppWgY = maxWorkGroupSize / minSuppWgX;

And replace each

wgX = 8;
wgY = 8;

in the cases mentioned above with

wgX = minSuppWgX;
wgY = minSuppWgY;
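The fallback above can be sketched as a small standalone C helper. This is only an illustration under assumptions: isqrtFloor() and minSupportedWg() are hypothetical names that do not exist in clBLAS, and an integer square root is used in place of floor(sqrt(...)) to avoid floating-point rounding.

```c
#include <assert.h>

/* Hypothetical helper (not part of clBLAS): largest r with r*r <= n. */
static unsigned int isqrtFloor(unsigned int n)
{
    unsigned int r = 0;
    while ((unsigned long long)(r + 1) * (r + 1) <= n) {
        r++;
    }
    return r;
}

/* Hypothetical helper (not part of clBLAS): choose the minimal 2D work
 * group shape. Keeps 8x8 when the device allows it, otherwise falls back
 * to the floor(sqrt(max)) x (max / wgX) shape suggested above. */
static void minSupportedWg(unsigned int maxWorkGroupSize,
                           unsigned int *minSuppWgX,
                           unsigned int *minSuppWgY)
{
    /* Assumption: treat a reported size of 0 as 1 to avoid dividing by zero. */
    if (maxWorkGroupSize == 0) {
        maxWorkGroupSize = 1;
    }
    if (maxWorkGroupSize >= 64) {
        *minSuppWgX = 8;
        *minSuppWgY = 8;
    } else {
        *minSuppWgX = isqrtFloor(maxWorkGroupSize);
        *minSuppWgY = maxWorkGroupSize / *minSuppWgX;
    }
    /* Integer division guarantees the product never exceeds the limit. */
    assert(*minSuppWgX * *minSuppWgY <= maxWorkGroupSize);
}
```

For maxWorkGroupSize = 32 this gives a 5 x 6 shape (30 work items), and for 16 it gives 4 x 4. Whether the affected kernels accept a shape that is not a power of two is not stated here, so that would need checking before relying on it.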

There is also code in lines 1533-1545 where wgX and wgY are set to constants. There, a reduction of wgX and wgY like the while((wgY * wgX) > maxWorkGroupSize) loop from line 1614 should be applied as well.
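A reduction loop of that kind might look like the sketch below. Only the while condition comes from the library code quoted above. The loop body (halving the larger dimension) and the name clampWgToDevice() are assumptions made for illustration, not the actual clBLAS implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of clBLAS): shrink wgX and wgY until their
 * product fits into maxWorkGroupSize. */
static void clampWgToDevice(size_t *wgX, size_t *wgY, size_t maxWorkGroupSize)
{
    /* Assumption: treat a reported size of 0 as 1 so the loop can terminate. */
    if (maxWorkGroupSize == 0) {
        maxWorkGroupSize = 1;
    }
    while ((*wgY * *wgX) > maxWorkGroupSize) {
        /* The product exceeds a limit of at least 1, so the larger
         * dimension is at least 2 and halving it keeps it at least 1. */
        if (*wgY >= *wgX) {
            *wgY /= 2;
        } else {
            *wgX /= 2;
        }
    }
    assert(*wgX >= 1 && *wgY >= 1);
    assert(*wgX * *wgY <= maxWorkGroupSize);
}
```

With wgX = wgY = 16 and maxWorkGroupSize = 32, the loop stops at wgX = 8, wgY = 4. Values that already fit are left unchanged.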

In short, put some values in wgX and wgY, make sure pgran->wgSize[0]*pgran->wgSize[1] does not exceed maxWorkGroupSize for your device, and everything should be fine. Feel free to ask questions; I know this library's code well. Timur.
