fcharras closed 1 year ago
TY @jjerphan, I've applied your suggestions.
While updating the TopK code, I also noticed that I had put wrong numbers in the predicted local memory usage of the kmeans kernels (kmeans++, lloyd and compute_labels), which is needed to run on CPU.
While fixing it, I also saw that SYCL sometimes refuses the kernel call if the predicted local memory is too close to the maximum amount of available local memory, so I also made sure to leave some of it free as a buffer (probably the device itself uses it to optimize and load things).
It's a small diff so I took the liberty to hijack it into the last commit.
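To make the headroom idea concrete, here is a minimal sketch of the kind of check described above. It is not the code from the diff: the function names, the `reserved_bytes` parameter and its 1024-byte default are all hypothetical, and the right margin probably depends on the device.

```python
def predicted_local_memory(n_items, itemsize):
    """Bytes of local memory a kernel is expected to allocate (hypothetical helper)."""
    return n_items * itemsize


def fits_in_local_memory(predicted_bytes, device_local_mem_bytes, reserved_bytes=1024):
    """Return True if the prediction leaves `reserved_bytes` of local memory free.

    `reserved_bytes` is an assumed safety margin, not a value taken from SYCL
    or from the actual kernels.
    """
    if min(predicted_bytes, device_local_mem_bytes, reserved_bytes) < 0:
        raise ValueError("sizes must be non-negative")
    return predicted_bytes <= device_local_mem_bytes - reserved_bytes


# Example with a device exposing 64 KiB of local memory and float32 items.
local_mem = 64 * 1024
small = predicted_local_memory(16000, 4)  # 64000 bytes, leaves room
large = predicted_local_memory(16200, 4)  # 64800 bytes, too close to the max
print(fits_in_local_memory(small, local_mem))
print(fits_in_local_memory(large, local_mem))
```

With a zero margin the second request would be accepted, since 64800 is still below 65536, which is the kind of case where the kernel call could be refused.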
TY for review @jjerphan
For the latter fix, it will be interesting to test during the next session on the devcloud whether clinfo also fails at finding `global_memory_cache_size`; if not, we could use `pyopencl`.
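For reference, a rough sketch of what the `pyopencl` route could look like. It assumes the value we want is the one OpenCL reports as `CL_DEVICE_GLOBAL_MEM_CACHE_SIZE`, which `pyopencl` exposes as the `global_mem_cache_size` device attribute; the helper name is hypothetical, and whether the devcloud devices report a usable value is exactly what remains to be tested.

```python
def global_mem_cache_sizes():
    """Map "platform / device" names to the global memory cache size in bytes.

    Returns an empty dict when pyopencl is not installed or no OpenCL
    platform can be enumerated.
    """
    try:
        import pyopencl as cl
    except ImportError:
        return {}

    sizes = {}
    try:
        for platform in cl.get_platforms():
            for device in platform.get_devices():
                key = f"{platform.name} / {device.name}"
                # Assumed to be the value clinfo would report for this device.
                sizes[key] = device.global_mem_cache_size
    except cl.Error:
        # Raised for instance when no OpenCL platform is available.
        pass
    return sizes


for name, size in global_mem_cache_sizes().items():
    print(name, size)
```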