CNugteren / CLBlast

Tuned OpenCL BLAS
Apache License 2.0

Consider adding SVM buffer interface support? #523

Open engineer1109 opened 10 months ago

engineer1109 commented 10 months ago

Currently, the interface is for the cl_mem buffer object. How about supporting the SVM buffers introduced in OpenCL 2.0?

CNugteren commented 10 months ago

I don't have time myself to work on this, but I'm happy to review a pull request.

However, adding a new interface type can be a lot of work, especially if we want to keep the current interface as well. It also means lots of extra code and thus lots of maintenance. In the past I started something similar, for adding image support as input, but never finished it: https://github.com/CNugteren/CLBlast/compare/master...image_support

engineer1109 commented 9 months ago

How about this interface?

// General matrix-matrix multiplication: SGEMM/DGEMM/CGEMM/ZGEMM/HGEMM
template <typename T>
StatusCode Gemm(const Layout layout, const Transpose a_transpose, const Transpose b_transpose,
                const size_t m, const size_t n, const size_t k,
                const T alpha,
                const T* a_buffer, const size_t a_ld,
                const T* b_buffer, const size_t b_ld,
                const T beta,
                T* c_buffer, const size_t c_ld,
                const cl_context context, const cl_device_id device,
                T* temp_buffer = nullptr);

The offset arguments are not needed and can be set to 0 in the kernel, so the kernel itself will not change at all. The host code only needs to change from clSetKernelArg to clSetKernelArgSVMPointer.

That's my suggestion.
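To illustrate why the offset arguments can be dropped: an SVM buffer is addressed by a plain pointer, and clSetKernelArgSVMPointer accepts a pointer that points into the middle of an SVM allocation. An offset can therefore be folded into the pointer on the host, and the kernel still sees offset 0. The sketch below is only an illustration; `FoldOffset` is a hypothetical helper name and not part of CLBlast.

```cpp
#include <cstddef>

// Hypothetical helper (not part of CLBlast): folds an element offset into an
// SVM base pointer, so the kernel can always be launched with offset 0.
template <typename T>
const T* FoldOffset(const T* svm_base, const std::size_t offset_in_elements) {
  return svm_base + offset_in_elements;
}

// In real host code the result would be passed on roughly like this, where
// `kernel`, `a_buffer` and `a_offset` are assumed to exist in the caller:
//   clSetKernelArgSVMPointer(kernel, 0, FoldOffset(a_buffer, a_offset));
```

A caller that previously passed a buffer together with an offset would pass the adjusted pointer instead; the pointer must still lie inside the original SVM allocation.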

CNugteren commented 9 months ago

Yes, I guess that could work, looks clean indeed. This would then be added into a new header, say clblast_svm.h or so? Or otherwise it would need a suffix to the function name, otherwise other users might get confused and think this is the main interface.

By the way, I'm not so familiar with SVM. What would happen if I just provide a normal C++ float pointer as a_buffer argument? I guess we'll need some error checking inside to make sure that it is an OpenCL SVM buffer?
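On the error-checking question: core OpenCL 2.0 does not appear to offer a direct query for whether an arbitrary pointer came from clSVMAlloc, and on devices with fine-grained system SVM any host pointer may be valid anyway. One possible approach, sketched below under those assumptions, is for the library to keep its own record of SVM allocations and check incoming pointers against it. `SvmRegistry` and its methods are hypothetical names, not part of CLBlast or OpenCL.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical bookkeeping class (not part of CLBlast or OpenCL): remembers
// which address ranges were allocated as SVM memory.
class SvmRegistry {
 public:
  // Record an allocation, e.g. right after a successful clSVMAlloc call.
  void Register(const void* base, const std::size_t size_in_bytes) {
    ranges_[reinterpret_cast<std::uintptr_t>(base)] = size_in_bytes;
  }

  // Forget an allocation, e.g. right before calling clSVMFree.
  void Unregister(const void* base) {
    ranges_.erase(reinterpret_cast<std::uintptr_t>(base));
  }

  // True if [ptr, ptr + bytes_needed) lies inside one registered allocation.
  bool Contains(const void* ptr, const std::size_t bytes_needed) const {
    const auto address = reinterpret_cast<std::uintptr_t>(ptr);
    auto it = ranges_.upper_bound(address);  // first range starting after ptr
    if (it == ranges_.begin()) { return false; }
    --it;  // range with the largest base address <= ptr
    const auto offset = address - it->first;
    return offset <= it->second && bytes_needed <= it->second - offset;
  }

 private:
  std::map<std::uintptr_t, std::size_t> ranges_;  // base address -> size
};
```

This only catches pointers that were never registered with the library; it requires users to allocate through, or report their allocations to, the library, which is a design trade-off rather than a given.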