Thank you for the BLAS library.
I am wondering whether I can use the BLAS kernels through an OpenCL interface, where my host program handles data transfers through the OpenCL APIs and calls BLAS kernel APIs such as XFBLAS_GEMM with device memory addresses. This would let me use heterogeneous compute units for BLAS operations.
The current BLAS APIs bundle everything together: creating the OpenCL handle, transferring data to the device, and transferring the results back.
It would be great if the BLAS APIs were in line with other BLAS libraries such as cuBLAS, CLBlast, etc.