CNugteren / CLBlast

Tuned OpenCL BLAS
Apache License 2.0

GEMM Batched Question #480

Closed FloreaMario closed 1 year ago

FloreaMario commented 1 year ago

Looking over the GEMM Batched API, I see that it accepts arguments describing a_transpose and b_transpose: GemmBatched(const Layout layout, const Transpose a_transpose, const Transpose b_transpose ..

I assume a_transpose and b_transpose refer to the transpose status of the already batched matrices?

I've checked what the Intel MKL library does for gemm_batch: it expects a pointer to an array of transpose values, each corresponding to a matrix from the batched group. https://www.intel.com/content/www/us/en/docs/onemkl/developer-reference-c/2023-0/cblas-gemm-batch.html

I just wanted to confirm that my understanding of how CLBlast uses the transpose argument is correct.

CNugteren commented 1 year ago

Yes, CLBlast takes only a single layout and transpose argument that applies to all batches, see also the API docs. This is also how it is done in cuBLAS.
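To make the "single transpose argument for all batches" semantics concrete, here is a minimal CPU-side sketch in plain C++. It is not CLBlast code and does not call OpenCL: the `GemmBatchedReference` function, the `At` helper, and the local `Transpose` enum are hypothetical names made up for illustration, row-major storage is assumed, and the parameter order only loosely mirrors the batched GEMM call discussed above. The point it shows is that `a_transpose` and `b_transpose` are read once and applied to every matrix in the batch, while each batch only contributes its own offsets and scalars.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a transpose flag; not the CLBlast type.
enum class Transpose { kNo, kYes };

// Returns element (row, col) of op(M), where M is stored row-major at
// `offset` in `buf` with leading dimension `ld`.
static float At(const std::vector<float>& buf, std::size_t offset, std::size_t ld,
                Transpose trans, std::size_t row, std::size_t col) {
  return trans == Transpose::kNo ? buf[offset + row * ld + col]
                                 : buf[offset + col * ld + row];
}

// Computes C_i = alpha_i * op(A_i) * op(B_i) + beta_i * C_i for every batch i.
// The same a_transpose / b_transpose pair is used for all batches.
void GemmBatchedReference(Transpose a_transpose, Transpose b_transpose,
                          std::size_t m, std::size_t n, std::size_t k,
                          const std::vector<float>& alphas,
                          const std::vector<float>& a,
                          const std::vector<std::size_t>& a_offsets, std::size_t a_ld,
                          const std::vector<float>& b,
                          const std::vector<std::size_t>& b_offsets, std::size_t b_ld,
                          const std::vector<float>& betas,
                          std::vector<float>& c,
                          const std::vector<std::size_t>& c_offsets, std::size_t c_ld) {
  const std::size_t batch_count = a_offsets.size();
  for (std::size_t batch = 0; batch < batch_count; ++batch) {
    for (std::size_t i = 0; i < m; ++i) {
      for (std::size_t j = 0; j < n; ++j) {
        float acc = 0.0f;
        for (std::size_t p = 0; p < k; ++p) {
          // Only the offsets differ per batch; the transpose flags do not.
          acc += At(a, a_offsets[batch], a_ld, a_transpose, i, p) *
                 At(b, b_offsets[batch], b_ld, b_transpose, p, j);
        }
        float& out = c[c_offsets[batch] + i * c_ld + j];
        out = alphas[batch] * acc + betas[batch] * out;
      }
    }
  }
}
```

Under this model, a workload that mixes transposed and non-transposed inputs (as the MKL interface allows) would presumably have to be split into one batched call per transpose combination.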

FloreaMario commented 1 year ago

Thank you!