Looking over the GEMM Batched API, I see that it takes arguments describing a_transpose and b_transpose: _GemmBatched(const Layout layout, const Transpose a_transpose, const Transpose b_transpose ..
I assume a_transpose and b_transpose refer to the transpose status of the already batched matrices, i.e. one setting shared by every matrix in the batch?
Yes, CLBlast takes only a single layout argument and a single transpose argument per input (a_transpose and b_transpose), and these apply to all batches; see also the API docs. This is also how it is done in cuBLAS.
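To make the "one transpose setting for all batches" semantics concrete, here is a minimal CPU reference sketch. It is not CLBlast code: the `Trans` enum, `element_of_op`, and `gemm_batched_ref` are hypothetical names made up for illustration, and the sketch assumes row-major storage, alpha = 1 and beta = 0.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a transpose flag (not CLBlast's own type).
enum class Trans { No, Yes };

// Returns element (row, col) of op(M), where M is stored row-major at
// `offset` with leading dimension `ld`, and op is identity or transpose.
inline float element_of_op(const std::vector<float>& m, std::size_t offset,
                           std::size_t ld, Trans t,
                           std::size_t row, std::size_t col) {
  return t == Trans::No ? m[offset + row * ld + col]
                        : m[offset + col * ld + row];
}

// Reference batched GEMM: C_i = op(A_i) * op(B_i) for every batch i.
// Note that a_trans and b_trans are single values shared by all batches;
// only the offsets differ from one batch to the next.
void gemm_batched_ref(Trans a_trans, Trans b_trans,
                      std::size_t m, std::size_t n, std::size_t k,
                      const std::vector<float>& a,
                      const std::vector<std::size_t>& a_offsets, std::size_t a_ld,
                      const std::vector<float>& b,
                      const std::vector<std::size_t>& b_offsets, std::size_t b_ld,
                      std::vector<float>& c,
                      const std::vector<std::size_t>& c_offsets, std::size_t c_ld) {
  for (std::size_t batch = 0; batch < a_offsets.size(); ++batch) {
    for (std::size_t i = 0; i < m; ++i) {
      for (std::size_t j = 0; j < n; ++j) {
        float acc = 0.0f;
        for (std::size_t p = 0; p < k; ++p) {
          acc += element_of_op(a, a_offsets[batch], a_ld, a_trans, i, p) *
                 element_of_op(b, b_offsets[batch], b_ld, b_trans, p, j);
        }
        c[c_offsets[batch] + i * c_ld + j] = acc;
      }
    }
  }
}
```

With a_trans set to `Trans::Yes`, every A matrix in the batch is read transposed; there is no way in this shape of interface to transpose only some of them.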
I've checked what the Intel MKL library does for gemm_batch: it expects a pointer to an array of transpose values, one entry per group of matrices in the batch. https://www.intel.com/content/www/us/en/docs/onemkl/developer-reference-c/2023-0/cblas-gemm-batch.html
I just wanted to confirm that my understanding of how CLBlast uses the transpose arguments is correct.
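If per-matrix transpose settings are needed on top of an interface that only accepts one a_transpose/b_transpose pair per call, one possible workaround is to partition the batch by transpose pair and issue one batched call per partition. The sketch below only does the partitioning; the `Transpose` enum and `group_by_transpose` are hypothetical names for illustration, not part of CLBlast or MKL.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a transpose flag (not a library type).
enum class Transpose { kNo, kYes };

using TransposePair = std::pair<Transpose, Transpose>;

// Groups batch indices by their (a_transpose, b_transpose) pair, so that
// each group can be passed to a batched GEMM that takes a single pair.
std::map<TransposePair, std::vector<std::size_t>> group_by_transpose(
    const std::vector<Transpose>& a_transposes,
    const std::vector<Transpose>& b_transposes) {
  std::map<TransposePair, std::vector<std::size_t>> groups;
  // Only indices present in both inputs are considered.
  const std::size_t count = std::min(a_transposes.size(), b_transposes.size());
  for (std::size_t i = 0; i < count; ++i) {
    groups[TransposePair(a_transposes[i], b_transposes[i])].push_back(i);
  }
  return groups;
}
```

Each resulting group would then need its own list of offsets and its own batched call, so at most four calls are needed regardless of the batch size.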