Hi @lebedov,
Thanks for your great work.
I am working on registering a TensorRT plugin for an operator (Einsum) that TensorRT does not currently support. Instead of writing a custom CUDA kernel, I want to use the cuBLAS library for the batched matrix multiplication.
The equations I want to implement (from the Einsum operator) are "ntg,ncg->nct" and "nct,ncp->ntp", both of which are batched matrix multiplications over the batch index n.
Info about the Einsum op: https://github.com/onnx/onnx/blob/master/docs/Operators.md#Einsum
I need guidance on using cuBLAS batched matrix multiplication for the two equations above.
I have been reading the available reference (https://docs.nvidia.com/cuda/cublas/index.html#cublas-lt-t-gt-gemmbatched), but I do not understand how to map these equations onto its arguments (transpose flags, m/n/k, and leading dimensions).
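To make the question concrete, here is a minimal CPU-only sketch of one possible mapping, assuming all tensors are contiguous and row-major (as in ONNX). The helper `gemm_strided_batched_ref` is hypothetical: it only mimics the column-major argument convention of `cublasSgemmStridedBatched` (without the handle, and with `alpha`/`beta` passed by value), so the argument choices can be checked against a naive loop before touching the GPU. The function names and the choice of the strided variant are assumptions for illustration, not part of any plugin.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CPU stand-in for a column-major strided batched GEMM:
//   C_b = alpha * op(A_b) * op(B_b) + beta * C_b,  op(A) is m x k, op(B) is k x n.
// Element (i, j) of a column-major matrix with leading dimension ld is at [i + j * ld].
void gemm_strided_batched_ref(bool transa, bool transb, int m, int n, int k, float alpha,
                              const float* A, int lda, long long strideA,
                              const float* B, int ldb, long long strideB, float beta,
                              float* C, int ldc, long long strideC, int batchCount) {
    for (int b = 0; b < batchCount; ++b) {
        const float* Ab = A + b * strideA;
        const float* Bb = B + b * strideB;
        float* Cb = C + b * strideC;
        for (int j = 0; j < n; ++j) {
            for (int i = 0; i < m; ++i) {
                float acc = 0.0f;
                for (int l = 0; l < k; ++l) {
                    const float a = transa ? Ab[l + static_cast<long long>(i) * lda]
                                           : Ab[i + static_cast<long long>(l) * lda];
                    const float bv = transb ? Bb[j + static_cast<long long>(l) * ldb]
                                            : Bb[l + static_cast<long long>(j) * ldb];
                    acc += a * bv;
                }
                float& c = Cb[i + static_cast<long long>(j) * ldc];
                c = alpha * acc + (beta == 0.0f ? 0.0f : beta * c);
            }
        }
    }
}

// "ntg,ncg->nct": out[n][c][t] = sum_g x[n][t][g] * y[n][c][g].
// A row-major (T x G) buffer is a column-major (G x T) matrix, hence transa = true
// (this would correspond to CUBLAS_OP_T, CUBLAS_OP_N under the same convention).
std::vector<float> einsum_ntg_ncg_nct(const std::vector<float>& x, const std::vector<float>& y,
                                      int N, int T, int C, int G) {
    std::vector<float> out(static_cast<std::size_t>(N) * C * T, 0.0f);
    gemm_strided_batched_ref(true, false, T, C, G, 1.0f,
                             x.data(), G, static_cast<long long>(T) * G,
                             y.data(), G, static_cast<long long>(C) * G, 0.0f,
                             out.data(), T, static_cast<long long>(C) * T, N);
    return out;
}

// "nct,ncp->ntp": out[n][t][p] = sum_c x[n][c][t] * y[n][c][p].
// Operands are swapped (y first) so the column-major result is the row-major (T x P) output
// (this would correspond to CUBLAS_OP_N, CUBLAS_OP_T under the same convention).
std::vector<float> einsum_nct_ncp_ntp(const std::vector<float>& x, const std::vector<float>& y,
                                      int N, int C, int T, int P) {
    std::vector<float> out(static_cast<std::size_t>(N) * T * P, 0.0f);
    gemm_strided_batched_ref(false, true, P, T, C, 1.0f,
                             y.data(), P, static_cast<long long>(C) * P,
                             x.data(), T, static_cast<long long>(C) * T, 0.0f,
                             out.data(), P, static_cast<long long>(T) * P, N);
    return out;
}

// Naive reference loops, used only to check the two mappings above.
std::vector<float> naive_ntg_ncg_nct(const std::vector<float>& x, const std::vector<float>& y,
                                     int N, int T, int C, int G) {
    std::vector<float> out(static_cast<std::size_t>(N) * C * T, 0.0f);
    for (int n = 0; n < N; ++n)
        for (int c = 0; c < C; ++c)
            for (int t = 0; t < T; ++t)
                for (int g = 0; g < G; ++g)
                    out[(n * C + c) * T + t] += x[(n * T + t) * G + g] * y[(n * C + c) * G + g];
    return out;
}

std::vector<float> naive_nct_ncp_ntp(const std::vector<float>& x, const std::vector<float>& y,
                                     int N, int C, int T, int P) {
    std::vector<float> out(static_cast<std::size_t>(N) * T * P, 0.0f);
    for (int n = 0; n < N; ++n)
        for (int t = 0; t < T; ++t)
            for (int p = 0; p < P; ++p)
                for (int c = 0; c < C; ++c)
                    out[(n * T + t) * P + p] += x[(n * C + c) * T + t] * y[(n * C + c) * P + p];
    return out;
}
```

If this reading is right, the same transpose flags, m/n/k values, leading dimensions, and strides should carry over to the real cuBLAS call on device pointers, but that part is untested here.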
Could you please help me with this?
Thanks in advance,
Darshan C G