gorgonia / cu

package cu provides an idiomatic interface to the CUDA Driver API.

leading dimension of sgemm in blas #34

Open snowwalf opened 6 years ago

snowwalf commented 6 years ago

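$$C = \alpha \, \mathrm{op}(A) \, \mathrm{op}(B) + \beta C$$

where $\alpha$ and $\beta$ are scalars, and $A$, $B$ and $C$ are matrices stored in column-major format with dimensions $\mathrm{op}(A)$: $m \times k$, $\mathrm{op}(B)$: $k \times n$ and $C$: $m \times n$, respectively. Also, for matrix $A$

$$\mathrm{op}(A) = \begin{cases} A & \text{if } \texttt{transa == CUBLAS\_OP\_N} \\ A^{T} & \text{if } \texttt{transa == CUBLAS\_OP\_T} \\ A^{H} & \text{if } \texttt{transa == CUBLAS\_OP\_C} \end{cases}$$

and $\mathrm{op}(B)$ is defined similarly for matrix $B$.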

Code line https://github.com/gorgonia/cu/blob/master/blas/blas.go#L3514

if ldc*(m-1)+n > len(c) || ldc < max(1, n) {
    panic("blas: index of c out of range")
}

It seems that this check requires ldc to be at least n.
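A minimal sketch of what this check accepts, assuming row-major storage (the convention gonum's BLAS uses). The helper `checkRowMajorC` is hypothetical and only mirrors the condition quoted above; it is not part of the package's API.

```go
package main

import (
	"errors"
	"fmt"
)

// maxInt returns the larger of two ints.
func maxInt(a, b int) int {
	if a > b {
		return a
	}
	return b
}

// checkRowMajorC is a hypothetical helper that mirrors the quoted condition:
// it fails when ldc*(m-1)+n > lenC or ldc < max(1, n).
func checkRowMajorC(m, n, ldc, lenC int) error {
	if ldc*(m-1)+n > lenC || ldc < maxInt(1, n) {
		return errors.New("blas: index of c out of range")
	}
	return nil
}

func main() {
	// A 2 x 3 matrix C backed by 6 elements.
	m, n, lenC := 2, 3, 6
	fmt.Println("ldc = n:", checkRowMajorC(m, n, n, lenC)) // passes the check
	fmt.Println("ldc = m:", checkRowMajorC(m, n, m, lenC)) // rejected because ldc < n
}
```

Under this reading, each of the m rows occupies ldc consecutive slots, so ldc can never be smaller than the row length n.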

However, according to the NVIDIA cuBLAS API documentation ( https://docs.nvidia.com/cuda/cublas/index.html#cublas-lt-t-gt-gemm ), ldc must be at least m:

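| Param. | Memory | In/out | Meaning |
|--------|--------|--------|---------|
| C | device | in/out | array of dimensions ldc x n with ldc >= max(1, m). |
| ldc | | input | leading dimension of a two-dimensional array used to store the matrix C. |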

The same applies to lda and ldb.
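For comparison, here is a minimal sketch of the column-major rule as the cuBLAS table states it. The helper `checkColMajorC` is hypothetical, and the buffer-length bound `ldc*(n-1)+m` is an assumption derived from column-major indexing rather than something quoted from the documentation.

```go
package main

import (
	"errors"
	"fmt"
)

// checkColMajorC is a hypothetical helper for a column-major m x n matrix.
// The documented rule is ldc >= max(1, m); the length bound is an assumption
// derived from the last element living at offset ldc*(n-1) + (m-1).
func checkColMajorC(m, n, ldc, lenC int) error {
	minLd := m
	if minLd < 1 {
		minLd = 1
	}
	if ldc < minLd {
		return errors.New("ldc must be at least max(1, m)")
	}
	if m > 0 && n > 0 && ldc*(n-1)+m > lenC {
		return errors.New("buffer too short for the given ldc")
	}
	return nil
}

func main() {
	// The same 2 x 3 matrix C backed by 6 elements.
	m, n, lenC := 2, 3, 6
	fmt.Println("ldc = m:", checkColMajorC(m, n, m, lenC)) // passes
	fmt.Println("ldc = 1:", checkColMajorC(m, n, 1, lenC)) // rejected because ldc < m
}
```

Here each of the n columns occupies ldc consecutive slots, so the lower bound on ldc is the column length m.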

So when I follow the NVIDIA API documentation, the parameter check in blas.go fails. When I use the opposite rule for lda/ldb/ldc, an error like "On entry to SGEMM parameter number 10 had an illegal value" is returned.

This confuses me. Could you please help? Thanks!

chewxy commented 6 years ago

Yes.

If I may guess what you're trying to do, you're trying to multiply a row-major matrix with another row-major matrix, am I right?

So, the BLAS definition for the cuBLAS library closely follows the gonum BLAS interface definitions, and this includes all the conditions. The main reason for doing so is compatibility: it's designed so that you can use it as a drop-in replacement for gonum's BLAS or OpenBLAS...

I understand this makes row-major matrix multiplication more difficult. The current workaround is to use SGEAM/DGEAM to transpose your matrix to column-major before working with it. However, if you can wait, v0.9.0 will do that for you automatically.
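To illustrate what the layout conversion in that workaround accomplishes, here is a minimal CPU-side sketch in plain Go. It does not call SGEAM/DGEAM or any function from this package; the helper name `rowToColMajor` is hypothetical, and both buffers are assumed to be densely packed (leading dimension equal to n for the input and m for the output).

```go
package main

import "fmt"

// rowToColMajor copies a densely packed row-major m x n matrix into a
// densely packed column-major buffer. Hypothetical helper, CPU only.
func rowToColMajor(a []float32, m, n int) []float32 {
	out := make([]float32, m*n)
	for i := 0; i < m; i++ {
		for j := 0; j < n; j++ {
			// Row-major element (i, j) is at i*n + j;
			// column-major element (i, j) is at j*m + i.
			out[j*m+i] = a[i*n+j]
		}
	}
	return out
}

func main() {
	// 2 x 3 matrix with rows [1 2 3] and [4 5 6], stored row by row.
	a := []float32{1, 2, 3, 4, 5, 6}
	fmt.Println(rowToColMajor(a, 2, 3)) // prints [1 4 2 5 3 6]
}
```

After the conversion the leading dimension changes from n (row length) to m (column length), which is the value the cuBLAS documentation expects.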

chewxy commented 6 years ago

Actually, that was all conjecture about what you were trying to do. I'm also considering an alternative, which is to follow the cuBLAS specs more closely with regard to the checks. So, if you could share the code that led to the error, I'd be grateful.

chewxy commented 6 years ago

This should be fixed now, @snowwalf. Can you check?

khushnood commented 5 years ago

In simple terms, I think: if you are using a row-major representation, the leading dimension is the number of columns; in a column-major representation, it is the number of rows.
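That rule of thumb can be sketched as two offset functions. The names `rowMajorOffset` and `colMajorOffset` are hypothetical, and the example assumes a densely packed 2 x 3 matrix.

```go
package main

import "fmt"

// rowMajorOffset returns the flat index of element (i, j) when rows are
// stored contiguously; ld is the stride between rows (at least the column count).
func rowMajorOffset(i, j, ld int) int { return i*ld + j }

// colMajorOffset returns the flat index of element (i, j) when columns are
// stored contiguously; ld is the stride between columns (at least the row count).
func colMajorOffset(i, j, ld int) int { return j*ld + i }

func main() {
	rows, cols := 2, 3
	// Element (1, 0) of a 2 x 3 matrix under each layout.
	fmt.Println(rowMajorOffset(1, 0, cols)) // prints 3
	fmt.Println(colMajorOffset(1, 0, rows)) // prints 1
}
```

The leading dimension is simply the stride between consecutive rows (row-major) or consecutive columns (column-major), which is why its lower bound differs between the two conventions.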

chewxy commented 5 years ago

yup