Closed by Sunny-bot1 3 months ago
Batch count must be equal for all the matrices. Output from the run with CUBLASLT_LOG_LEVEL=1:
[2024-05-20 14:52:29][cublasLt][3590223][Error][cublasLtMatmulAlgoGetHeuristic] Input matrices batch counts mismatch: input B matrix batchCount (1) must be equal to all other matrices batch counts, expected (2).
See the LtHSHgemmStridedBatchSimple example for more info.
Thank you for your reply. Sorry, I didn't describe it completely; I only used A as an example. I have set the batch count for all matrices:
int batchCount = 2;
int stridea = m * k;
int strideb = n * k;
int stridec = m * n;
cublasLtMatrixLayoutSetAttribute(Adesc, CUBLASLT_MATRIX_LAYOUT_BATCH_COUNT, &batchCount, sizeof(batchCount));
cublasLtMatrixLayoutSetAttribute(Adesc, CUBLASLT_MATRIX_LAYOUT_STRIDED_BATCH_OFFSET, &stridea, sizeof(stridea));
cublasLtMatrixLayoutSetAttribute(Bdesc, CUBLASLT_MATRIX_LAYOUT_BATCH_COUNT, &batchCount, sizeof(batchCount));
cublasLtMatrixLayoutSetAttribute(Bdesc, CUBLASLT_MATRIX_LAYOUT_STRIDED_BATCH_OFFSET, &strideb, sizeof(strideb));
cublasLtMatrixLayoutSetAttribute(Cdesc, CUBLASLT_MATRIX_LAYOUT_BATCH_COUNT, &batchCount, sizeof(batchCount));
cublasLtMatrixLayoutSetAttribute(Cdesc, CUBLASLT_MATRIX_LAYOUT_STRIDED_BATCH_OFFSET, &stridec, sizeof(stridec));
cublasLtMatrixLayoutSetAttribute(Ddesc, CUBLASLT_MATRIX_LAYOUT_BATCH_COUNT, &batchCount, sizeof(batchCount));
cublasLtMatrixLayoutSetAttribute(Ddesc, CUBLASLT_MATRIX_LAYOUT_STRIDED_BATCH_OFFSET, &stridec, sizeof(stridec));
But I still hit this problem:
cuBLAS API failed with status 7
terminate called after throwing an instance of 'std::logic_error'
what(): cuBLAS API failed
Thanks. The problem with the code above is that strides must be int64_t. Therefore, all the calls to set the stride fail with CUBLAS_STATUS_INVALID_VALUE. See details regarding expected types in the documentation.
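For anyone landing here with the same error, here is a minimal sketch of why the `int` strides are rejected and what the corrected declaration might look like. The helpers `stride_size_accepted` and `batch_stride` are hypothetical names made up for illustration and are not part of cuBLASLt; they only model the size check described in the reply above. The real call is shown in the trailing comment, reusing `m`, `k`, and `Adesc` from the earlier post.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in (not a cuBLASLt function) for the size check described
// above: the strided-batch offset is expected as int64_t, so the byte count
// passed alongside the buffer has to equal sizeof(int64_t).
bool stride_size_accepted(std::size_t sizeInBytes) {
    return sizeInBytes == sizeof(std::int64_t);
}

// Elements between consecutive matrices in a densely packed batch,
// computed in 64-bit so large shapes do not overflow.
std::int64_t batch_stride(std::int64_t rows, std::int64_t cols) {
    return rows * cols;
}

// With a real layout descriptor, the corrected setup would look roughly like
// this (m, k and Adesc as in the post above):
//
//   int64_t stridea = batch_stride(m, k);
//   cublasLtMatrixLayoutSetAttribute(Adesc,
//       CUBLASLT_MATRIX_LAYOUT_STRIDED_BATCH_OFFSET,
//       &stridea, sizeof(stridea));   // sizeof(stridea) is 8, not 4
```

The batch count can stay a 32-bit `int`, which is why only the stride calls failed in the original code. Checking the status returned by each `cublasLtMatrixLayoutSetAttribute` call would have surfaced the failure at the stride setter rather than later in the heuristic or matmul call.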
I see. Thank you very much!!!
Hi, when I implemented a batched FP8 GEMM based on the LtFp8Matmul demo, I hit this problem:
I implemented the batch mode like this:
I have already set the initial argument N=2, and I can run the original LtFp8Matmul.
Architecture: Ada; CUDA version: 12.4
Thanks for your help!