Sunny-bot1 opened 1 month ago
Log output with `CUBLASLT_LOG_LEVEL=5` set:
[2024-05-22 07:22:32][cublasLt][65020][Api][cublasLtCreate] lightHandle=0X7FFDB5ABD8F8
[2024-05-22 07:22:32][cublasLt][65020][Info][cublasLtCreate] cuBLASLt v12.4 device #0: multiProcessorCount=92 maxSharedMemPerBlock=49152 maxSharedMemoryPerBlockOptin=101376
[2024-05-22 07:22:32][cublasLt][65020][Api][cublasLtMatmulDescCreate] matmulDesc=0X7FFDB5ABD5E8 computeType=COMPUTE_32F scaleType=0
[2024-05-22 07:22:32][cublasLt][65020][Api][cublasLtMatmulDescSetAttribute] matmulDesc=0X55A2EFDA0480 attr=MATMUL_DESC_TRANSA buf=0X7FFDB5ABD5CC sizeInBytes=4
[2024-05-22 07:22:32][cublasLt][65020][Api][cublasLtMatmulDescSetAttribute] matmulDesc=0X55A2EFDA0480 attr=MATMUL_DESC_TRANSB buf=0X7FFDB5ABD5D0 sizeInBytes=4
[2024-05-22 07:22:32][cublasLt][65020][Api][cublasLtMatmulDescSetAttribute] matmulDesc=0X55A2EFDA0480 attr=MATMUL_DESC_BIAS_POINTER buf=0X7FFDB5ABD560 sizeInBytes=8
[2024-05-22 07:22:32][cublasLt][65020][Api][cublasLtMatmulDescSetAttribute] matmulDesc=0X55A2EFDA0480 attr=MATMUL_DESC_EPILOGUE buf=0X7FFDB5ABD5DC sizeInBytes=4
[2024-05-22 07:22:32][cublasLt][65020][Api][cublasLtMatrixLayoutCreate] matLayout=0X7FFDB5ABD5F0 type=R_8F_E4M3 rows=32 cols=16 ld=32
[2024-05-22 07:22:32][cublasLt][65020][Api][cublasLtMatrixLayoutCreate] matLayout=0X7FFDB5ABD5F8 type=R_8F_E4M3 rows=32 cols=64 ld=32
[2024-05-22 07:22:32][cublasLt][65020][Api][cublasLtMatrixLayoutCreate] matLayout=0X7FFDB5ABD600 type=R_16F rows=16 cols=64 ld=16
[2024-05-22 07:22:32][cublasLt][65020][Api][cublasLtMatrixLayoutCreate] matLayout=0X7FFDB5ABD608 type=R_16F rows=16 cols=64 ld=16
[2024-05-22 07:22:32][cublasLt][65020][Api][cublasLtMatrixLayoutSetAttribute] matLayout=0X55A2F0872CB0 attr=MATRIX_LAYOUT_BATCH_COUNT buf=0X7FFDB5ABD5E0 sizeInBytes=4
[2024-05-22 07:22:32][cublasLt][65020][Api][cublasLtMatrixLayoutSetAttribute] matLayout=0X55A2F0872CB0 attr=MATRIX_LAYOUT_STRIDED_BATCH_OFFSET buf=0X7FFDB5ABD618 sizeInBytes=8
[2024-05-22 07:22:32][cublasLt][65020][Api][cublasLtMatrixLayoutSetAttribute] matLayout=0X55A2F0872D00 attr=MATRIX_LAYOUT_BATCH_COUNT buf=0X7FFDB5ABD5E0 sizeInBytes=4
[2024-05-22 07:22:32][cublasLt][65020][Api][cublasLtMatrixLayoutSetAttribute] matLayout=0X55A2F0872D00 attr=MATRIX_LAYOUT_STRIDED_BATCH_OFFSET buf=0X7FFDB5ABD620 sizeInBytes=8
[2024-05-22 07:22:32][cublasLt][65020][Api][cublasLtMatrixLayoutSetAttribute] matLayout=0X55A2F0872D50 attr=MATRIX_LAYOUT_BATCH_COUNT buf=0X7FFDB5ABD5E0 sizeInBytes=4
[2024-05-22 07:22:32][cublasLt][65020][Api][cublasLtMatrixLayoutSetAttribute] matLayout=0X55A2F0872D50 attr=MATRIX_LAYOUT_STRIDED_BATCH_OFFSET buf=0X7FFDB5ABD628 sizeInBytes=8
[2024-05-22 07:22:32][cublasLt][65020][Api][cublasLtMatrixLayoutSetAttribute] matLayout=0X55A2F0872DA0 attr=MATRIX_LAYOUT_BATCH_COUNT buf=0X7FFDB5ABD5E0 sizeInBytes=4
[2024-05-22 07:22:32][cublasLt][65020][Api][cublasLtMatrixLayoutSetAttribute] matLayout=0X55A2F0872DA0 attr=MATRIX_LAYOUT_STRIDED_BATCH_OFFSET buf=0X7FFDB5ABD628 sizeInBytes=8
[2024-05-22 07:22:32][cublasLt][65020][Api][cublasLtMatmulPreferenceCreate] matmulPref=0X7FFDB5ABD610
[2024-05-22 07:22:32][cublasLt][65020][Api][cublasLtMatmulPreferenceSetAttribute] pref=0X55A2F0872E30 attr=MATMUL_PREF_MAX_WORKSPACE_BYTES buf=0X7FFDB5ABD718 sizeInBytes=8
[2024-05-22 07:22:32][cublasLt][65020][Api][cublasLtMatmulAlgoGetHeuristic] Adesc=[type=R_8F_E4M3 rows=32 cols=16 ld=32 batchCount=2 stridedBatchOffset=512] Bdesc=[type=R_8F_E4M3 rows=32 cols=64 ld=32 batchCount=2 stridedBatchOffset=2048] Cdesc=[type=R_16F rows=16 cols=64 ld=16 batchCount=2 stridedBatchOffset=1024] Ddesc=[type=R_16F rows=16 cols=64 ld=16 batchCount=2 stridedBatchOffset=1024] preference=[maxWavesCount=0.0 maxWorkspaceSizeinBytes=33554432] computeDesc=[computeType=COMPUTE_32F scaleType=R_32F transa=OP_T epilogue=EPILOGUE_BIAS biasPointer=0x7f83f3a07000]
[2024-05-22 07:22:32][cublasLt][65020][Error][cublasLtMatmulAlgoGetHeuristic] Failed to query heuristics.
cuBLAS API failed with status 15
terminate called after throwing an instance of 'std::logic_error'
what(): cuBLAS API failed
Aborted (core dumped)
When I set

cublasLtEpilogue_t epilogue = CUBLASLT_EPILOGUE_RELU_BIAS;

or

cublasLtEpilogue_t epilogue = CUBLASLT_EPILOGUE_GELU_BIAS;

it runs correctly. Only the no-activation mode, CUBLASLT_EPILOGUE_BIAS, reports the error. (Status 15 is CUBLAS_STATUS_NOT_SUPPORTED, which suggests no heuristic matches this exact configuration.)
Hi, when I try to implement a cuBLASLt FP8 batched GEMM with bias, based on the LtFp8Matmul sample, I hit this problem.
My code:
When I run only a batched GEMM, or only a non-batched GEMM with bias, it works correctly.
I wonder whether cuBLASLt supports FP8 batched GEMM with bias. Thank you very much!
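For reference, here is a minimal sketch reconstructing the call sequence the log above shows (E4M3 A/B, FP16 C/D and bias, batchCount=2, strided batch offsets 512/2048/1024, transa=OP_T, EPILOGUE_BIAS). The shapes, strides, and attribute values are taken from the log; the `check` helper and pointer names are my own, and this is not the reporter's actual code:

```cpp
#include <cublasLt.h>
#include <cstdio>
#include <cstdlib>

// Hypothetical error-check helper, not part of cuBLASLt.
static void check(cublasStatus_t s, const char* what) {
    if (s != CUBLAS_STATUS_SUCCESS) {
        std::fprintf(stderr, "%s failed with status %d\n", what, (int)s);
        std::exit(1);
    }
}

// dA, dB are device FP8 (E4M3) buffers; dC, dD, dBias are device FP16 buffers.
void fp8BatchedGemmBias(cublasLtHandle_t lt, const void* dA, const void* dB,
                        const void* dC, void* dD, const void* dBias,
                        void* workspace, size_t workspaceSize) {
    // Operation descriptor: FP32 compute / FP32 scale, A transposed (FP8
    // matmul requires the TN layout: transa = OP_T, transb = OP_N).
    cublasLtMatmulDesc_t op;
    check(cublasLtMatmulDescCreate(&op, CUBLAS_COMPUTE_32F, CUDA_R_32F),
          "cublasLtMatmulDescCreate");
    cublasOperation_t tA = CUBLAS_OP_T, tB = CUBLAS_OP_N;
    cublasLtMatmulDescSetAttribute(op, CUBLASLT_MATMUL_DESC_TRANSA, &tA, sizeof(tA));
    cublasLtMatmulDescSetAttribute(op, CUBLASLT_MATMUL_DESC_TRANSB, &tB, sizeof(tB));
    // Bias epilogue: the failing case in this issue.
    cublasLtEpilogue_t epi = CUBLASLT_EPILOGUE_BIAS;
    cublasLtMatmulDescSetAttribute(op, CUBLASLT_MATMUL_DESC_EPILOGUE, &epi, sizeof(epi));
    cublasLtMatmulDescSetAttribute(op, CUBLASLT_MATMUL_DESC_BIAS_POINTER,
                                   &dBias, sizeof(dBias));

    // Matrix layouts, dimensions as in the log: A 32x16 E4M3, B 32x64 E4M3,
    // C/D 16x64 FP16; batchCount=2 with strided batch offsets 512/2048/1024.
    cublasLtMatrixLayout_t a, b, c, d;
    check(cublasLtMatrixLayoutCreate(&a, CUDA_R_8F_E4M3, 32, 16, 32), "layout A");
    check(cublasLtMatrixLayoutCreate(&b, CUDA_R_8F_E4M3, 32, 64, 32), "layout B");
    check(cublasLtMatrixLayoutCreate(&c, CUDA_R_16F, 16, 64, 16), "layout C");
    check(cublasLtMatrixLayoutCreate(&d, CUDA_R_16F, 16, 64, 16), "layout D");
    int32_t batch = 2;
    int64_t strideA = 512, strideB = 2048, strideCD = 1024;
    for (cublasLtMatrixLayout_t l : {a, b, c, d})
        cublasLtMatrixLayoutSetAttribute(l, CUBLASLT_MATRIX_LAYOUT_BATCH_COUNT,
                                         &batch, sizeof(batch));
    cublasLtMatrixLayoutSetAttribute(a, CUBLASLT_MATRIX_LAYOUT_STRIDED_BATCH_OFFSET,
                                     &strideA, sizeof(strideA));
    cublasLtMatrixLayoutSetAttribute(b, CUBLASLT_MATRIX_LAYOUT_STRIDED_BATCH_OFFSET,
                                     &strideB, sizeof(strideB));
    cublasLtMatrixLayoutSetAttribute(c, CUBLASLT_MATRIX_LAYOUT_STRIDED_BATCH_OFFSET,
                                     &strideCD, sizeof(strideCD));
    cublasLtMatrixLayoutSetAttribute(d, CUBLASLT_MATRIX_LAYOUT_STRIDED_BATCH_OFFSET,
                                     &strideCD, sizeof(strideCD));

    // Heuristic query: this is the call that returns status 15 in the log
    // when batching and EPILOGUE_BIAS are combined.
    cublasLtMatmulPreference_t pref;
    check(cublasLtMatmulPreferenceCreate(&pref), "cublasLtMatmulPreferenceCreate");
    cublasLtMatmulPreferenceSetAttribute(pref, CUBLASLT_MATMUL_PREF_MAX_WORKSPACE_BYTES,
                                         &workspaceSize, sizeof(workspaceSize));
    cublasLtMatmulHeuristicResult_t result;
    int returned = 0;
    check(cublasLtMatmulAlgoGetHeuristic(lt, op, a, b, c, d, pref, 1,
                                         &result, &returned),
          "cublasLtMatmulAlgoGetHeuristic");
    // ... on success, pass result.algo to cublasLtMatmul(...).
}
```

One thing worth checking against the heuristic dump: the bias for an FP8 matmul must match the data type the epilogue expects (here an FP16 bias for FP16 D), and FP8 requires the TN layout, both of which the log already satisfies, so the remaining variable appears to be the batching/epilogue combination itself.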