Open ww5862 opened 4 months ago
Hello @ww5862.
Thanks for the report. A few questions:
- Could you rerun with CUBLASLT_LOG_MASK=64 set (e.g. export CUBLASLT_LOG_MASK=64 in bash)? (documentation)
- Could you rerun with cublasSetMathMode(handle, CUBLAS_PEDANTIC_MATH)? (documentation)

Thank you, I will do it!
Hello,
I'm using cublasSgemm to compute a single-precision GEMM with dimensions 1024x1024x1024. When I compare the cublasSgemm result against a CUTLASS single-precision GEMM kernel, the validation fails. However, comparing the CUTLASS single-precision GEMM kernel against a CPU reference passes validation. My setup is an RTX 3090 with nvcc 12.4. Is cuBLAS unable to compute the correct result when using a modern nvcc with an RTX 3090?
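For anyone reproducing this, it may help to rule out the comparison itself first. Two correct FP32 GEMM implementations generally accumulate products in a different order, so their outputs can differ in the low bits, and an exact (or very tight) element-wise comparison can fail even when both are right. Whether that is what happens here is not confirmed in this thread. Below is a minimal host-side sketch of a tolerance-based check; the helper names `max_relative_error` and `gemm_results_match` and the default tolerance of `1e-3` are assumptions chosen for illustration, not part of cuBLAS or CUTLASS.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a cuBLAS/CUTLASS API): largest relative error
// between two result buffers, measured over their common length.
// The denominator is clamped to at least 1.0 so that values near zero
// are effectively compared by absolute error.
double max_relative_error(const std::vector<float>& a,
                          const std::vector<float>& b) {
    const std::size_t n = std::min(a.size(), b.size());
    double worst = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double x = static_cast<double>(a[i]);
        const double y = static_cast<double>(b[i]);
        const double diff = std::fabs(x - y);
        const double scale = std::max({1.0, std::fabs(x), std::fabs(y)});
        worst = std::max(worst, diff / scale);
    }
    return worst;
}

// Hypothetical helper: true if both buffers have the same length and every
// element agrees within `tol`. The default tolerance is an assumption and
// should be tuned for the problem size and input range.
bool gemm_results_match(const std::vector<float>& a,
                        const std::vector<float>& b,
                        double tol = 1e-3) {
    return a.size() == b.size() && max_relative_error(a, b) <= tol;
}
```

After copying both device results back to the host, calling `gemm_results_match(cublas_out, cutlass_out)` and printing `max_relative_error(cublas_out, cutlass_out)` shows whether the mismatch is a rounding-level difference or a genuinely wrong result (for example, one caused by a row-major versus column-major layout mix-up).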