Hi @ipiszy,
This is documented in the cublasLtMatmul() section, in the part that describes the supported data types. Quote (a setup sketch that satisfies these requirements follows the list):
To use FP8 kernels, the following set of requirements must be satisfied:
- All matrix pointers must be 16-byte aligned.
- A must be transposed and B non-transposed (The “TN” format).
- The compute type must be CUBLAS_COMPUTE_32F.
- The scale type must be CUDA_R_32F.
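For concreteness, here is a minimal sketch of a descriptor setup that satisfies all four requirements. It is not the shipped LtFp8Matmul sample; the m/n/k parameter names and the E4M3-in / BF16-out type choice are illustrative assumptions.

```cpp
#include <cublasLt.h>
#include <cstdint>

// Minimal sketch (not the shipped LtFp8Matmul sample): cuBLASLt descriptor
// setup that satisfies the four requirements quoted above.
void createFp8Descriptors(cublasLtMatmulDesc_t* matmulDesc,
                          cublasLtMatrixLayout_t* Adesc,
                          cublasLtMatrixLayout_t* Bdesc,
                          cublasLtMatrixLayout_t* Ddesc,
                          int64_t m, int64_t n, int64_t k) {
  // Compute type CUBLAS_COMPUTE_32F with scale type CUDA_R_32F (required).
  cublasLtMatmulDescCreate(matmulDesc, CUBLAS_COMPUTE_32F, CUDA_R_32F);

  // "TN" format (required): A transposed, B non-transposed.
  cublasOperation_t transa = CUBLAS_OP_T;
  cublasOperation_t transb = CUBLAS_OP_N;
  cublasLtMatmulDescSetAttribute(*matmulDesc, CUBLASLT_MATMUL_DESC_TRANSA,
                                 &transa, sizeof(transa));
  cublasLtMatmulDescSetAttribute(*matmulDesc, CUBLASLT_MATMUL_DESC_TRANSB,
                                 &transb, sizeof(transb));

  // cuBLASLt layouts are column-major. With op(A) = A^T of shape m x k,
  // A itself is stored k x m with lda = k; B is stored k x n with ldb = k;
  // the output D is m x n with ldd = m. The device pointers passed later to
  // cublasLtMatmul() must all be 16-byte aligned (required).
  cublasLtMatrixLayoutCreate(Adesc, CUDA_R_8F_E4M3, k, m, k);
  cublasLtMatrixLayoutCreate(Bdesc, CUDA_R_8F_E4M3, k, n, k);
  cublasLtMatrixLayoutCreate(Ddesc, CUDA_R_16BF, m, n, m);
}
```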
Let me know if you have any other questions regarding FP8 in particular or cuBLAS in general.
I see, sorry I missed these in the cuBLAS doc. Thank you!
Okay, I understand the reason now. I think it's because cuBLASLt requires column-major storage and the TN format, so in the default example the leading dimensions are k (for A and B) and m (for the output), and the 16-byte alignment requirement then ends up constraining both m and k; see the sketch below.
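To spell out the arithmetic (my inference, not an official statement): FP8 elements are 1 byte, so in a column-major matrix with leading dimension ld, column j starts j * ld bytes past the base pointer, and every column start stays 16-byte aligned only if the base pointer is aligned and ld % 16 == 0. With lda = ldb = k and ldd = m in the TN layout, that matches the observed k % 16 == 0 and m % 16 == 0 behavior.

```cpp
#include <cstdint>

// Sketch of the alignment arithmetic (an inference, not official cuBLAS
// documentation). elemSize == 1 for FP8, so column j of a column-major
// matrix starts at base + j * ld bytes.
bool allColumnsAligned16(const void* base, int64_t ld) {
  const auto addr = reinterpret_cast<uintptr_t>(base);
  return addr % 16 == 0 && ld % 16 == 0;
}
```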
Hi,
I wonder what the minimum alignment requirement is for cuBLAS FP8 matmul (the LtFp8Matmul example here). I tried different M, N, K combinations; e.g., when I set M to 63, I got error messages like:
It seems that cuBLAS only supports shapes with M % 16 == 0, K % 16 == 0, and N % 8 == 0, but I cannot find this information in any official document.
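For reference, here is roughly how the shapes can be probed (a simplified sketch, not my exact test code; it assumes a descriptor-creation helper like the createFp8Descriptors() sketch in the reply above, and error handling is omitted):

```cpp
#include <cublasLt.h>
#include <cstdint>
#include <cstdio>

// Assumed helper, sketched earlier in this thread.
void createFp8Descriptors(cublasLtMatmulDesc_t*, cublasLtMatrixLayout_t*,
                          cublasLtMatrixLayout_t*, cublasLtMatrixLayout_t*,
                          int64_t, int64_t, int64_t);

// Hypothetical probe: for each M, build TN/FP8 descriptors and ask the
// heuristic whether any kernel can serve the shape.
void probeShapes(cublasLtHandle_t handle, int64_t n, int64_t k) {
  for (int64_t m = 60; m <= 64; ++m) {
    cublasLtMatmulDesc_t op;
    cublasLtMatrixLayout_t Adesc, Bdesc, Ddesc;
    createFp8Descriptors(&op, &Adesc, &Bdesc, &Ddesc, m, n, k);

    cublasLtMatmulPreference_t pref;
    cublasLtMatmulPreferenceCreate(&pref);
    cublasLtMatmulHeuristicResult_t result;
    int count = 0;
    cublasStatus_t st = cublasLtMatmulAlgoGetHeuristic(
        handle, op, Adesc, Bdesc, Ddesc, Ddesc, pref, 1, &result, &count);
    printf("M=%ld: %s\n", (long)m,
           (st == CUBLAS_STATUS_SUCCESS && count > 0) ? "supported"
                                                      : "not supported");

    cublasLtMatmulPreferenceDestroy(&pref);
    cublasLtMatrixLayoutDestroy(Adesc);
    cublasLtMatrixLayoutDestroy(Bdesc);
    cublasLtMatrixLayoutDestroy(Ddesc);
    cublasLtMatmulDescDestroy(op);
  }
}
```

Querying the heuristic this way avoids allocating device buffers just to check whether a given shape is supported.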