Open zhoeujei opened 5 days ago
The documentation states: "NVIDIA cuSPARSELt is a high-performance CUDA library dedicated to general matrix-matrix operations in which at least one operand is a sparse matrix"
Currently only `CUSPARSELT_SPARSITY_50_PERCENT` is supported. See [cusparseLtSparsity_t](https://docs.nvidia.com/cuda/cusparselt/types.html#cusparseltsparsity-t) in the documentation.
The number of zero values generally doesn't affect efficiency, since zero values that are not trimmed by pruning are treated as non-zero values in the computation. This is different from unstructured sparse operations.
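To make the point above concrete, here is a minimal NumPy sketch (not using cuSPARSELt itself, and not its actual pruning algorithm) of the 2:4 structured pattern behind `CUSPARSELT_SPARSITY_50_PERCENT`: in every group of 4 consecutive elements, 2 are zeroed out. Once a matrix is pruned to this pattern, the kernel skips exactly half of the multiplies regardless of how many additional zeros happen to remain, which is why an all-zeros matrix is no faster than an all-ones one.

```python
import numpy as np

def prune_2_4(mat):
    """Zero the 2 smallest-magnitude entries in each group of 4
    consecutive elements along the rows (the 2:4 structured pattern)."""
    rows, cols = mat.shape
    assert cols % 4 == 0, "row length must be a multiple of 4"
    out = mat.copy()
    groups = out.reshape(rows, cols // 4, 4)
    # indices of the 2 smallest-magnitude entries in each group of 4
    drop = np.argsort(np.abs(groups), axis=-1)[..., :2]
    np.put_along_axis(groups, drop, 0.0, axis=-1)
    return out

a = np.arange(1.0, 9.0).reshape(2, 4)  # [[1,2,3,4],[5,6,7,8]]
pruned = prune_2_4(a)
# each 4-wide group keeps its 2 largest entries: [[0,0,3,4],[0,0,7,8]]
```

Values that are already zero before pruning simply count as small-magnitude candidates; they do not reduce the fixed 50% of work the structured kernel performs.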
For AB_TYPE == INT8, the following are supported:
1) If A is sparse: opA=non-transpose with orderA=row-major, or opA=transpose with orderA=col-major; opB and orderB can be any combination.
2) If B is sparse: opB=non-transpose with orderB=col-major, or opB=transpose with orderB=row-major; opA and orderA can be any combination.
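The two rules above can be summarized in a small helper. This is only a sketch that encodes the quoted support matrix, not the library's actual validation logic; the string encodings ("A"/"B", "N"/"T", "row"/"col") are assumptions for illustration. Since the dense operand's op/order can be anything, only the sparse operand's (op, order) pair matters:

```python
def int8_combo_supported(sparse_operand, op_sparse, order_sparse):
    """Return True if the sparse operand's (op, order) pair is one the
    quoted INT8 rules allow. sparse_operand is "A" or "B"; op is
    "N" (non-transpose) or "T" (transpose); order is "row" or "col"."""
    if sparse_operand == "A":
        allowed = {("N", "row"), ("T", "col")}
    else:  # sparse operand is B
        allowed = {("N", "col"), ("T", "row")}
    return (op_sparse, order_sparse) in allowed

int8_combo_supported("A", "N", "row")  # True
int8_combo_supported("B", "N", "row")  # False: row-major B needs opB=T
```

Note how the last line matches the follow-up question below: with a row-major B, only opB=transpose is in the supported set.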
If a combination of data types + op + order is not supported, cusparseLtMatmulDescriptorInit()
returns an error. Please set the environment variable CUSPARSELT_LOG_LEVEL=1
(see the logging features in the documentation) to inspect the failure.
Let us know if you have more questions.
I have quite a few questions and hope to receive your answers.
1. You mentioned that cuSPARSELt supports dense matrix x dense matrix = dense matrix, but I encounter an error when calling cusparseLtMatmulDescriptorInit() in this setup.
2. Is the cusparseLtSparsity_t parameter of cusparseLtStructuredDescriptorInit() only allowed to be set to CUSPARSELT_SPARSITY_50_PERCENT? I get a compilation error when I set it to CUSPARSELT_SPARSITY_25_PERCENT.
3. During my testing, I found that the efficiency for the sparse matrix A is the same whether it is all ones or all zeros. Shouldn't the all-zeros matrix be more efficient?
4. If AB_TYPE == INT8, I found that opB of matrix B must be set to transpose, which is inconsistent with the description in your API.