NVIDIA / CUDALibrarySamples

CUDA Library Samples

Some issues during the use of cuSPARSELt #219

Open zhoeujei opened 5 days ago

zhoeujei commented 5 days ago

I have several questions and hope you can answer them.

1. You mention that cuSPARSELt supports dense matrix * dense matrix = dense matrix, but I get an error from cusparseLtMatmulDescriptorInit when I initialize it with this setup.

2. Is CUSPARSELT_SPARSITY_50_PERCENT the only value allowed for the cusparseLtSparsity_t parameter of cusparseLtStructuredDescriptorInit? I get a compilation error when I set it to CUSPARSELT_SPARSITY_25_PERCENT.

3. In my tests, the efficiency is the same whether the sparse matrix A is all ones or all zeros. Shouldn't the all-zeros matrix be more efficient?

4. If AB_TYPE == INT8, I found that opB for matrix B must be set to transpose, which is inconsistent with the description in your API documentation.

j4yan commented 2 days ago

@zhoeujei

  1. The documentation states: "NVIDIA cuSPARSELt is a high-performance CUDA library dedicated to general matrix-matrix operations in which at least one operand is a sparse matrix". A product in which both operands are dense is therefore outside its scope.

  2. Currently only CUSPARSELT_SPARSITY_50_PERCENT is supported. See [cusparseLtSparsity_t](https://docs.nvidia.com/cuda/cusparselt/types.html#cusparseltsparsity-t) in the documentation.

  3. The number of zero values generally doesn't affect efficiency, since zero values that are not trimmed are treated as non-zero values in the computation. This is different from unstructured sparse operations.

  4. For AB_TYPE == INT8, the following are supported: 1) If A is sparse: opA=non-transpose with orderA=row-major, and opA=transpose with orderA=col-major; opB and orderB can be any combination. 2) If B is sparse: opB=non-transpose with orderB=col-major, and opB=transpose with orderB=row-major; opA and orderA can be any combination. If a combination of data types, op, and order is not supported, cusparseLtMatmulDescriptorInit() would exit. Please set the environment variable CUSPARSELT_LOG_LEVEL=1 (see the logging features in the documentation) to check it out.
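The INT8 rules in answer 4 can be written down as a small predicate, which may help when checking a configuration before calling cusparseLtMatmulDescriptorInit(). This is only a sketch of the rules as stated in this reply: the enums `Op`, `Order`, `SparseOperand` and the function `int8_layout_supported` are hypothetical names made up for illustration and are not part of the cuSPARSELt API.

```cpp
// Hypothetical stand-ins for illustration only; these are NOT cuSPARSELt types.
enum class Op { NonTranspose, Transpose };
enum class Order { RowMajor, ColMajor };
enum class SparseOperand { A, B };

// Encodes the INT8 rules quoted in answer 4:
//   A sparse: (opA = non-transpose, orderA = row-major) or
//             (opA = transpose,     orderA = col-major); B is unrestricted.
//   B sparse: (opB = non-transpose, orderB = col-major) or
//             (opB = transpose,     orderB = row-major); A is unrestricted.
constexpr bool int8_layout_supported(SparseOperand sparse,
                                     Op opA, Order orderA,
                                     Op opB, Order orderB) {
    return sparse == SparseOperand::A
        ? ((opA == Op::NonTranspose && orderA == Order::RowMajor) ||
           (opA == Op::Transpose && orderA == Order::ColMajor))
        : ((opB == Op::NonTranspose && orderB == Order::ColMajor) ||
           (opB == Op::Transpose && orderB == Order::RowMajor));
}

// With A sparse, row-major, and not transposed, opB does not have to be transpose.
static_assert(int8_layout_supported(SparseOperand::A,
                                    Op::NonTranspose, Order::RowMajor,
                                    Op::NonTranspose, Order::ColMajor),
              "A sparse, non-transposed, row-major: any opB/orderB is accepted");

// With A sparse but col-major and not transposed, the rule rejects the setup
// regardless of how B is configured.
static_assert(!int8_layout_supported(SparseOperand::A,
                                     Op::NonTranspose, Order::ColMajor,
                                     Op::Transpose, Order::RowMajor),
              "A sparse, non-transposed, col-major is not in the supported list");
```

The checks run at compile time, so the snippet compiles only if the predicate agrees with the two cases spelled out in the comments. The library itself remains the authority; the log output enabled by CUSPARSELT_LOG_LEVEL=1 is the way to confirm why a given descriptor was rejected.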

Let us know if you have more questions.
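As a footnote to answer 3, the sketch below illustrates why an all-zeros matrix does not run faster than an all-ones matrix. It assumes that 50% structured sparsity means a 2:4 pattern (two kept values in every group of four), and it uses a simple "keep the two largest magnitudes" rule. Both the function names and the selection rule are assumptions for illustration, not the library's pruning routine.

```cpp
// Absolute value usable in constant expressions.
constexpr int magnitude(int v) { return v < 0 ? -v : v; }

// Returns a 4-bit keep mask for one group of four values (bit i set means
// slot i is kept). Illustrative rule: keep the two entries with the largest
// magnitude, with ties going to the lower index.
constexpr unsigned keep_mask_2of4(int v0, int v1, int v2, int v3) {
    int v[4] = {v0, v1, v2, v3};
    unsigned mask = 0;
    for (int pick = 0; pick < 2; ++pick) {
        int best = -1;
        for (int i = 0; i < 4; ++i) {
            if (mask & (1u << i)) continue;  // slot already kept
            if (best < 0 || magnitude(v[i]) > magnitude(v[best])) best = i;
        }
        mask |= 1u << best;
    }
    return mask;
}

// Number of kept slots in a 4-bit mask.
constexpr int kept_slots(unsigned mask) {
    return static_cast<int>((mask & 1u) + ((mask >> 1) & 1u) +
                            ((mask >> 2) & 1u) + ((mask >> 3) & 1u));
}

// All-zeros and all-ones groups keep the same number of slots, so the amount
// of stored data and arithmetic is the same for both.
static_assert(kept_slots(keep_mask_2of4(0, 0, 0, 0)) == 2, "all zeros keeps 2 slots");
static_assert(kept_slots(keep_mask_2of4(1, 1, 1, 1)) == 2, "all ones keeps 2 slots");
```

Under this assumption, every group contributes exactly two stored values whatever those values are, so a kept zero costs as much as a kept non-zero. That matches the statement that untrimmed zeros are treated as non-zero values in the computation.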