NVIDIA / CUDALibrarySamples

CUDA Library Samples

cusparse csrsort example #145

Closed jinz2014 closed 1 year ago

jinz2014 commented 1 year ago

Please add a csrsort example that works when helper functions such as cusparseDgthr are undefined, as they are in CUDA 12.0. Thanks.

qanhpham commented 1 year ago

Hi @jinz2014, here's an example for csrsort without using cusparseDgthr:

// A is a 3x3 sparse matrix, base-0
//     | 1 2 3 |
// A = | 4 5 6 |
//     | 7 8 9 |
const int m = 3;
const int n = 3;
const int nnz = 9;
csrRowPtr[m+1] = { 0, 3, 6, 9 };                // int, on device
csrColInd[nnz] = { 2, 1, 0, 0, 2, 1, 1, 2, 0 }; // int, on device (unsorted within each row)
csrVal[nnz]    = { 3, 2, 1, 4, 6, 5, 8, 9, 7 }; // on device
size_t pBufferSizeInBytes = 0;
void *pBuffer = NULL;
int *P = NULL;

// step 1: allocate buffer
cusparseXcsrsort_bufferSizeExt(handle, m, n, nnz, csrRowPtr, csrColInd, &pBufferSizeInBytes);
cudaMalloc( &pBuffer, sizeof(char)* pBufferSizeInBytes);

// step 2: setup permutation vector P to identity
cudaMalloc( (void**)&P, sizeof(int)*nnz);
cusparseCreateIdentityPermutation(handle, nnz, P);

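// step 3: sort CSR format
// descrA is a cusparseMatDescr_t from cusparseCreateMatDescr (general type, base-0 here)
// csrColInd is sorted in place within each row; P records the permutation applied
cusparseXcsrsort(handle, m, n, nnz, descrA, csrRowPtr, csrColInd, P, pBuffer);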

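// step 4: gather sorted csrVal
// cusparseDgthr(handle, nnz, csrVal, csrVal_sorted, P, CUSPARSE_INDEX_BASE_ZERO);
// cusparseDgthr was removed, so use the generic cusparseGather instead.
// The value type below must match the element type of csrVal. CUDA_R_64F assumes
// double values, as the cusparseDgthr call above implies; use CUDA_R_32F for float.
double *csrVal_sorted = NULL; // output buffer, assumed double, nnz elements on device
cudaMalloc((void**)&csrVal_sorted, sizeof(double) * nnz);

cusparseSpVecDescr_t vec_csr_val_sorted;
cusparseDnVecDescr_t vec_csr_val;

// sparse vector: indices = P, values = csrVal_sorted (output)
cusparseCreateSpVec(&vec_csr_val_sorted, nnz, nnz, P, csrVal_sorted,
                    CUSPARSE_INDEX_32I,
                    CUSPARSE_INDEX_BASE_ZERO, CUDA_R_64F);

// dense vector: the unsorted values (input)
cusparseCreateDnVec(&vec_csr_val, nnz, csrVal, CUDA_R_64F);

// csrVal_sorted[i] = csrVal[P[i]]
cusparseGather(handle, vec_csr_val, vec_csr_val_sorted);

// release the descriptors once the gather is done
cusparseDestroySpVec(vec_csr_val_sorted);
cusparseDestroyDnVec(vec_csr_val);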
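
To check the GPU result, it may help to have a small host-side reference for what the sort and gather steps are expected to produce. The sketch below is not part of cuSPARSE: the function names csrsort_reference and gather_reference are hypothetical, the values are assumed to be double, and it uses std::stable_sort on the CPU instead of the library routine. It sorts the column indices within each row, returns the permutation P, and then gathers the values so that sorted[i] = val[P[i]].

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical host-side reference, for checking GPU output only.
// Sorts the column indices of each CSR row in place and returns the
// permutation P such that sortedVal[i] = val[P[i]].
std::vector<int> csrsort_reference(const std::vector<int>& rowPtr,
                                   std::vector<int>& colInd) {
    assert(!rowPtr.empty() &&
           rowPtr.back() == static_cast<int>(colInd.size()));

    std::vector<int> P(colInd.size());
    std::iota(P.begin(), P.end(), 0);  // identity permutation, as in step 2

    const std::vector<int> unsorted = colInd;  // original indices act as sort keys
    for (std::size_t row = 0; row + 1 < rowPtr.size(); ++row) {
        // Sort only this row's slice of P, keyed by column index.
        std::stable_sort(P.begin() + rowPtr[row], P.begin() + rowPtr[row + 1],
                         [&](int a, int b) { return unsorted[a] < unsorted[b]; });
    }
    // Apply the permutation to the column indices.
    for (std::size_t i = 0; i < P.size(); ++i) colInd[i] = unsorted[P[i]];
    return P;
}

// Gather: out[i] = val[P[i]], the operation step 4 performs on the device.
std::vector<double> gather_reference(const std::vector<double>& val,
                                     const std::vector<int>& P) {
    assert(val.size() == P.size());
    std::vector<double> out(P.size());
    for (std::size_t i = 0; i < P.size(); ++i) out[i] = val[P[i]];
    return out;
}
```

For the 3x3 matrix above, this reference yields P = { 2, 1, 0, 3, 5, 4, 8, 6, 7 }, column indices { 0, 1, 2, 0, 1, 2, 0, 1, 2 }, and sorted values 1 through 9, which is what copying P, csrColInd, and csrVal_sorted back from the device should show.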
jinz2014 commented 1 year ago

Thanks!