Xtra-Computing / thundersvm

ThunderSVM: A Fast SVM Library on GPUs and CPUs
Apache License 2.0
1.55k stars 215 forks

latest CUDA support #222

Closed azzi23 closed 3 years ago

azzi23 commented 4 years ago

In the latest CUDA version, the function cusparseScsrmm2 (called from kernelmatrix_kernel.cu) is deprecated.

I tried to adapt the code to a replacement, without success; I am sure that is because I am a beginner.

    cusparseSpMatDescr_t matA;
    cusparseDnMatDescr_t matB, matC;
    void* dBuffer = NULL;
    size_t bufferSize = 0;
    float alpha = 1.0f;
    float beta = 0.0f;

    cusparseCreate(&handle);

    // Create sparse matrix A (m x k) in CSR format.
    // Note: the generic cuSPARSE API expects device pointers, not host pointers.
    cusparseCreateCsr(&matA, m, k, nnz,
        (void*)csr_row_ptr.device_data(),
        (void*)csr_col_ind.device_data(),
        (void*)csr_val.device_data(),
        CUSPARSE_INDEX_32I, CUSPARSE_INDEX_32I,
        CUSPARSE_INDEX_BASE_ZERO, CUDA_R_32F);

    // Create dense matrix B (n x k, leading dimension n); with op(B) = transpose
    // below, the effective operand is the k x n matrix B^T.
    cusparseCreateDnMat(&matB, n, k, n, (void*)dense_mat.device_data(), CUDA_R_32F, CUSPARSE_ORDER_COL);

    // Create dense matrix C (m x n, leading dimension m) for the result.
    cusparseCreateDnMat(&matC, m, n, m, (void*)result.device_data(), CUDA_R_32F, CUSPARSE_ORDER_COL);

    // Query the workspace size and allocate the external buffer.
    cusparseSpMM_bufferSize(
        handle, CUSPARSE_OPERATION_NON_TRANSPOSE, CUSPARSE_OPERATION_TRANSPOSE,
        &alpha, matA, matB, &beta, matC, CUDA_R_32F,
        CUSPARSE_SPMM_ALG_DEFAULT, &bufferSize);

    cudaMalloc(&dBuffer, bufferSize);

    // Execute the SpMM: C = alpha * A * B^T + beta * C.
    cusparseSpMM(handle, CUSPARSE_OPERATION_NON_TRANSPOSE, CUSPARSE_OPERATION_TRANSPOSE,
        &alpha, matA, matB, &beta, matC, CUDA_R_32F,
        CUSPARSE_SPMM_ALG_DEFAULT, dBuffer);

Could you assist with the above?

jbuisine commented 4 years ago

Same problem here with CUDA v11.0. Is there any support?

Thanks

Kurt-Liuhf commented 4 years ago

ThunderSVM supports the newest CUDA now. We have used cusparseSpMM() to replace cusparse*csrmm(). You can give it a try, thanks.

jbuisine commented 4 years ago

Thanks for your answer.

I can now compile using cuda v11.0.

I tried the example from README.md file:

./bin/thundersvm-train -c 100 -g 0.5 ../dataset/test_dataset.txt

Output of training is:

2020-07-22 12:08:42,340 INFO [default] loading dataset from file "../dataset/test_dataset.txt"
2020-07-22 12:08:42,392 INFO [default] #instances = 150, #features = 4
2020-07-22 12:08:42,392 INFO [default] training C-SVC
2020-07-22 12:08:42,392 INFO [default] C = 100
2020-07-22 12:08:42,392 INFO [default] gamma = 0.5
2020-07-22 12:08:42,447 INFO [default] #classes = 3
2020-07-22 12:08:42,600 INFO [default] working set size = 64
2020-07-22 12:08:42,600 INFO [default] training start
2020-07-22 12:08:42,821 INFO [default] global iter = 0, total local iter = 122, diff = 2
2020-07-22 12:08:42,823 INFO [default] global iter = 4, total local iter = 488, diff = 0.000774812
2020-07-22 12:08:42,823 INFO [default] training finished
2020-07-22 12:08:42,823 INFO [default] obj = -1241.62
2020-07-22 12:08:42,824 INFO [default] rho = -83.0019
2020-07-22 12:08:42,824 INFO [default] #sv = 9
2020-07-22 12:08:42,824 INFO [default] working set size = 64
2020-07-22 12:08:42,824 INFO [default] training start
2020-07-22 12:08:42,828 INFO [default] global iter = 0, total local iter = 952, diff = 2
2020-07-22 12:09:15,271 INFO [default] global iter = 100, total local iter = 9901111, diff = nan
2020-07-22 12:09:48,159 INFO [default] global iter = 200, total local iter = 19901211, diff = nan
2020-07-22 12:10:21,047 INFO [default] global iter = 300, total local iter = 29901311, diff = nan
2020-07-22 12:10:53,933 INFO [default] global iter = 400, total local iter = 39901411, diff = nan
2020-07-22 12:11:26,834 INFO [default] global iter = 500, total local iter = 49901511, diff = nan
2020-07-22 12:11:59,893 INFO [default] global iter = 600, total local iter = 59901611, diff = nan
2020-07-22 12:12:33,042 INFO [default] global iter = 700, total local iter = 69901711, diff = nan
2020-07-22 12:13:06,191 INFO [default] global iter = 800, total local iter = 79901811, diff = nan
2020-07-22 12:13:39,340 INFO [default] global iter = 900, total local iter = 89901911, diff = nan
....

I have already let the computation reach 9,000 global iterations, and training does not stop. Is this unexpected behavior? Is it due to diff = nan?

Kurt-Liuhf commented 4 years ago

Thank you for your feedback. Some problems might still exist in the code. We will try our best to fix them as soon as possible.

seacheo commented 3 years ago

Were you able to successfully run ThunderSVM with CUDA 11? I still have trouble replacing cusparseScsrmm2 with cusparseSpMM, as they take different arguments. How does the code you provided replace cusparseScsrmm2? Thanks!
