clMathLibraries / clSPARSE

a software library containing Sparse functions written in OpenCL
Apache License 2.0

Fix invalid memory access on Nvidia GPUs in CSR-Adaptive SpMV kernel #191

Closed: mkre closed this 7 years ago

mkre commented 7 years ago

I observed an invalid memory access with clSPARSE and CUDA 7.5 on an Nvidia Tesla K20m. The CSR-Adaptive SpMV kernel returned -9999 for some input matrices.

The fix is to never read vals and cols at indices beyond rowPtrs[stop_row]. The previous workaround did not prevent these out-of-bounds reads.
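
To illustrate the kind of bounds check being discussed, here is a minimal CPU-side sketch in Python. It is not the clSPARSE kernel code: the function name, the tiny `WG_SIZE`, the loop shape, and both guard variants are assumptions made for illustration only. It emulates one workgroup streaming the nonzeros of a block of CSR rows and records which indices would be read past `row_ptrs[stop_row]` when the guard tests a value that is identical for every work-item.

```python
# Hypothetical CPU emulation of one workgroup streaming a block of CSR rows.
# Not clSPARSE code: WG_SIZE, the loop shape, and both guards are assumptions.

WG_SIZE = 4  # work-items per workgroup (kept tiny for illustration)


def stream_block(vals, cols, x, row_ptrs, row, stop_row, per_item_guard):
    """Emulate rows [row, stop_row); return (products, out_of_bounds_indices)."""
    start = row_ptrs[row]
    end = row_ptrs[stop_row]  # first index that no longer belongs to this block
    products = []
    out_of_bounds = []
    for base in range(start, end, WG_SIZE):  # one pass of the whole workgroup
        for lid in range(WG_SIZE):           # lid plays the role of the local id
            idx = base + lid
            if per_item_guard:
                allowed = idx < end   # each work-item checks its own index
            else:
                allowed = base < end  # same value checked by every work-item
            if not allowed:
                continue
            if idx >= end:
                # A real kernel would read vals[idx] and cols[idx] here.
                out_of_bounds.append(idx)
                continue
            products.append(vals[idx] * x[cols[idx]])
    return products, out_of_bounds


# Small 3x3 example matrix in CSR form (made-up data).
row_ptrs = [0, 2, 5, 6]
cols = [0, 1, 0, 1, 2, 2]
vals = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x = [1.0, 1.0, 1.0]

_, bad_uniform = stream_block(vals, cols, x, row_ptrs, 0, 3, per_item_guard=False)
_, bad_per_item = stream_block(vals, cols, x, row_ptrs, 0, 3, per_item_guard=True)
print(bad_uniform)   # [6, 7]
print(bad_per_item)  # []
```

With six nonzeros and four work-items, the second pass covers indices 4 through 7. The uniform guard lets indices 6 and 7 through, while the per-work-item guard stops them.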


jlgreathouse commented 7 years ago

Hi mkre. Thanks for catching this problem. You're right: we were checking the same value for every work-item in the workgroup, so some work-items at the end of the loop could read past the end of the arrays.

Could you resubmit this PR targeting the 'develop' branch? We prefer to push all new changes there and then only snapshot to master during a major release.