Closed mkre closed 7 years ago
Hi mkre. Thanks for catching this problem. You're right, we were checking the same value for every work-item in the workgroup. This means that some work-items at the end of the loop may have overflowed.
Could you resubmit this PR targeting the 'develop' branch? We prefer to push all new changes there and then only snapshot to master during a major release.
I observed an invalid memory access with clSPARSE and CUDA 7.5 on an Nvidia Tesla K20m. The CSR-Adaptive SpMV kernel returned -9999 for some input matrices.
This is fixed by not accessing vals and cols beyond rowPtrs[stop_row]. The previous workaround didn't prevent this.
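To illustrate the kind of per-work-item bound described above, here is a minimal sketch in plain C that emulates one workgroup serially. It is not the clSPARSE kernel: `stream_products`, `WG_SIZE`, and the `partial` buffer are hypothetical names chosen for this example, and the real kernel's layout may differ. The point is only that each work-item compares its own index against `rowPtrs[stop_row]` rather than relying on a value shared by the whole workgroup.

```c
#include <stddef.h>

#define WG_SIZE 4 /* hypothetical workgroup size used only for this emulation */

/*
 * Serially emulates one workgroup streaming the nonzeros of rows
 * [start_row, stop_row) of a CSR matrix. Each emulated work-item (lid)
 * checks its own index against row_ptrs[stop_row] before touching
 * vals/cols, so nothing past the last valid nonzero is read.
 * Returns the number of nonzeros actually loaded.
 */
static size_t stream_products(const int *row_ptrs, const int *cols,
                              const double *vals, const double *x,
                              int start_row, int stop_row, double *partial)
{
    const int first = row_ptrs[start_row];
    const int last = row_ptrs[stop_row]; /* one past the last valid nonzero */
    size_t loaded = 0;

    for (int base = first; base < last; base += WG_SIZE) {
        for (int lid = 0; lid < WG_SIZE; ++lid) {
            const int idx = base + lid;
            /* Per-work-item check: testing only `base` would let the
             * trailing work-items of the final iteration run past `last`. */
            if (idx < last) {
                partial[idx - first] = vals[idx] * x[cols[idx]];
                ++loaded;
            }
        }
    }
    return loaded;
}
```

With five nonzeros and a workgroup size of four, the second pass through the outer loop has only one valid element; the other three emulated work-items skip the load instead of reading past the end of `vals` and `cols`.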