ROCm / rocBLAS

Next generation BLAS implementation for ROCm platform
https://rocm.docs.amd.com/projects/rocBLAS/en/latest/

Improve the performance of gemvn by reducing the threads/block (#2649) #1461

Closed NaveenElumalaiAMD closed 3 weeks ago

NaveenElumalaiAMD commented 4 weeks ago

Sure, the changes in this PR are already in the develop branch here: https://github.com/ROCm/rocBLAS/commit/b70516c79f995762addf1120483e0ca5831df6dc