ROCm / rocBLAS

Next generation BLAS implementation for ROCm platform
https://rocm.docs.amd.com/projects/rocBLAS/en/latest/

fixing predicate ordering for fp16alt impl to unbreak distance modes #1325

Closed by yoichiyoshida 1 year ago

yoichiyoshida commented 1 year ago

Fixes performance regressions caused by a predicate-ordering bug introduced by the fp16altrnz 5.6 hotfix (#1309).