codeplaysoftware / portBLAS

An implementation of BLAS using the SYCL open standard.
Apache License 2.0
250 stars, 48 forks

Add fixes and new NVIDIA gemm configurations #469

Closed: pgorlani closed this 11 months ago

pgorlani commented 11 months ago

This patch adds new configurations that improve gemm performance on the NVIDIA_GPU target.

Moreover,