codeplaysoftware / portBLAS

An implementation of BLAS using the SYCL open standard.
Apache License 2.0

Fix benchmarks #477

Closed pgorlani closed 11 months ago

pgorlani commented 1 year ago

This patch allows the following benchmarks to synchronize on the operator's event rather than on the completion of all tasks in the queue. This improves the overall time measurement, especially for the NVIDIA GPU target.
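To illustrate the difference between the two synchronization points, here is a minimal sketch in plain standard C++. It is an analogy only, not SYCL and not portBLAS code: `MockQueue`, `Result` and `demo` are hypothetical names, and `std::shared_future` stands in for the event returned by an operator. In SYCL the corresponding calls would be `sycl::event::wait()` on the returned event versus `sycl::queue::wait()` on the whole queue.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <utility>
#include <vector>

// Hypothetical stand-in for a device queue: every submission returns
// a per-task handle (the "event") and is also tracked by the queue.
struct MockQueue {
  std::vector<std::shared_future<void>> pending;

  template <typename F>
  std::shared_future<void> submit(F&& f) {
    std::shared_future<void> ev =
        std::async(std::launch::async, std::forward<F>(f)).share();
    pending.push_back(ev);
    return ev;
  }

  // Queue-wide synchronization: blocks until every submitted task is done.
  void wait() {
    for (auto& ev : pending) ev.wait();
    pending.clear();
  }
};

struct Result {
  bool other_done_after_event_wait;
  bool other_done_after_queue_wait;
};

Result demo() {
  MockQueue q;
  std::promise<void> gate;
  std::shared_future<void> gate_open = gate.get_future().share();

  // Unrelated work sitting in the queue; it cannot finish until the gate opens.
  auto other = q.submit([gate_open] { gate_open.wait(); });

  // The operation a benchmark would want to time.
  auto op = q.submit([] {});

  // Event-style synchronization: returns once this one operation is done,
  // even though unrelated work is still outstanding.
  op.wait();
  assert(op.wait_for(std::chrono::seconds(0)) == std::future_status::ready);
  bool after_event =
      other.wait_for(std::chrono::seconds(0)) == std::future_status::ready;

  // Queue-style synchronization: only returns after everything has finished,
  // so a timer stopped here would also include the unrelated work.
  gate.set_value();
  q.wait();
  bool after_queue =
      other.wait_for(std::chrono::seconds(0)) == std::future_status::ready;

  return {after_event, after_queue};
}
```

In this sketch, a timer stopped right after `op.wait()` covers only the operation of interest, while a timer stopped after `q.wait()` also covers whatever else happened to be in the queue.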