ICLDisco / dplasma

DPLASMA is a highly optimized, accelerator-aware implementation of a dense linear algebra package for distributed heterogeneous systems. It is designed to deliver sustained performance on distributed systems where each node features multiple sockets of multicore processors and, if available, accelerators, using the PaRSEC runtime as a backend.

Add control dependencies in SYRK #97

Open QingleiCao opened 9 months ago

QingleiCao commented 9 months ago

Add control dependencies in SYRK to limit parallelism and, therefore, memory usage.
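
As a rough illustration of the idea (not DPLASMA or PaRSEC code), the sketch below models a set of tasks with no data dependencies between them, each holding one buffer while it runs. The function name `peak_in_flight` and the `window` parameter are hypothetical. It assumes a greedy scheduler with unlimited workers and unit-time tasks, and adds a control-only edge from task `k` to task `k + window`, so task `k + window` cannot start before task `k` has finished.

```python
def peak_in_flight(num_tasks, window=None):
    """Return the peak number of tasks running at once.

    Assumptions (illustrative only): tasks are otherwise independent,
    each takes one scheduling round, and the scheduler starts every
    ready task immediately. With `window` set, task k additionally
    waits for task k - window to finish (a control dependency that
    carries no data).
    """
    if window is not None and window < 1:
        raise ValueError("window must be at least 1")

    done = set()
    pending = set(range(num_tasks))
    peak = 0
    while pending:
        # A task is ready if it has no control predecessor, or if that
        # predecessor has already completed.
        ready = [
            k for k in sorted(pending)
            if window is None or k < window or (k - window) in done
        ]
        peak = max(peak, len(ready))
        pending.difference_update(ready)
        done.update(ready)  # all tasks started this round finish together
    return peak


if __name__ == "__main__":
    print(peak_in_flight(8))            # no control dependencies
    print(peak_in_flight(8, window=2))  # at most 2 tasks in flight
```

In this model the peak number of live buffers drops from the total task count to `window`, which is the trade-off the proposal describes: less exposed parallelism in exchange for a bounded memory footprint.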