DPLASMA is a highly optimized, accelerator-aware implementation of a dense linear algebra package for distributed heterogeneous systems. It is designed to deliver sustained performance on distributed systems where each node features multiple sockets of multicore processors and, if available, accelerators, using the PaRSEC runtime as a backend.
Remove the taskpool pointer from the caching key #72
This will allow us to reuse the data_t and data_copy_t between algorithms from the same library, in addition to getting access to the cached values on the device.
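To illustrate why dropping the taskpool pointer from the key enables reuse, here is a minimal sketch of a lookup-or-create cache. Every name in it (`taskpool_t`, `cached_data_t`, `cache_t`, `cache_lookup_or_create`, the `key_includes_taskpool` flag) is hypothetical and stands in for the real structures; it is not the PaRSEC or DPLASMA API, and the actual change in this PR may be organized differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration only; not the PaRSEC/DPLASMA types. */
typedef struct { int id; } taskpool_t;
typedef struct { uint64_t data_key; int version; } cached_data_t;

#define CACHE_SLOTS 16

typedef struct {
    const taskpool_t *tp;   /* taskpool that created the entry */
    uint64_t data_key;      /* identifies the piece of data (e.g. a tile) */
    cached_data_t value;    /* the cached object we want to reuse */
    int used;
} slot_t;

typedef struct {
    slot_t slots[CACHE_SLOTS];
    int key_includes_taskpool;  /* 1: key is (taskpool, data_key); 0: key is data_key only */
    size_t nb_created;          /* number of entries created so far */
} cache_t;

/* Return the cached entry matching the key, creating it if it is missing.
 * Returns NULL when the cache is full. */
static cached_data_t *cache_lookup_or_create(cache_t *cache,
                                             const taskpool_t *tp,
                                             uint64_t data_key)
{
    assert(cache != NULL);
    slot_t *free_slot = NULL;

    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        slot_t *s = &cache->slots[i];
        if (!s->used) {
            if (free_slot == NULL) free_slot = s;  /* remember first free slot */
            continue;
        }
        if (s->data_key != data_key) continue;
        /* With the taskpool in the key, entries from other taskpools never match. */
        if (cache->key_includes_taskpool && s->tp != tp) continue;
        return &s->value;
    }

    if (free_slot == NULL) return NULL;

    free_slot->used = 1;
    free_slot->tp = tp;
    free_slot->data_key = data_key;
    free_slot->value.data_key = data_key;
    free_slot->value.version = 0;
    cache->nb_created++;
    return &free_slot->value;
}
```

In this sketch, two taskpools that ask for the same `data_key` get two separate entries when `key_includes_taskpool` is set, and share a single entry when it is cleared. That sharing is the kind of reuse between algorithms the description refers to.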