ICLDisco / dplasma

DPLASMA is a highly optimized, accelerator-aware implementation of a dense linear algebra package for distributed heterogeneous systems. It is designed to deliver sustained performance on distributed systems where each node features multiple sockets of multicore processors and, if available, accelerators, using the PaRSEC runtime as a backend.

Hotfix: zpotrf_dtd on CUDA-enabled systems #51

Closed: therault closed this 2 years ago

therault commented 2 years ago

Use the extended dtd_insert_task_with_task_class interface for the POTRF task so that it can run on the GPU while still supporting runs on CPU-only setups.
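
The idea behind a task class with more than one body can be sketched in plain C. This is only an illustration under stated assumptions: it does not use or reproduce the PaRSEC DTD API, and every name in it (device_kind_t, task_class_t, make_potrf_class, select_device, insert_task) is hypothetical. The bodies are placeholders that report where they ran instead of performing a factorization.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device kinds for illustration; these are not PaRSEC identifiers. */
typedef enum { DEV_CPU = 0, DEV_GPU = 1, DEV_COUNT = 2 } device_kind_t;

/* A task body receives an n-by-n tile and returns the device it ran on. */
typedef device_kind_t (*task_body_t)(double *tile, int n);

/* A task class groups one optional body per device kind. */
typedef struct {
    const char *name;
    task_body_t body[DEV_COUNT];
} task_class_t;

/* Placeholder CPU body: a real one would factorize the tile on the host. */
device_kind_t potrf_cpu_body(double *tile, int n)
{
    (void)tile; (void)n;
    return DEV_CPU;
}

/* Placeholder GPU body: a real one would launch the factorization on a device. */
device_kind_t potrf_gpu_body(double *tile, int n)
{
    (void)tile; (void)n;
    return DEV_GPU;
}

/* Build a POTRF task class that carries both a CPU and a GPU body. */
task_class_t make_potrf_class(void)
{
    task_class_t tc = { "potrf", { potrf_cpu_body, potrf_gpu_body } };
    return tc;
}

/* Prefer the GPU body when a GPU is present and the class provides one;
 * otherwise fall back to the CPU body, which must always exist. */
device_kind_t select_device(const task_class_t *tc, int gpu_available)
{
    assert(tc->body[DEV_CPU] != NULL);
    if (gpu_available && tc->body[DEV_GPU] != NULL) {
        return DEV_GPU;
    }
    return DEV_CPU;
}

/* Run the task through the body chosen for the current setup. */
device_kind_t insert_task(const task_class_t *tc, int gpu_available,
                          double *tile, int n)
{
    device_kind_t dev = select_device(tc, gpu_available);
    return tc->body[dev](tile, n);
}
```

Calling insert_task with gpu_available set to 0 always runs the CPU body, which mirrors the CPU-only fallback described above; a class whose GPU slot is NULL behaves the same way even when a GPU is present.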