DPLASMA is a highly optimized, accelerator-aware implementation of a dense linear algebra package for distributed heterogeneous systems. It is designed to deliver sustained performance on distributed systems where each node features multiple sockets of multicore processors and, if available, accelerators, using the PaRSEC runtime as a backend.
Use the extended dtd_insert_task_with_task_class interface for the POTRF task so that it can run on the GPU while still supporting runs on CPU-only setups.
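
The fallback behaviour described above can be pictured with a minimal, self-contained sketch. It is not PaRSEC code: `task_class_t`, `make_potrf_class`, `insert_task`, and the two bodies are hypothetical stand-ins, and the task runs immediately instead of being scheduled asynchronously. It only illustrates the idea of a task class that carries one body per device type and falls back to the CPU body when no GPU is usable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; none of these are PaRSEC types or calls. */
typedef enum { DEV_CPU = 0, DEV_GPU = 1, DEV_COUNT = 2 } device_kind_t;

/* A body is one device-specific implementation of a task.
 * In this sketch it only reports which device it would have run on. */
typedef device_kind_t (*body_fn_t)(double *tile, int n);

typedef struct {
    const char *name;
    body_fn_t   bodies[DEV_COUNT]; /* NULL where no body is registered */
} task_class_t;

static device_kind_t potrf_cpu_body(double *tile, int n)
{
    (void)tile; (void)n; /* a real body would call a CPU Cholesky kernel here */
    return DEV_CPU;
}

static device_kind_t potrf_gpu_body(double *tile, int n)
{
    (void)tile; (void)n; /* a real body would launch a GPU Cholesky kernel here */
    return DEV_GPU;
}

/* Build a POTRF task class: the CPU body is always registered,
 * the GPU body only when the build or run supports it. */
static task_class_t make_potrf_class(int with_gpu_body)
{
    task_class_t tc = { "potrf", { NULL, NULL } };
    tc.bodies[DEV_CPU] = potrf_cpu_body;
    if (with_gpu_body) {
        tc.bodies[DEV_GPU] = potrf_gpu_body;
    }
    return tc;
}

/* Run a task of the given class: use the GPU body when a GPU is available
 * and a GPU body exists, otherwise fall back to the CPU body. */
static device_kind_t insert_task(const task_class_t *tc, int gpu_available,
                                 double *tile, int n)
{
    assert(tc != NULL && tc->bodies[DEV_CPU] != NULL);
    device_kind_t dev = (gpu_available && tc->bodies[DEV_GPU] != NULL)
                            ? DEV_GPU : DEV_CPU;
    return tc->bodies[dev](tile, n);
}
```

Under these assumptions, calling `insert_task(&tc, 1, tile, n)` on a class built with `make_potrf_class(1)` picks the GPU body, while the same call with `gpu_available` set to 0, or on a class built without a GPU body, runs the CPU body. Keeping the CPU body mandatory is what lets a single code path serve both kinds of setup.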