ICLDisco / dplasma

DPLASMA is a highly optimized, accelerator-aware implementation of a dense linear algebra package for distributed heterogeneous systems. It is designed to deliver sustained performance on distributed systems where each node features multiple sockets of multicore processors and, if available, accelerators, using the PaRSEC runtime as a backend.

Command line control for uplo #130

Open abouteiller opened 1 month ago

Description

We cannot control uplo in dpotrf and friends (it is hardcoded). This should be a command line parameter.

In addition, the performance of PO Upper is really bad on some hardware (e.g., ROCm) due to poor kernel optimization in rocBLAS/cuBLAS itself, so being able to investigate both LO/UP performance is important.

Describe the solution you'd like

--uplo upper or --uplo lower
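
A minimal sketch of how such a flag could be parsed with getopt_long. All names here (demo_uplo_t, demo_parse_uplo, demo_uplo_from_args) are hypothetical and stand in for whatever enum and option table the testers actually use; the fallback value is also an assumption, since the issue does not say which triangle is currently hardcoded.

```c
#include <getopt.h>
#include <stddef.h>
#include <strings.h>

/* Hypothetical stand-in for the library's uplo enum (not the DPLASMA type). */
typedef enum {
    DEMO_UPLO_INVALID = -1,
    DEMO_UPLO_UPPER,
    DEMO_UPLO_LOWER
} demo_uplo_t;

/* Map the option's string value to the enum, ignoring case. */
static demo_uplo_t demo_parse_uplo(const char *s)
{
    if (s == NULL) return DEMO_UPLO_INVALID;
    if (strcasecmp(s, "upper") == 0) return DEMO_UPLO_UPPER;
    if (strcasecmp(s, "lower") == 0) return DEMO_UPLO_LOWER;
    return DEMO_UPLO_INVALID;
}

/* Scan argv for "--uplo <value>".
 * Returns `fallback` when the option is absent, and DEMO_UPLO_INVALID
 * for an unrecognized value, a missing value, or an unknown option.
 * If the option is repeated, the last occurrence wins. */
static demo_uplo_t demo_uplo_from_args(int argc, char **argv,
                                       demo_uplo_t fallback)
{
    static const struct option long_opts[] = {
        { "uplo", required_argument, NULL, 'u' },
        { NULL,   0,                 NULL, 0   }
    };
    demo_uplo_t uplo = fallback;
    int c;

    optind = 1;  /* restart scanning from the first argument */
    opterr = 0;  /* report errors via the return value, not stderr */
    while ((c = getopt_long(argc, argv, "", long_opts, NULL)) != -1) {
        if (c == 'u')
            uplo = demo_parse_uplo(optarg);
        else
            return DEMO_UPLO_INVALID;
    }
    return uplo;
}
```

A tester would call demo_uplo_from_args(argc, argv, <current hardcoded value>) once at startup and pass the result to the factorization in place of the constant, so existing runs keep their behavior when the flag is omitted.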