ICLDisco / dplasma

DPLASMA is a highly optimized, accelerator-aware implementation of a dense linear algebra package for distributed heterogeneous systems. It is designed to deliver sustained performance on distributed systems where each node features multiple sockets of multicore processors and, if available, accelerators, using the PaRSEC runtime as a backend.

Use the ROCm/HIP device to accelerate certain DPLASMA kernels #57

Closed: abouteiller closed this 3 months ago

abouteiller commented 2 years ago

This PR adds ROCm-enabled kernels to the GEMM, POTRF, and memory-aware GEMM operations.

bosilca commented 1 year ago

As discussed on 03/31/23, we need to rebase and check the result. This will be tested next week on Frontier, so we need it to be ready.

bosilca commented 1 year ago

Please squash to fewer commits.

abouteiller commented 6 months ago

This is ready to merge, aside from the request to squash to fewer commits.

abouteiller commented 6 months ago

I decided to go for a squash merge. This is ready for final review @bosilca @devreal

The error in ctest is #115, which is preexisting and unrelated to HIP.