ICLDisco / dplasma

DPLASMA is a highly optimized, accelerator-aware implementation of a dense linear algebra package for distributed heterogeneous systems. It is designed to deliver sustained performance on distributed systems where each node features multiple sockets of multicore processors and, if available, accelerators, using the PaRSEC runtime as a backend.

bugfix: we must count the actual number of cuda devices #109

Closed. abouteiller closed this 5 months ago

abouteiller commented 5 months ago

We need the actual CUDA device count to decide whether or not to register the GPU workspaces.

devreal commented 5 months ago

Can we have an API in PaRSEC that returns the number of devices for a given type?

abouteiller commented 5 months ago

I reworked the PR after discovering that we were already doing the device counting in testing/common.c.

Both points raised still stand: the existing code could use the proposed API that counts devices of a particular type, and the code indeed handles only CUDA.
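For readers following along, here is a minimal, self-contained sketch of the idea under discussion: count the devices of one type, then use that count to decide whether GPU workspaces are needed. The types and functions below (`device_t`, `device_type_t`, `count_devices_of_type`, `needs_gpu_workspaces`) are hypothetical stand-ins invented for illustration. They are not the PaRSEC API and not the code in testing/common.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a runtime's device descriptors.
 * These are NOT PaRSEC types; they only illustrate the counting logic. */
typedef enum { DEV_CPU, DEV_CUDA, DEV_OTHER } device_type_t;

typedef struct {
    device_type_t type;
    const char   *name;
} device_t;

/* Count the devices in `devices[0..n)` whose type equals `wanted`. */
int count_devices_of_type(const device_t *devices, size_t n,
                          device_type_t wanted)
{
    assert(devices != NULL || n == 0);
    int count = 0;
    for (size_t i = 0; i < n; i++) {
        if (devices[i].type == wanted) {
            count++;
        }
    }
    return count;
}

/* GPU workspaces are only worth registering when at least one CUDA
 * device is actually present, so the decision uses the real count. */
int needs_gpu_workspaces(const device_t *devices, size_t n)
{
    return count_devices_of_type(devices, n, DEV_CUDA) > 0;
}
```

With a device list holding one CPU entry and two CUDA entries, `count_devices_of_type(..., DEV_CUDA)` yields 2 and `needs_gpu_workspaces` returns 1. With a CPU-only list it returns 0, so registration would be skipped. Taking the device type as a parameter is what would let the same helper cover non-CUDA accelerators later.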

abouteiller commented 5 months ago

> Can we have an API in PaRSEC that returns the number of devices for a given type?

https://github.com/ICLDisco/parsec/pull/621

abouteiller commented 5 months ago

I propose we move forward with this as-is to fix the buggy behavior, and integrate with the query interface in a separate PR once that interface has been merged into PaRSEC.