Open zerothi opened 3 years ago
This project is similar to SLATE, but with the following main differences:

CONCEPT: DLAF uses HPX as its tasking library and MPI as its communication library. In this way, any C++ code can use the asynchronous matrix API provided by DLAF to build task-parallel code.
PERFORMANCE: At large scale, the performance of DLAF is better than what we measured with SLATE (benchmarks executed on the Daint supercomputer).

This is an example of the performance of the Cholesky decomposition available in master (multicore):

The performance is also good with CUDA GPUs (not yet merged in master):

Note: Weak scaling means that the number of matrix elements per node is kept constant; therefore the plots above show results for a matrix size of 20480x20480 on 1 node, 40960x40960 on 4 nodes, ...
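To make the weak-scaling note concrete, here is a minimal sketch of how the matrix dimension grows with the node count when the number of elements per node is held constant. The function names `weak_scaling_dim` and `elements_per_node` are hypothetical helpers written for this illustration; they are not part of DLAF or SLATE.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper (not a DLAF API): matrix dimension N for a weak-scaling
// run on `nodes` nodes, chosen so that N*N / nodes stays equal to the
// single-node value base_dim*base_dim. This gives N = base_dim * sqrt(nodes).
std::int64_t weak_scaling_dim(std::int64_t base_dim, std::int64_t nodes) {
  const double n = static_cast<double>(base_dim) *
                   std::sqrt(static_cast<double>(nodes));
  return static_cast<std::int64_t>(std::llround(n));
}

// Hypothetical helper: number of matrix elements assigned to each node,
// assuming an even distribution of an N x N matrix.
std::int64_t elements_per_node(std::int64_t dim, std::int64_t nodes) {
  return dim * dim / nodes;
}
```

With a base dimension of 20480, `weak_scaling_dim(20480, 4)` gives 40960, which matches the sizes quoted in the note, and the element count per node is the same in both cases.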
ROUTINES AVAILABLE: Currently, our main focus is the distributed (generalized) symmetric/Hermitian eigensolver (which is only partially supported in SLATE). In the near future, we plan to provide only the following routines:
Great, thanks for this.
I can say I am very much interested in generalized Hermitian (complex) eigensolvers.
But as long as one can transform the matrix B and convert matrix A to standard form, I would be fine :)
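For reference, the usual way to do this reduction is through a Cholesky factorization of B. The following is a sketch that assumes B is Hermitian positive definite; the symbols L, C, x and y are introduced here for illustration and do not come from the thread.

```latex
\begin{aligned}
A x &= \lambda B x, \qquad B = L L^{H} \quad \text{(Cholesky factorization of } B\text{)} \\
C &= L^{-1} A L^{-H}, \qquad y = L^{H} x \\
C y &= \lambda y \qquad \text{(standard Hermitian eigenproblem, same eigenvalues } \lambda\text{)} \\
x &= L^{-H} y \qquad \text{(back-transformation of the eigenvectors)}
\end{aligned}
```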
PS. Feel free to close it unless you have a reason not to?
I will keep this issue open so I can update you on the eigensolver development and let you know when it is ready.
Are there any plans to add more decompositions, e.g. the singular value decomposition?
@cpp977 No, singular value decomposition is not currently in our plan.
How does this project compare to SLATE?