Open yuao-rgb opened 4 years ago
For SuperLU_MT, unfortunately, the triangular solve is not parallelized. Since our recent efforts have focused on SuperLU_DIST, we have no plans to implement a parallel triangular solve in _MT.
Yes, you can use _DIST on 1 node, with OpenMP threading. It works quite well.
One caveat: _MT uses partial pivoting, whereas _DIST uses static pivoting. Numerically, _MT is more stable. Unless your systems are really ill-conditioned, you should not notice much difference.
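To illustrate why the pivoting strategy matters for stability, here is a toy sketch in plain Python. It is not code from either library: `lu_solve` is a hypothetical helper that does dense Gaussian elimination, either with partial pivoting (row swaps chosen during elimination) or with the pivot order fixed in advance. The fixed-order branch is only a simplified stand-in for the idea that the pivot order is decided before the numerical factorization. It does not reproduce what _DIST actually does.

```python
def lu_solve(A, b, partial_pivoting):
    """Solve A x = b by Gaussian elimination on a dense list-of-lists matrix.

    Hypothetical helper for illustration only; not SuperLU code.
    """
    n = len(A)
    # Augmented working copy [A | b] so the inputs are left untouched.
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for k in range(n):
        if partial_pivoting:
            # Pick the largest entry in column k (rows k..n-1) as the pivot.
            p = max(range(k, n), key=lambda i: abs(M[i][k]))
            M[k], M[p] = M[p], M[k]
        # Eliminate column k below the pivot.
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


# A tiny pivot in the (0, 0) position; the exact solution is close to [1, 1].
A = [[1e-20, 1.0], [1.0, 1.0]]
b = [1.0, 2.0]

x_fixed = lu_solve(A, b, partial_pivoting=False)  # pivot order fixed up front
x_pivot = lu_solve(A, b, partial_pivoting=True)   # pivot chosen at run time

print(x_fixed)  # [0.0, 1.0]  (first component is lost to rounding)
print(x_pivot)  # [1.0, 1.0]
```

On a well-scaled matrix both branches give essentially the same answer, which is consistent with the remark that the difference only shows up for badly conditioned systems.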
I am using SuperLU_MT (version 3.1), which is supported by Amesos2, to compute a preconditioner. SuperLU_MT shows great parallel performance in the LU decomposition. However, when I apply the LU factors as the preconditioner for solving the matrix equation (e.g. with Trilinos Belos as the GMRES solver), the parallel efficiency is very low. In the release notes of SuperLU_DIST, I found some notes about threading performance improvements, so I am trying to replace SuperLU_MT with SuperLU_DIST. Could you please give us some suggestions, or let us know whether the OpenMP parallel implementation in SuperLU_DIST (ver 6.3.1) differs from the one in SuperLU_MT (ver 3.1)? Could I use SuperLU_DIST with MPI disabled so that it works like SuperLU_MT?