Previously, we transposed the distributed scattering matrix with an explicit, hand-written routine.
This was very slow; the transpose should have been done with the ScaLAPACK pdtran function.
This PR makes that switch, which gives a substantial speedup in the symmetrization of the matrix, (A+A^T)/2, a step that is necessary to get a good result from the relaxons solver.
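
For reference, pdtran computes C := beta*C + alpha*A^T on distributed matrices, so one way to get (A+A^T)/2 in a single call is to start C as a copy of A and use alpha = beta = 0.5. The sketch below is a serial stand-in for that operation on a local column-major matrix, only to illustrate the arithmetic; it is not the code in this PR. The names transposeAdd and symmetrize are hypothetical, and the real call additionally takes the block-cyclic array descriptors and index offsets.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Serial stand-in (hypothetical helper) for the operation pdtran performs:
//   C := beta * C + alpha * A^T
// A and C are square n x n matrices stored column-major, and must be
// distinct buffers (no aliasing).
void transposeAdd(std::size_t n, double alpha, const std::vector<double>& A,
                  double beta, std::vector<double>& C) {
  assert(A.size() == n * n && C.size() == n * n);
  for (std::size_t j = 0; j < n; ++j) {
    for (std::size_t i = 0; i < n; ++i) {
      // Element (i, j) of A^T is element (j, i) of A.
      C[i + j * n] = beta * C[i + j * n] + alpha * A[j + i * n];
    }
  }
}

// Symmetrize A as (A + A^T) / 2 using a single transposeAdd call:
// C starts as a copy of A, then C := 0.5 * C + 0.5 * A^T.
std::vector<double> symmetrize(std::size_t n, const std::vector<double>& A) {
  std::vector<double> C = A;
  transposeAdd(n, 0.5, A, 0.5, C);
  return C;
}
```

In the distributed case the same alpha = beta = 0.5 choice would apply, with the copy of A held in a second distributed matrix of the same layout; that detail is an assumption of this sketch rather than something stated above.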