deepmodeling / abacus-develop

An electronic structure package based on either plane wave basis or numerical atomic orbitals.
http://abacus.ustc.edu.cn
GNU Lesser General Public License v3.0

feature: parallel solve subspace diagonalization in dav_subspace #5549

Open pxlxingliang opened 1 day ago

pxlxingliang commented 1 day ago

Reminder

Linked Issue

Fix #5480

Unit Tests and/or Case Tests for my changes

What's changed?

Any changes of core modules? (ignore if not applicable)

haozhihan commented 12 hours ago

Diago_HS_para implements a parallel solution for the subspace process in dav-subspace. If this is efficient, can we just settle on one solution method to reduce the burden on users?

haozhihan commented 11 hours ago

I have a brief insight regarding this PR:

This process involves a transformation of parallel strategy.

  • from basis parallelism to 2D block parallelism
  • from 2D block parallelism to basis parallelism

Can this transformation of parallel strategy be more general?

If the H and S matrices of LCAO are solved by an iterative method (like cg, dav, and so on, usually used with a plane-wave basis), it will also involve almost the same transformation of parallel strategy.

pxlxingliang commented 11 hours ago

Diago_HS_para implements a parallel solution for the subspace process in dav-subspace. If this is efficient, can we just settle on one solution method to reduce the burden on users?

Parallel diagonalization is not always more efficient; it depends on the system size, the number of parallel cores, the efficiency of parallel communication, and so on. I have run some tests in #5480.

pxlxingliang commented 11 hours ago

I have a brief insight regarding this PR:

This process involves a transformation of parallel strategy.

  • from basis parallelism to 2D block parallelism
  • from 2D block parallelism to basis parallelism

Can this transformation of parallel strategy be more general?

If the H and S matrices of LCAO are solved by an iterative method (like cg, dav, and so on, usually used with a plane-wave basis), it will also involve almost the same transformation of parallel strategy.

The transformation between different 2D block distributions can be done easily by calling the ScaLAPACK function Cpigemr2d() (a uniform interface for different data types is here: https://github.com/deepmodeling/abacus-develop/blob/develop/source/module_base/scalapack_connector.h#L158). The transformation to and from basis parallelism, however, is closely tied to the self-defined classes in ABACUS (like psi?), so a member function of psi that does the transformation may be better.
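
To make the "2D block distribution" concrete, here is a minimal sketch of the index arithmetic behind a block-cyclic layout along one dimension, assuming the standard ScaLAPACK convention with the first block on process 0. The helper names (owner_of, local_index_of, local_count) are hypothetical and are not part of ABACUS or ScaLAPACK; the code does not call Cpigemr2d(), it only shows which process owns which global index, which is the mapping a redistribution between two layouts has to translate.

```cpp
#include <cassert>

// Hypothetical helpers, not ABACUS or ScaLAPACK API.
// All indices are 0-based; the first block is assumed to live on process 0.

// Process (along one grid dimension) that owns global index g
// when blocks of size nb are dealt out cyclically over nprocs processes.
int owner_of(int g, int nb, int nprocs)
{
    assert(g >= 0 && nb > 0 && nprocs > 0);
    return (g / nb) % nprocs;
}

// Position of global index g inside the local storage of its owner.
int local_index_of(int g, int nb, int nprocs)
{
    assert(g >= 0 && nb > 0 && nprocs > 0);
    return (g / (nb * nprocs)) * nb + g % nb;
}

// Number of the n global indices that process p stores locally.
int local_count(int n, int nb, int p, int nprocs)
{
    assert(n >= 0 && nb > 0 && nprocs > 0 && p >= 0 && p < nprocs);
    const int full_blocks = n / nb;                 // complete blocks overall
    int count = (full_blocks / nprocs) * nb;        // blocks every process gets
    const int extra = full_blocks % nprocs;         // leftover complete blocks
    if (p < extra)
    {
        count += nb;                                // one more complete block
    }
    else if (p == extra)
    {
        count += n % nb;                            // the trailing partial block
    }
    return count;
}
```

For a 2D layout the same functions would be applied independently to the row index (with the row block size and the number of process rows) and to the column index (with the column block size and the number of process columns). Two layouts that differ in block size or grid shape give different owners for the same matrix element, which is why a redistribution step is needed when switching between them.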