deepmodeling / abacus-develop

An electronic structure package based on either plane wave basis or numerical atomic orbitals.
http://abacus.ustc.edu.cn
GNU Lesser General Public License v3.0
165 stars 129 forks source link

Bugs: Different SCF with equivalent `STRU` #4058

Open WHUweiqingzhou opened 5 months ago

WHUweiqingzhou commented 5 months ago

Describe the bug

In Issue #3954, the user find with STRU1 and STRU2, ABACUS give different SCF. One of them cannot be converged:

STRU1:

Ni
1.0000000000
48
0.0000000000 0.0000000000 0.0000000000 1 1 1  
0.0000000000 0.3333330000 0.1666670000 1 1 1  
0.0000000000 0.1666670000 0.0833330000 1 1 1  
0.5000000000 0.4166670000 0.0833330000 1 1 1  
0.5000000000 0.0833330000 0.1666670000 1 1 1  
0.5000000000 0.2500000000 0.0000000000 1 1 1  
0.0000000000 0.5000000000 0.0000000000 1 1 1  
0.0000000000 0.8333330000 0.1666670000 1 1 1  
0.0000000000 0.6666670000 0.0833330000 1 1 1  
0.5000000000 0.9166670000 0.0833330000 1 1 1  
0.5000000000 0.5833330000 0.1666670000 1 1 1  
0.5000000000 0.7500000000 0.0000000000 1 1 1  
0.0000000000 0.0000000000 0.2500000000 1 1 1  
0.0000000000 0.3333330000 0.4166670000 1 1 1  
0.0000000000 0.1666670000 0.3333330000 1 1 1  
0.5000000000 0.4166670000 0.3333330000 1 1 1  
0.5000000000 0.0833330000 0.4166670000 1 1 1  
0.5000000000 0.2500000000 0.2500000000 1 1 1  
0.0000000000 0.5000000000 0.2500000000 1 1 1  
0.0000000000 0.8333330000 0.4166670000 1 1 1  
0.0000000000 0.6666670000 0.3333330000 1 1 1  
0.5000000000 0.9166670000 0.3333330000 1 1 1  
0.5000000000 0.5833330000 0.4166670000 1 1 1  
0.5000000000 0.7500000000 0.2500000000 1 1 1  
0.0000000000 0.0000000000 0.5000000000 1 1 1  
0.0000000000 0.3333330000 0.6666670000 1 1 1  
0.0000000000 0.1666670000 0.5833330000 1 1 1  
0.5000000000 0.4166670000 0.5833330000 1 1 1  
0.5000000000 0.0833330000 0.6666670000 1 1 1  
0.5000000000 0.2500000000 0.5000000000 1 1 1  
0.0000000000 0.5000000000 0.5000000000 1 1 1  
0.0000000000 0.8333330000 0.6666670000 1 1 1  
0.0000000000 0.6666670000 0.5833330000 1 1 1  
0.5000000000 0.9166670000 0.5833330000 1 1 1  
0.5000000000 0.5833330000 0.6666670000 1 1 1  
0.5000000000 0.7500000000 0.5000000000 1 1 1  
0.0000000000 0.0000000000 0.7500000000 1 1 1  
0.0000000000 0.3333330000 0.9166670000 1 1 1  
0.0000000000 0.1666670000 0.8333330000 1 1 1  
0.5000000000 0.4166670000 0.8333330000 1 1 1  
0.5000000000 0.0833330000 0.9166670000 1 1 1  
0.5000000000 0.2500000000 0.7500000000 1 1 1  
0.0000000000 0.5000000000 0.7500000000 1 1 1  
0.0000000000 0.8333330000 0.9166670000 1 1 1  
0.0000000000 0.6666670000 0.8333330000 1 1 1  
0.5000000000 0.9166670000 0.8333330000 1 1 1  
0.5000000000 0.5833330000 0.9166670000 1 1 1  
0.5000000000 0.7500000000 0.7500000000 1 1 1  

STRU2:

Ni
0.0
48
0.0000000000 0.0000000000 0.0000000000 1 1 1 mag 1.0 
0.0000000000 0.3333330000 0.1666670000 1 1 1 mag 1.0 
0.0000000000 0.1666670000 0.0833330000 1 1 1 mag 1.0 
0.5000000000 0.4166670000 0.0833330000 1 1 1 mag 1.0 
0.5000000000 0.0833330000 0.1666670000 1 1 1 mag 1.0 
0.5000000000 0.2500000000 0.0000000000 1 1 1 mag 1.0 
0.0000000000 0.5000000000 0.0000000000 1 1 1 mag 1.0 
0.0000000000 0.8333330000 0.1666670000 1 1 1 mag 1.0 
0.0000000000 0.6666670000 0.0833330000 1 1 1 mag 1.0 
0.5000000000 0.9166670000 0.0833330000 1 1 1 mag 1.0 
0.5000000000 0.5833330000 0.1666670000 1 1 1 mag 1.0 
0.5000000000 0.7500000000 0.0000000000 1 1 1 mag 1.0 
0.0000000000 0.0000000000 0.2500000000 1 1 1 mag 1.0 
0.0000000000 0.3333330000 0.4166670000 1 1 1 mag 1.0 
0.0000000000 0.1666670000 0.3333330000 1 1 1 mag 1.0 
0.5000000000 0.4166670000 0.3333330000 1 1 1 mag 1.0 
0.5000000000 0.0833330000 0.4166670000 1 1 1 mag 1.0 
0.5000000000 0.2500000000 0.2500000000 1 1 1 mag 1.0 
0.0000000000 0.5000000000 0.2500000000 1 1 1 mag 1.0 
0.0000000000 0.8333330000 0.4166670000 1 1 1 mag 1.0 
0.0000000000 0.6666670000 0.3333330000 1 1 1 mag 1.0 
0.5000000000 0.9166670000 0.3333330000 1 1 1 mag 1.0 
0.5000000000 0.5833330000 0.4166670000 1 1 1 mag 1.0 
0.5000000000 0.7500000000 0.2500000000 1 1 1 mag 1.0 
0.0000000000 0.0000000000 0.5000000000 1 1 1 mag 1.0 
0.0000000000 0.3333330000 0.6666670000 1 1 1 mag 1.0 
0.0000000000 0.1666670000 0.5833330000 1 1 1 mag 1.0 
0.5000000000 0.4166670000 0.5833330000 1 1 1 mag 1.0 
0.5000000000 0.0833330000 0.6666670000 1 1 1 mag 1.0 
0.5000000000 0.2500000000 0.5000000000 1 1 1 mag 1.0 
0.0000000000 0.5000000000 0.5000000000 1 1 1 mag 1.0 
0.0000000000 0.8333330000 0.6666670000 1 1 1 mag 1.0 
0.0000000000 0.6666670000 0.5833330000 1 1 1 mag 1.0 
0.5000000000 0.9166670000 0.5833330000 1 1 1 mag 1.0 
0.5000000000 0.5833330000 0.6666670000 1 1 1 mag 1.0 
0.5000000000 0.7500000000 0.5000000000 1 1 1 mag 1.0 
0.0000000000 0.0000000000 0.7500000000 1 1 1 mag 1.0 
0.0000000000 0.3333330000 0.9166670000 1 1 1 mag 1.0 
0.0000000000 0.1666670000 0.8333330000 1 1 1 mag 1.0 
0.5000000000 0.4166670000 0.8333330000 1 1 1 mag 1.0 
0.5000000000 0.0833330000 0.9166670000 1 1 1 mag 1.0 
0.5000000000 0.2500000000 0.7500000000 1 1 1 mag 1.0 
0.0000000000 0.5000000000 0.7500000000 1 1 1 mag 1.0 
0.0000000000 0.8333330000 0.9166670000 1 1 1 mag 1.0 
0.0000000000 0.6666670000 0.8333330000 1 1 1 mag 1.0 
0.5000000000 0.9166670000 0.8333330000 1 1 1 mag 1.0 
0.5000000000 0.5833330000 0.9166670000 1 1 1 mag 1.0 
0.5000000000 0.7500000000 0.7500000000 1 1 1 mag 1.0 

And the SCF look like: image

see more detail in link

Expected behavior

No response

To Reproduce

No response

Environment

No response

Additional Context

No response

Task list for Issue attackers (only for developers)

WHUweiqingzhou commented 5 months ago

After many tests, I make sure that UnitCell::read_atom_positions works well. But in atomic_rho, STRU1 and STRU2 will result in different startmag_type if startmag_type==1, ABACUS will initialize the charge density by:

for (int ig = 0; ig < this->rhopw->npw; ig++)
{
    const std::complex<double> swap = strucFac(it, ig) * rho_lgl[this->rhopw->ig2igg[ig]];
    const double up = 0.5 * (1 + ucell.magnet.start_magnetization[it] / atom->ncpp.zv);
    const double dw = 0.5 * (1 - ucell.magnet.start_magnetization[it] / atom->ncpp.zv);
    rho_g3d(0, ig) += swap * up;
    rho_g3d(1, ig) += swap * dw;
    std::cout << strucFac(it, ig) << " " << rho_g3d(0, ig) << " " << rho_g3d(1, ig) << std::endl;
}

if startmag_type==2, ABACUS will initialize the charge density by:

std::complex<double> ci_tpi = ModuleBase::NEG_IMAG_UNIT * ModuleBase::TWO_PI;
for (int ia = 0; ia < atom->na; ia++)
{
    // const double up = 0.5 * ( 1 + atom->mag[ia] );
    // const double dw = 0.5 * ( 1 - atom->mag[ia] );
    const double up = 0.5 * (1 + atom->mag[ia] / atom->ncpp.zv);
    const double dw = 0.5 * (1 - atom->mag[ia] / atom->ncpp.zv);
    // std::cout << " atom " << ia << " up=" << up << " dw=" << dw << std::endl;
#ifdef _OPENMP
#pragma omp parallel for
#endif
    for (int ig = 0; ig < this->rhopw->npw; ig++)
    {
        const double Gtau = this->rhopw->gcar[ig][0] * atom->tau[ia].x
                            + this->rhopw->gcar[ig][1] * atom->tau[ia].y
                            + this->rhopw->gcar[ig][2] * atom->tau[ia].z;

        std::complex<double> swap
            = ModuleBase::libm::exp(ci_tpi * Gtau) * rho_lgl[this->rhopw->ig2igg[ig]];

        rho_g3d(0, ig) += swap * up;
        rho_g3d(1, ig) += swap * dw;
    }
}

I output rho_g3d for this example of different STRU, and find they are different. Any suggestion?

WHUweiqingzhou commented 5 months ago

I double-check it, and find the bug is not here.

WHUweiqingzhou commented 4 months ago

I make some calculations with different mixing_beta=0.4, 0.3, 0.2, and find only mixing_beta=0.4 with STRU1 converges. image

see more in tests.

WHUweiqingzhou commented 4 months ago

I also try different MPI (OMP_NUM_THREADS=1 mpirun -np 16 & OMP_NUM_THREADS=2 mpirun -np 8) with STRU1 and STRU2, and the result is different. ABACUS shows obvious numerical instability in this system: image

see more in link.

WHUweiqingzhou commented 4 months ago

@dyzheng @jinzx10 any idea?

jinzx10 commented 4 months ago

I agree that this might be another case that reveals numerical instability. If we zoom-in the right panel of drho in the original post, the first 3 steps actually agree very well; it is the 4th step where the two cases start to differ.

image

image

jinzx10 commented 2 months ago

Hi @WHUweiqingzhou , during the investigation of #4743 , I found that the following code snippets from module_base/module_mixing/broyden_mixing.cpp (line number 140-174) looks suspicious:

double* work = new double[ndim_cal_dF];
int* iwork = new int[ndim_cal_dF];
char uu = 'U';
int info;
dsytrf_(&uu, &ndim_cal_dF, beta_tmp.c, &ndim_cal_dF, iwork, work, &ndim_cal_dF, &info);
if (info != 0)
    ModuleBase::WARNING_QUIT("Charge_Mixing", "Error when factorizing beta.");
dsytri_(&uu, &ndim_cal_dF, beta_tmp.c, &ndim_cal_dF, iwork, work, &info);
if (info != 0)
    ModuleBase::WARNING_QUIT("Charge_Mixing", "Error when DSYTRI beta.");
for (int i = 0; i < ndim_cal_dF; ++i)
{
    for (int j = i + 1; j < ndim_cal_dF; ++j)
    {
        beta_tmp(i, j) = beta_tmp(j, i);
    }
}
for (int i = 0; i < ndim_cal_dF; ++i)
{
    FPTYPE* dFi = FP_dF + i * length;
    work[i] = inner_product(dFi, FP_F);
}
// gamma[i] = \sum_j beta_tmp(i,j) * work[j]
std::vector<double> gamma(ndim_cal_dF);
container::BlasConnector::gemv('N',
                               ndim_cal_dF,
                               ndim_cal_dF,
                               1.0,
                               beta_tmp.c,
                               ndim_cal_dF,
                               work,
                               1,
                               0.0,
                               gamma.data(),
                               1);

It seems to me that the above code is solving a linear equation Ax=b by computing inv(A) followed by a multiplication. While this procedure should behave as expected when A is a well-conditioned matrix, it could lead to severe numerical instability when A is poorly conditioned (i.e., error vectors have near linear dependency); dsysv should be preferred in this case (https://netlib.org/lapack/explore-html-3.6.1/d6/d0e/group__double_s_ysolve_gac0d0ad0edaa9d2014264c78874055db1.html).

WHUweiqingzhou commented 1 month ago

After PR #4842, the drho now is: image