Bugs: Different SCF with equivalent `STRU`

WHUweiqingzhou commented 5 months ago

Describe the bug

In Issue #3954, the user find with STRU1 and STRU2, ABACUS give different SCF. One of them cannot be converged:

STRU1:

Ni
1.0000000000
48
0.0000000000 0.0000000000 0.0000000000 1 1 1  
0.0000000000 0.3333330000 0.1666670000 1 1 1  
0.0000000000 0.1666670000 0.0833330000 1 1 1  
0.5000000000 0.4166670000 0.0833330000 1 1 1  
0.5000000000 0.0833330000 0.1666670000 1 1 1  
0.5000000000 0.2500000000 0.0000000000 1 1 1  
0.0000000000 0.5000000000 0.0000000000 1 1 1  
0.0000000000 0.8333330000 0.1666670000 1 1 1  
0.0000000000 0.6666670000 0.0833330000 1 1 1  
0.5000000000 0.9166670000 0.0833330000 1 1 1  
0.5000000000 0.5833330000 0.1666670000 1 1 1  
0.5000000000 0.7500000000 0.0000000000 1 1 1  
0.0000000000 0.0000000000 0.2500000000 1 1 1  
0.0000000000 0.3333330000 0.4166670000 1 1 1  
0.0000000000 0.1666670000 0.3333330000 1 1 1  
0.5000000000 0.4166670000 0.3333330000 1 1 1  
0.5000000000 0.0833330000 0.4166670000 1 1 1  
0.5000000000 0.2500000000 0.2500000000 1 1 1  
0.0000000000 0.5000000000 0.2500000000 1 1 1  
0.0000000000 0.8333330000 0.4166670000 1 1 1  
0.0000000000 0.6666670000 0.3333330000 1 1 1  
0.5000000000 0.9166670000 0.3333330000 1 1 1  
0.5000000000 0.5833330000 0.4166670000 1 1 1  
0.5000000000 0.7500000000 0.2500000000 1 1 1  
0.0000000000 0.0000000000 0.5000000000 1 1 1  
0.0000000000 0.3333330000 0.6666670000 1 1 1  
0.0000000000 0.1666670000 0.5833330000 1 1 1  
0.5000000000 0.4166670000 0.5833330000 1 1 1  
0.5000000000 0.0833330000 0.6666670000 1 1 1  
0.5000000000 0.2500000000 0.5000000000 1 1 1  
0.0000000000 0.5000000000 0.5000000000 1 1 1  
0.0000000000 0.8333330000 0.6666670000 1 1 1  
0.0000000000 0.6666670000 0.5833330000 1 1 1  
0.5000000000 0.9166670000 0.5833330000 1 1 1  
0.5000000000 0.5833330000 0.6666670000 1 1 1  
0.5000000000 0.7500000000 0.5000000000 1 1 1  
0.0000000000 0.0000000000 0.7500000000 1 1 1  
0.0000000000 0.3333330000 0.9166670000 1 1 1  
0.0000000000 0.1666670000 0.8333330000 1 1 1  
0.5000000000 0.4166670000 0.8333330000 1 1 1  
0.5000000000 0.0833330000 0.9166670000 1 1 1  
0.5000000000 0.2500000000 0.7500000000 1 1 1  
0.0000000000 0.5000000000 0.7500000000 1 1 1  
0.0000000000 0.8333330000 0.9166670000 1 1 1  
0.0000000000 0.6666670000 0.8333330000 1 1 1  
0.5000000000 0.9166670000 0.8333330000 1 1 1  
0.5000000000 0.5833330000 0.9166670000 1 1 1  
0.5000000000 0.7500000000 0.7500000000 1 1 1

STRU2:

Ni
0.0
48
0.0000000000 0.0000000000 0.0000000000 1 1 1 mag 1.0 
0.0000000000 0.3333330000 0.1666670000 1 1 1 mag 1.0 
0.0000000000 0.1666670000 0.0833330000 1 1 1 mag 1.0 
0.5000000000 0.4166670000 0.0833330000 1 1 1 mag 1.0 
0.5000000000 0.0833330000 0.1666670000 1 1 1 mag 1.0 
0.5000000000 0.2500000000 0.0000000000 1 1 1 mag 1.0 
0.0000000000 0.5000000000 0.0000000000 1 1 1 mag 1.0 
0.0000000000 0.8333330000 0.1666670000 1 1 1 mag 1.0 
0.0000000000 0.6666670000 0.0833330000 1 1 1 mag 1.0 
0.5000000000 0.9166670000 0.0833330000 1 1 1 mag 1.0 
0.5000000000 0.5833330000 0.1666670000 1 1 1 mag 1.0 
0.5000000000 0.7500000000 0.0000000000 1 1 1 mag 1.0 
0.0000000000 0.0000000000 0.2500000000 1 1 1 mag 1.0 
0.0000000000 0.3333330000 0.4166670000 1 1 1 mag 1.0 
0.0000000000 0.1666670000 0.3333330000 1 1 1 mag 1.0 
0.5000000000 0.4166670000 0.3333330000 1 1 1 mag 1.0 
0.5000000000 0.0833330000 0.4166670000 1 1 1 mag 1.0 
0.5000000000 0.2500000000 0.2500000000 1 1 1 mag 1.0 
0.0000000000 0.5000000000 0.2500000000 1 1 1 mag 1.0 
0.0000000000 0.8333330000 0.4166670000 1 1 1 mag 1.0 
0.0000000000 0.6666670000 0.3333330000 1 1 1 mag 1.0 
0.5000000000 0.9166670000 0.3333330000 1 1 1 mag 1.0 
0.5000000000 0.5833330000 0.4166670000 1 1 1 mag 1.0 
0.5000000000 0.7500000000 0.2500000000 1 1 1 mag 1.0 
0.0000000000 0.0000000000 0.5000000000 1 1 1 mag 1.0 
0.0000000000 0.3333330000 0.6666670000 1 1 1 mag 1.0 
0.0000000000 0.1666670000 0.5833330000 1 1 1 mag 1.0 
0.5000000000 0.4166670000 0.5833330000 1 1 1 mag 1.0 
0.5000000000 0.0833330000 0.6666670000 1 1 1 mag 1.0 
0.5000000000 0.2500000000 0.5000000000 1 1 1 mag 1.0 
0.0000000000 0.5000000000 0.5000000000 1 1 1 mag 1.0 
0.0000000000 0.8333330000 0.6666670000 1 1 1 mag 1.0 
0.0000000000 0.6666670000 0.5833330000 1 1 1 mag 1.0 
0.5000000000 0.9166670000 0.5833330000 1 1 1 mag 1.0 
0.5000000000 0.5833330000 0.6666670000 1 1 1 mag 1.0 
0.5000000000 0.7500000000 0.5000000000 1 1 1 mag 1.0 
0.0000000000 0.0000000000 0.7500000000 1 1 1 mag 1.0 
0.0000000000 0.3333330000 0.9166670000 1 1 1 mag 1.0 
0.0000000000 0.1666670000 0.8333330000 1 1 1 mag 1.0 
0.5000000000 0.4166670000 0.8333330000 1 1 1 mag 1.0 
0.5000000000 0.0833330000 0.9166670000 1 1 1 mag 1.0 
0.5000000000 0.2500000000 0.7500000000 1 1 1 mag 1.0 
0.0000000000 0.5000000000 0.7500000000 1 1 1 mag 1.0 
0.0000000000 0.8333330000 0.9166670000 1 1 1 mag 1.0 
0.0000000000 0.6666670000 0.8333330000 1 1 1 mag 1.0 
0.5000000000 0.9166670000 0.8333330000 1 1 1 mag 1.0 
0.5000000000 0.5833330000 0.9166670000 1 1 1 mag 1.0 
0.5000000000 0.7500000000 0.7500000000 1 1 1 mag 1.0

And the SCF look like:

see more detail in link

Expected behavior

No response

To Reproduce

No response

Environment

No response

Additional Context

No response

Task list for Issue attackers (only for developers)

[ ] Verify the issue is not a duplicate.
[ ] Describe the bug.
[ ] Steps to reproduce.
[ ] Expected behavior.
[ ] Error message.
[ ] Environment details.
[ ] Additional context.
[ ] Assign a priority level (low, medium, high, urgent).
[ ] Assign the issue to a team member.
[ ] Label the issue with relevant tags.
[ ] Identify possible related issues.
[ ] Create a unit test or automated test to reproduce the bug (if applicable).
[ ] Fix the bug.
[ ] Test the fix.
[ ] Update documentation (if necessary).
[ ] Close the issue and inform the reporter (if applicable).

WHUweiqingzhou commented 5 months ago

After many tests, I make sure that UnitCell::read_atom_positions works well. But in atomic_rho, STRU1 and STRU2 will result in different startmag_type if startmag_type==1, ABACUS will initialize the charge density by:

for (int ig = 0; ig < this->rhopw->npw; ig++)
{
    const std::complex<double> swap = strucFac(it, ig) * rho_lgl[this->rhopw->ig2igg[ig]];
    const double up = 0.5 * (1 + ucell.magnet.start_magnetization[it] / atom->ncpp.zv);
    const double dw = 0.5 * (1 - ucell.magnet.start_magnetization[it] / atom->ncpp.zv);
    rho_g3d(0, ig) += swap * up;
    rho_g3d(1, ig) += swap * dw;
    std::cout << strucFac(it, ig) << " " << rho_g3d(0, ig) << " " << rho_g3d(1, ig) << std::endl;
}

if startmag_type==2, ABACUS will initialize the charge density by:

std::complex<double> ci_tpi = ModuleBase::NEG_IMAG_UNIT * ModuleBase::TWO_PI;
for (int ia = 0; ia < atom->na; ia++)
{
    // const double up = 0.5 * ( 1 + atom->mag[ia] );
    // const double dw = 0.5 * ( 1 - atom->mag[ia] );
    const double up = 0.5 * (1 + atom->mag[ia] / atom->ncpp.zv);
    const double dw = 0.5 * (1 - atom->mag[ia] / atom->ncpp.zv);
    // std::cout << " atom " << ia << " up=" << up << " dw=" << dw << std::endl;
#ifdef _OPENMP
#pragma omp parallel for
#endif
    for (int ig = 0; ig < this->rhopw->npw; ig++)
    {
        const double Gtau = this->rhopw->gcar[ig][0] * atom->tau[ia].x
                            + this->rhopw->gcar[ig][1] * atom->tau[ia].y
                            + this->rhopw->gcar[ig][2] * atom->tau[ia].z;

        std::complex<double> swap
            = ModuleBase::libm::exp(ci_tpi * Gtau) * rho_lgl[this->rhopw->ig2igg[ig]];

        rho_g3d(0, ig) += swap * up;
        rho_g3d(1, ig) += swap * dw;
    }
}

I output rho_g3d for this example of different STRU, and find they are different. Any suggestion?

WHUweiqingzhou commented 5 months ago

I double-check it, and find the bug is not here.

WHUweiqingzhou commented 4 months ago

I make some calculations with different mixing_beta=0.4, 0.3, 0.2, and find only mixing_beta=0.4 with STRU1 converges.

see more in tests.

WHUweiqingzhou commented 4 months ago

I also try different MPI (OMP_NUM_THREADS=1 mpirun -np 16 & OMP_NUM_THREADS=2 mpirun -np 8) with STRU1 and STRU2, and the result is different. ABACUS shows obvious numerical instability in this system：

see more in link.

WHUweiqingzhou commented 4 months ago

@dyzheng @jinzx10 any idea?

jinzx10 commented 4 months ago

I agree that this might be another case that reveals numerical instability. If we zoom-in the right panel of drho in the original post, the first 3 steps actually agree very well; it is the 4th step where the two cases start to differ.

jinzx10 commented 2 months ago

Hi @WHUweiqingzhou , during the investigation of #4743 , I found that the following code snippets from module_base/module_mixing/broyden_mixing.cpp (line number 140-174) looks suspicious:

double* work = new double[ndim_cal_dF];
int* iwork = new int[ndim_cal_dF];
char uu = 'U';
int info;
dsytrf_(&uu, &ndim_cal_dF, beta_tmp.c, &ndim_cal_dF, iwork, work, &ndim_cal_dF, &info);
if (info != 0)
    ModuleBase::WARNING_QUIT("Charge_Mixing", "Error when factorizing beta.");
dsytri_(&uu, &ndim_cal_dF, beta_tmp.c, &ndim_cal_dF, iwork, work, &info);
if (info != 0)
    ModuleBase::WARNING_QUIT("Charge_Mixing", "Error when DSYTRI beta.");
for (int i = 0; i < ndim_cal_dF; ++i)
{
    for (int j = i + 1; j < ndim_cal_dF; ++j)
    {
        beta_tmp(i, j) = beta_tmp(j, i);
    }
}
for (int i = 0; i < ndim_cal_dF; ++i)
{
    FPTYPE* dFi = FP_dF + i * length;
    work[i] = inner_product(dFi, FP_F);
}
// gamma[i] = \sum_j beta_tmp(i,j) * work[j]
std::vector<double> gamma(ndim_cal_dF);
container::BlasConnector::gemv('N',
                               ndim_cal_dF,
                               ndim_cal_dF,
                               1.0,
                               beta_tmp.c,
                               ndim_cal_dF,
                               work,
                               1,
                               0.0,
                               gamma.data(),
                               1);

It seems to me that the above code is solving a linear equation Ax=b by computing inv(A) followed by a multiplication. While this procedure should behave as expected when A is a well-conditioned matrix, it could lead to severe numerical instability when A is poorly conditioned (i.e., error vectors have near linear dependency); dsysv should be preferred in this case (https://netlib.org/lapack/explore-html-3.6.1/d6/d0e/group__double_s_ysolve_gac0d0ad0edaa9d2014264c78874055db1.html).

WHUweiqingzhou commented 1 month ago

After PR #4842, the drho now is:

deepmodeling / abacus-develop