Open WHUweiqingzhou opened 5 months ago
After many tests, I make sure that UnitCell::read_atom_positions
works well.
But in atomic_rho
, STRU1
and STRU2
will result in different startmag_type
if startmag_type==1
, ABACUS will initialize the charge density by:
for (int ig = 0; ig < this->rhopw->npw; ig++)
{
const std::complex<double> swap = strucFac(it, ig) * rho_lgl[this->rhopw->ig2igg[ig]];
const double up = 0.5 * (1 + ucell.magnet.start_magnetization[it] / atom->ncpp.zv);
const double dw = 0.5 * (1 - ucell.magnet.start_magnetization[it] / atom->ncpp.zv);
rho_g3d(0, ig) += swap * up;
rho_g3d(1, ig) += swap * dw;
std::cout << strucFac(it, ig) << " " << rho_g3d(0, ig) << " " << rho_g3d(1, ig) << std::endl;
}
if startmag_type==2
, ABACUS will initialize the charge density by:
std::complex<double> ci_tpi = ModuleBase::NEG_IMAG_UNIT * ModuleBase::TWO_PI;
for (int ia = 0; ia < atom->na; ia++)
{
// const double up = 0.5 * ( 1 + atom->mag[ia] );
// const double dw = 0.5 * ( 1 - atom->mag[ia] );
const double up = 0.5 * (1 + atom->mag[ia] / atom->ncpp.zv);
const double dw = 0.5 * (1 - atom->mag[ia] / atom->ncpp.zv);
// std::cout << " atom " << ia << " up=" << up << " dw=" << dw << std::endl;
#ifdef _OPENMP
#pragma omp parallel for
#endif
for (int ig = 0; ig < this->rhopw->npw; ig++)
{
const double Gtau = this->rhopw->gcar[ig][0] * atom->tau[ia].x
+ this->rhopw->gcar[ig][1] * atom->tau[ia].y
+ this->rhopw->gcar[ig][2] * atom->tau[ia].z;
std::complex<double> swap
= ModuleBase::libm::exp(ci_tpi * Gtau) * rho_lgl[this->rhopw->ig2igg[ig]];
rho_g3d(0, ig) += swap * up;
rho_g3d(1, ig) += swap * dw;
}
}
I output rho_g3d
for this example of different STRU, and find they are different. Any suggestion?
I double-check it, and find the bug is not here.
I make some calculations with different mixing_beta=0.4, 0.3, 0.2
, and find only mixing_beta=0.4
with STRU1
converges.
see more in tests.
I also try different MPI (OMP_NUM_THREADS=1 mpirun -np 16 & OMP_NUM_THREADS=2 mpirun -np 8) with STRU1
and STRU2
, and the result is different. ABACUS shows obvious numerical instability in this system:
see more in link.
@dyzheng @jinzx10 any idea?
I agree that this might be another case that reveals numerical instability. If we zoom-in the right panel of drho in the original post, the first 3 steps actually agree very well; it is the 4th step where the two cases start to differ.
Hi @WHUweiqingzhou , during the investigation of #4743 , I found that the following code snippets from module_base/module_mixing/broyden_mixing.cpp (line number 140-174) looks suspicious:
double* work = new double[ndim_cal_dF];
int* iwork = new int[ndim_cal_dF];
char uu = 'U';
int info;
dsytrf_(&uu, &ndim_cal_dF, beta_tmp.c, &ndim_cal_dF, iwork, work, &ndim_cal_dF, &info);
if (info != 0)
ModuleBase::WARNING_QUIT("Charge_Mixing", "Error when factorizing beta.");
dsytri_(&uu, &ndim_cal_dF, beta_tmp.c, &ndim_cal_dF, iwork, work, &info);
if (info != 0)
ModuleBase::WARNING_QUIT("Charge_Mixing", "Error when DSYTRI beta.");
for (int i = 0; i < ndim_cal_dF; ++i)
{
for (int j = i + 1; j < ndim_cal_dF; ++j)
{
beta_tmp(i, j) = beta_tmp(j, i);
}
}
for (int i = 0; i < ndim_cal_dF; ++i)
{
FPTYPE* dFi = FP_dF + i * length;
work[i] = inner_product(dFi, FP_F);
}
// gamma[i] = \sum_j beta_tmp(i,j) * work[j]
std::vector<double> gamma(ndim_cal_dF);
container::BlasConnector::gemv('N',
ndim_cal_dF,
ndim_cal_dF,
1.0,
beta_tmp.c,
ndim_cal_dF,
work,
1,
0.0,
gamma.data(),
1);
It seems to me that the above code is solving a linear equation Ax=b by computing inv(A) followed by a multiplication. While this procedure should behave as expected when A is a well-conditioned matrix, it could lead to severe numerical instability when A is poorly conditioned (i.e., error vectors have near linear dependency); dsysv should be preferred in this case (https://netlib.org/lapack/explore-html-3.6.1/d6/d0e/group__double_s_ysolve_gac0d0ad0edaa9d2014264c78874055db1.html).
After PR #4842, the drho
now is:
Describe the bug
In Issue #3954, the user find with STRU1 and STRU2, ABACUS give different SCF. One of them cannot be converged:
STRU1:
STRU2:
And the SCF look like:
see more detail in link
Expected behavior
No response
To Reproduce
No response
Environment
No response
Additional Context
No response
Task list for Issue attackers (only for developers)