Closed. Satinelamp closed this issue 7 months ago.
I will test this case, thanks for your report.
FYI, here are the commands I use to install ABACUS and its dependencies.
Intel OneAPI:
wget https://registrationcenter-download.intel.com/akdlm/irc_nas/18236/l_BaseKit_p_2021.4.0.3422_offline.sh
bash l_BaseKit_p_2021.4.0.3422_offline.sh
wget https://registrationcenter-download.intel.com/akdlm/irc_nas/18211/l_HPCKit_p_2021.4.0.3347_offline.sh
bash l_HPCKit_p_2021.4.0.3347_offline.sh
source ~/intel/oneapi/setvars.sh
cmake:
wget https://github.com/Kitware/CMake/releases/download/v3.25.1/cmake-3.25.1-linux-x86_64.tar.gz
tar -zxvf cmake-3.25.1-linux-x86_64.tar.gz
export PATH=/home/alma/software/cmake-3.25.1-linux-x86_64/bin:${PATH}
ELPA:
Download ELPA
tar xzf elpa-2021.05.002.tar.gz
cd elpa-2021.05.002
mkdir build
cd build
CC=mpiicc CXX=mpiicpc FC=mpiifort ../configure --prefix=/home/alma/software/elpa_install FCFLAGS="-qmkl=cluster"
make -j9
make install
ln -s /home/alma/software/elpa_install/include/elpa-2021.05.002/elpa /home/alma/software/elpa_install/include/
Cereal:
git clone https://github.com/USCiLab/cereal.git
LibXC:
tar xzf libxc-5.2.3.tar.gz
cd libxc-5.2.3
mkdir build
cmake -B build -DCMAKE_C_COMPILER=mpiicc -DCMAKE_INSTALL_PREFIX=/home/alma/software/libxc_install
cmake --build build
cmake --install build
ABACUS:
cd abacus-develop
cmake -DCMAKE_CXX_COMPILER=mpiicpc -DMPI_CXX_COMPILER=mpiicpc -B build -DCMAKE_INSTALL_PREFIX=~/software/abacus_install -DELPA_DIR=~/software/elpa_install -DCEREAL_INCLUDE_DIR=~/software/cereal/include -DLibxc_DIR=~/software/libxc_install
cmake --build build -j4
cmake --install build
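Before running the build steps above, it can help to confirm that the compilers and cmake are actually on PATH after sourcing setvars.sh. The following is a minimal sketch; `check_tools` is a hypothetical helper name, not part of ABACUS or oneAPI, and the tool names in the usage comment are simply those that appear in the commands above.

```shell
# Hypothetical helper: report which of the given commands are missing from PATH.
# Returns 0 if all are found, 1 if at least one is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Usage (tool names taken from the install commands above):
# check_tools mpiicc mpiicpc mpiifort cmake
```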
ABACUS v3.6.0
Atomic-orbital Based Ab-initio Computation at UStc
Website: http://abacus.ustc.edu.cn/
Documentation: https://abacus.deepmodeling.com/
Repository: https://github.com/abacusmodeling/abacus-develop
https://github.com/deepmodeling/abacus-develop
Commit: 21d40ce93 (Tue Apr 9 22:54:12 2024 +0800)
Thu Apr 11 16:54:31 2024
MAKE THE DIR : OUT.autotest/
RUNNING WITH DEVICE : CPU / Intel(R) Xeon(R) Gold 6132 CPU @ 2.60GHz
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Warning: the number of valence electrons in pseudopotential > 10 for Ni: [Ar] 3d8 4s2
Warning: the number of valence electrons in pseudopotential > 4 for Ce: [Xe] 4f1 5d1 6s2
Pseudopotentials with additional electrons can yield (more) accurate outcomes, but may be less efficient.
If you're confident that your chosen pseudopotential is appropriate, you can safely ignore this warning.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
UNIFORM GRID DIM : 250 * 100 * 100
UNIFORM GRID DIM(BIG) : 250 * 100 * 100
DONE(0.0833994 SEC) : SETUP UNITCELL
DONE(0.113745 SEC) : SYMMETRY
DONE(0.198256 SEC) : INIT K-POINTS
---------------------------------------------------------
Self-consistent calculations for electrons
---------------------------------------------------------
SPIN KPOINTS PROCESSORS
1 1 16
---------------------------------------------------------
Use plane wave basis
---------------------------------------------------------
ELEMENT NATOM XC
H 2
O 72
Ni 1
Ce 35
---------------------------------------------------------
Initial plane wave basis and FFT box
---------------------------------------------------------
DONE(0.213358 SEC) : INIT PLANEWAVE
MEMORY FOR PSI (MB) : 65.8198
DONE(3.01542 SEC) : LOCAL POTENTIAL
DONE(3.11132 SEC) : NON-LOCAL POTENTIAL
DONE(3.14832 SEC) : INIT BASIS
-------------------------------------------
SELF-CONSISTENT :
-------------------------------------------
START CHARGE : atomic
DONE(6.56853 SEC) : INIT SCF
ITER ETOT(eV) EDIFF(eV) DRHO TIME(s)
DS1 -7.185066e+04 0.000000e+00 8.592e+01 1.444e+02
DS2 -7.196601e+04 -1.153577e+02 2.813e+02 5.443e+01
DS3 -7.221121e+04 -2.451923e+02 1.561e+02 3.661e+01
DS4 -7.235114e+04 -1.399348e+02 8.945e+00 3.490e+01
DS5 -7.236605e+04 -1.491242e+01 1.434e+00 3.892e+01
DS6 -7.236209e+04 3.959379e+00 5.053e+00 6.187e+01
DS7 -7.235742e+04 4.675447e+00 1.038e+01 3.606e+01
DS8 -7.236794e+04 -1.051882e+01 3.450e-01 3.554e+01
DS9 -7.236771e+04 2.240386e-01 1.736e-01 6.159e+01
DS10 -7.236768e+04 3.448583e-02 9.153e-02 4.203e+01
DS11 -7.236758e+04 9.438140e-02 5.162e-02 4.472e+01
DS12 -7.236759e+04 -1.089709e-02 1.819e-02 5.054e+01
DS13 -7.236758e+04 1.661006e-02 1.051e-02 5.609e+01
DS14 -7.236758e+04 -5.248819e-03 3.079e-03 6.842e+01
DS15 -7.236759e+04 -4.818104e-03 1.609e-03 6.036e+01
DS16 -7.236759e+04 -1.080650e-03 1.404e-03 3.462e+01
DS17 -7.236759e+04 -5.889374e-04 7.965e-04 3.538e+01
DS18 -7.236759e+04 -8.717323e-04 4.672e-04 3.965e+01
DS19 -7.236759e+04 -1.071941e-03 2.425e-04 4.071e+01
DS20 -7.236759e+04 -9.163324e-04 1.238e-04 3.490e+01
DS21 -7.236759e+04 -1.038639e-03 7.681e-05 3.492e+01
DS22 -7.236759e+04 -6.888046e-04 3.288e-05 3.483e+01
DS23 -7.236759e+04 -6.230731e-04 1.480e-05 4.095e+01
DS24 -7.236759e+04 -2.886406e-04 4.820e-06 4.401e+01
DS25 -7.236760e+04 -1.243220e-04 2.169e-06 4.395e+01
DS26 -7.236760e+04 -1.387674e-04 9.855e-07 6.622e+01
----------------------------------------------------------------
TOTAL-STRESS (KBAR)
----------------------------------------------------------------
-158.1040502706 0.3794187119 0.1579351324
0.3794187119 -158.0085337633 0.4517325805
0.1579351324 0.4517325805 -167.5477591445
----------------------------------------------------------------
TOTAL-PRESSURE: -161.220114 KBAR
TIME STATISTICS
-------------------------------------------------------------------------------------
CLASS_NAME NAME TIME(Sec) CALLS AVG(Sec) PER(%)
-------------------------------------------------------------------------------------
total 1307.59 17 76.92 100.00
Driver reading 0.01 1 0.01 0.00
Input Init 0.01 1 0.01 0.00
Input_Conv Convert 0.00 1 0.00 0.00
Driver driver_line 1307.58 1 1307.58 100.00
UnitCell check_tau 0.00 1 0.00 0.00
PW_Basis_Sup setuptransform 0.01 1 0.01 0.00
PW_Basis_Sup distributeg 0.01 1 0.01 0.00
mymath heapsort 0.01 147 0.00 0.00
Symmetry analy_sys 0.00 1 0.00 0.00
PW_Basis_K setuptransform 0.01 1 0.01 0.00
PW_Basis_K distributeg 0.01 1 0.01 0.00
PW_Basis setup_struc_factor 0.27 1 0.27 0.02
ppcell_vnl init 0.21 1 0.21 0.02
ppcell_vl init_vloc 2.31 1 2.31 0.18
ppcell_vnl init_vnl 0.10 1 0.10 0.01
WF_atomic init_at_1 0.04 1 0.04 0.00
wavefunc wfcinit 0.00 1 0.00 0.00
Ions opt_ions 1304.41 1 1304.41 99.76
ESolver_KS_PW run 1280.02 1 1280.02 97.89
H_Ewald_pw compute_ewald 0.01 1 0.01 0.00
Charge set_rho_core 0.00 1 0.00 0.00
Charge atomic_rho 2.57 1 2.57 0.20
PW_Basis_Sup recip2real 2.75 192 0.01 0.21
PW_Basis_Sup gathers_scatterp 1.03 192 0.01 0.08
Potential init_pot 0.29 1 0.29 0.02
Potential update_from_charge 7.08 27 0.26 0.54
Potential cal_fixed_v 0.02 1 0.02 0.00
PotLocal cal_fixed_v 0.02 1 0.02 0.00
Potential cal_v_eff 7.03 27 0.26 0.54
H_Hartree_pw v_hartree 0.96 27 0.04 0.07
PW_Basis_Sup real2recip 3.24 247 0.01 0.25
PW_Basis_Sup gatherp_scatters 1.23 247 0.00 0.09
PotXC cal_v_eff 5.98 27 0.22 0.46
XC_Functional v_xc 5.94 27 0.22 0.45
Potential interpolate_vrs 0.03 27 0.00 0.00
Symmetry rhog_symmetry 12.33 27 0.46 0.94
Symmetry group fft grids 7.41 27 0.27 0.57
Charge_Mixing init_mixing 0.00 1 0.00 0.00
ESolver_KS_PW hamilt2density 1268.04 26 48.77 96.97
HSolverPW solve 1254.09 26 48.23 95.91
Nonlocal getvnl 5.08 26 0.20 0.39
pp_cell_vnl getvnl 5.45 28 0.19 0.42
Structure_Factor get_sk 0.68 578 0.00 0.05
WF_atomic atomic_wfc 0.10 1 0.10 0.01
DiagoIterAssist diagH_subspace 12.32 1 12.32 0.94
Operator hPsi 676.02 307 2.20 51.70
Operator EkineticPW 5.54 307 0.02 0.42
Operator VeffPW 492.25 307 1.60 37.65
PW_Basis_K recip2real 348.28 47332 0.01 26.64
PW_Basis_K gathers_scatterp 51.81 47332 0.00 3.96
PW_Basis_K real2recip 155.26 33708 0.00 11.87
PW_Basis_K gatherp_scatters 23.32 33708 0.00 1.78
Operator NonlocalPW 178.22 307 0.58 13.63
Nonlocal add_nonlocal_pp 88.01 307 0.29 6.73
DiagoIterAssist diagH_LAPACK 0.43 1 0.43 0.03
Diago_DavSubspace diag_once 1115.35 26 42.90 85.30
Diago_DavSubspace first 296.27 26 11.40 22.66
Diago_DavSubspace cal_elem 112.49 306 0.37 8.60
Diago_DavSubspace diag_zhegvx 183.49 306 0.60 14.03
Diago_DavSubspace cal_grad 475.12 280 1.70 36.34
Diago_DavSubspace check_update 0.00 280 0.00 0.00
Diago_DavSubspace last 81.74 60 1.36 6.25
Diago_DavSubspace refresh 34.62 34 1.02 2.65
ElecStatePW psiToRho 119.78 26 4.61 9.16
Charge_Mixing get_drho 0.76 26 0.03 0.06
Charge_Mixing inner_product_recip_rho 0.03 26 0.00 0.00
Charge mix_rho 0.89 25 0.04 0.07
Charge Pulay_mixing 0.17 25 0.01 0.01
Charge_Mixing inner_product_recip_hartree 0.17 172 0.00 0.01
Forces cal_force_loc 0.27 1 0.27 0.02
Forces cal_force_ew 0.25 1 0.25 0.02
Forces cal_force_nl 4.43 1 4.43 0.34
Forces cal_force_cc 0.00 1 0.00 0.00
Forces cal_force_scc 2.92 1 2.92 0.22
Stress_PW cal_stress 16.52 1 16.52 1.26
Stress_Func stress_kin 0.14 1 0.14 0.01
Stress_Func stress_har 0.02 1 0.02 0.00
Stress_Func stress_ewa 0.27 1 0.27 0.02
Stress_Func stress_gga 0.12 1 0.12 0.01
Stress_Func stress_loc 3.32 1 3.32 0.25
Stress_Func stress_cc 0.00 1 0.00 0.00
Stress_Func stress_nl 12.56 1 12.56 0.96
ModuleIO write_istate_info 0.01 1 0.01 0.00
-------------------------------------------------------------------------------------
START Time : Thu Apr 11 16:54:31 2024
FINISH Time : Thu Apr 11 17:16:19 2024
TOTAL Time : 1308
SEE INFORMATION IN : OUT.autotest/
OMP_NUM_THREADS=1 mpirun -np 16 ~/abacus-develop/build/abacus
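To compare per-step cost between runs, the TIME(s) column of the SCF table above can be averaged with a short script. This is a sketch that assumes the screen output was saved to a file (the name `abacus.out` is hypothetical) and that iteration lines start with a `DS<n>` label and end with the time, as in the output above; `scf_time_summary` is an illustrative name.

```shell
# Hypothetical helper: print "<number of SCF iterations> <average time per iteration in s>"
# for a saved ABACUS screen output whose iteration lines look like
#   DS1 -7.185066e+04 0.000000e+00 8.592e+01 1.444e+02
scf_time_summary() {
    awk '
        # Iteration lines: first field is DS followed by digits, last field is TIME(s).
        $1 ~ /^DS[0-9]+$/ { n++; sum += $NF }
        END { if (n) printf "%d %.2f\n", n, sum / n }
    ' "$1"
}

# Usage, assuming the output was redirected to abacus.out:
# scf_time_summary abacus.out
```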
Does this efficiency seem normal? You can use the dav_subspace method as the ks_solver for the pw basis.
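Switching the solver means setting the `ks_solver` key in the ABACUS INPUT file. The sketch below assumes the INPUT file uses one whitespace-separated `key value` pair per line; `set_input_param` is a hypothetical helper written for illustration, not an ABACUS tool.

```shell
# Hypothetical helper: set a "key value" pair in an ABACUS INPUT file,
# replacing an existing line for that key or appending it if absent.
set_input_param() {
    file="$1"; key="$2"; value="$3"
    tmp="$(mktemp)"
    awk -v k="$key" -v v="$value" '
        $1 == k { print k, v; found = 1; next }   # overwrite the existing setting
        { print }                                 # keep every other line unchanged
        END { if (!found) print k, v }            # append when the key was missing
    ' "$file" > "$tmp" && mv "$tmp" "$file"
}

# Usage, run in the directory that contains INPUT:
# set_input_param INPUT ks_solver dav_subspace
```

Editing INPUT by hand works just as well; the helper is only convenient when the same change is scripted across several test directories.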
@Satinelamp
Details
I am trying to test the pseudopotential of the Ce element with a pw calculation in ABACUS, but each electronic step seems to take about 150 s, whereas in VASP it takes only 1 s. I use 128 cores for ABACUS and 64 cores for VASP.
Here is the link to the pseudopotential of the Ce element: https://github.com/simonpintarelli/lanthanides-nc-pseudos/blob/master/Ce/PROJECT-Ce.UPF-1.upf
Here is the input file I use: input.zip
Here is the output of ABACUS: