deepmodeling / abacus-develop

An electronic structure package based on either plane wave basis or numerical atomic orbitals.
http://abacus.ustc.edu.cn
GNU Lesser General Public License v3.0
174 stars 136 forks

the pw calculation of Ce element is very slow #2636

Closed. Satinelamp closed this issue 7 months ago

Satinelamp commented 1 year ago

Details

I am trying to test the pseudopotential for the Ce element with a plane-wave (pw) calculation in ABACUS, but each electronic step takes about 150 s, whereas in VASP it takes only 1 s. I use 128 cores for ABACUS and 64 cores for VASP.

Here is the link to the pseudopotential for the Ce element: https://github.com/simonpintarelli/lanthanides-nc-pseudos/blob/master/Ce/PROJECT-Ce.UPF-1.upf
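
To put the two timings on a comparable footing, here is a minimal sketch that converts them to core-seconds per electronic step. It only uses the figures quoted in this report; the helper name `core_seconds` is made up for illustration, and the comparison ignores differences in pseudopotential type, cutoff, and parallel efficiency between the two codes.

```python
# Figures quoted in the report: seconds per electronic step and cores used.
abacus_step_s, abacus_cores = 150.0, 128
vasp_step_s, vasp_cores = 1.0, 64


def core_seconds(step_s, cores):
    """Cost of one electronic step in core-seconds (hypothetical helper)."""
    return step_s * cores


# Ratio of per-step cost, ABACUS relative to VASP.
ratio = core_seconds(abacus_step_s, abacus_cores) / core_seconds(vasp_step_s, vasp_cores)
print(f"ABACUS uses about {ratio:.0f}x more core-seconds per electronic step")
```

With the reported numbers this works out to a factor of 300. Note that the log header below lists 64 processors; if the ABACUS run actually used 64 cores, set `abacus_cores = 64` and the factor becomes 150.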

Here is the input file I use: input.zip

Here is the output of abacus:


                              ABACUS v3.2.4

               Atomic-orbital Based Ab-initio Computation at UStc                    

                     Website: http://abacus.ustc.edu.cn/                             
               Documentation: https://abacus.deepmodeling.com/                       
                  Repository: https://github.com/abacusmodeling/abacus-develop       
                              https://github.com/deepmodeling/abacus-develop         
                      Commit: unknown

 Wed Jun 14 16:54:36 2023
 MAKE THE DIR         : OUT.autotest/

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 Warning: the number of valence electrons in pseudopotential > 10 for Ni: [Ar] 3d8 4s2
 Warning: the number of valence electrons in pseudopotential > 4 for Ce: [Xe] 4f1 5d1 6s2
 Pseudopotentials with additional electrons can yield (more) accurate outcomes, but may be less efficient.
 If you're confident that your chosen pseudopotential is appropriate, you can safely ignore this warning.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

 UNIFORM GRID DIM     : 250 * 100 * 100
 UNIFORM GRID DIM(BIG): 250 * 100 * 100
 DONE(0.286461   SEC) : SETUP UNITCELL
 DONE(0.3386     SEC) : SYMMETRY
 DONE(0.606937   SEC) : INIT K-POINTS
 ---------------------------------------------------------
 Self-consistent calculations for electrons
 ---------------------------------------------------------
 SPIN    KPOINTS         PROCESSORS  
 1       1               64          
 ---------------------------------------------------------
 Use plane wave basis
 ---------------------------------------------------------
 ELEMENT NATOM       XC          
 H       2           
 O       72          
 Ni      1           
 Ce      35          
 ---------------------------------------------------------
 Initial plane wave basis and FFT box
 ---------------------------------------------------------
 DONE(0.660592   SEC) : INIT PLANEWAVE
 MEMORY FOR PSI (MB)  : 16.4709
 DONE(0.984355   SEC) : LOCAL POTENTIAL
 DONE(1.06864    SEC) : NON-LOCAL POTENTIAL
 DONE(1.1041     SEC) : INIT BASIS
 -------------------------------------------
 SELF-CONSISTENT : 
 -------------------------------------------
 START CHARGE      : atomic
 DONE(1.93276    SEC) : INIT SCF
 ITER   ETOT(eV)       EDIFF(eV)      DRHO       TIME(s)    
 CG1    -7.196932e+04  0.000000e+00   5.963e+01  1.668e+02  
 CG2    -7.157562e+04  3.937039e+02   2.072e+02  7.949e+01  
 CG3    -6.707738e+04  4.498234e+03   9.652e+03  1.452e+02  
 CG4    -7.037669e+04  -3.299304e+03  6.335e+03  2.833e+02  
 CG5    -7.242059e+04  -2.043899e+03  2.003e+01  3.244e+02  
 CG6    -7.196953e+04  4.510587e+02   4.318e+03  4.077e+02  
dyzheng commented 1 year ago

I will test this case, thanks for your report.

Satinelamp commented 1 year ago

FYI, here are the installation commands I used for ABACUS.

Intel OneAPI:
wget https://registrationcenter-download.intel.com/akdlm/irc_nas/18236/l_BaseKit_p_2021.4.0.3422_offline.sh
bash l_BaseKit_p_2021.4.0.3422_offline.sh
wget https://registrationcenter-download.intel.com/akdlm/irc_nas/18211/l_HPCKit_p_2021.4.0.3347_offline.sh
bash l_HPCKit_p_2021.4.0.3347_offline.sh
source ~/intel/oneapi/setvars.sh

cmake:
wget https://github.com/Kitware/CMake/releases/download/v3.25.1/cmake-3.25.1-linux-x86_64.tar.gz
tar -zxvf cmake-3.25.1-linux-x86_64.tar.gz
export PATH=/home/alma/software/cmake-3.25.1-linux-x86_64/bin:${PATH}

ELPA:
Download ELPA
tar xzf elpa-2021.05.002.tar.gz
cd elpa-2021.05.002 
mkdir build 
cd build 
CC=mpiicc CXX=mpiicpc FC=mpiifort ../configure --prefix=/home/alma/software/elpa_install FCFLAGS="-qmkl=cluster"
make -j9 
make install
ln -s /home/alma/software/elpa_install/include/elpa-2021.05.002/elpa /home/alma/software/elpa_install/include/ 

Cereal:
git clone https://github.com/USCiLab/cereal.git

LibXC:
tar xzf libxc-5.2.3.tar.gz
cd libxc-5.2.3
mkdir build
cmake -B build -DCMAKE_C_COMPILER=mpiicc -DCMAKE_INSTALL_PREFIX=/home/alma/software/libxc_install
cmake --build build
cmake --install build

ABACUS:
cd abacus-develop
cmake -DCMAKE_CXX_COMPILER=mpiicpc -DMPI_CXX_COMPILER=mpiicpc -B build -DCMAKE_INSTALL_PREFIX=~/software/abacus_install -DELPA_DIR=~/software/elpa_install -DCEREAL_INCLUDE_DIR=~/software/cereal/include -DLibxc_DIR=~/software/libxc_install

cmake --build build -j4
cmake --install build
hongriTianqi commented 1 year ago
haozhihan commented 7 months ago
                              ABACUS v3.6.0

               Atomic-orbital Based Ab-initio Computation at UStc                    

                     Website: http://abacus.ustc.edu.cn/                             
               Documentation: https://abacus.deepmodeling.com/                       
                  Repository: https://github.com/abacusmodeling/abacus-develop       
                              https://github.com/deepmodeling/abacus-develop         
                      Commit: 21d40ce93 (Tue Apr 9 22:54:12 2024 +0800)

 Thu Apr 11 16:54:31 2024
 MAKE THE DIR         : OUT.autotest/
 RUNNING WITH DEVICE  : CPU / Intel(R) Xeon(R) Gold 6132 CPU @ 2.60GHz

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 Warning: the number of valence electrons in pseudopotential > 10 for Ni: [Ar] 3d8 4s2
 Warning: the number of valence electrons in pseudopotential > 4 for Ce: [Xe] 4f1 5d1 6s2
 Pseudopotentials with additional electrons can yield (more) accurate outcomes, but may be less efficient.
 If you're confident that your chosen pseudopotential is appropriate, you can safely ignore this warning.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

 UNIFORM GRID DIM        : 250 * 100 * 100
 UNIFORM GRID DIM(BIG)   : 250 * 100 * 100
 DONE(0.0833994  SEC) : SETUP UNITCELL
 DONE(0.113745   SEC) : SYMMETRY
 DONE(0.198256   SEC) : INIT K-POINTS
 ---------------------------------------------------------
 Self-consistent calculations for electrons
 ---------------------------------------------------------
 SPIN    KPOINTS         PROCESSORS  
 1       1               16          
 ---------------------------------------------------------
 Use plane wave basis
 ---------------------------------------------------------
 ELEMENT NATOM       XC          
 H       2           
 O       72          
 Ni      1           
 Ce      35          
 ---------------------------------------------------------
 Initial plane wave basis and FFT box
 ---------------------------------------------------------
 DONE(0.213358   SEC) : INIT PLANEWAVE
 MEMORY FOR PSI (MB)  : 65.8198
 DONE(3.01542    SEC) : LOCAL POTENTIAL
 DONE(3.11132    SEC) : NON-LOCAL POTENTIAL
 DONE(3.14832    SEC) : INIT BASIS
 -------------------------------------------
 SELF-CONSISTENT : 
 -------------------------------------------
 START CHARGE      : atomic
 DONE(6.56853    SEC) : INIT SCF
 ITER   ETOT(eV)       EDIFF(eV)      DRHO       TIME(s)    
 DS1    -7.185066e+04  0.000000e+00   8.592e+01  1.444e+02  
 DS2    -7.196601e+04  -1.153577e+02  2.813e+02  5.443e+01  
 DS3    -7.221121e+04  -2.451923e+02  1.561e+02  3.661e+01  
 DS4    -7.235114e+04  -1.399348e+02  8.945e+00  3.490e+01  
 DS5    -7.236605e+04  -1.491242e+01  1.434e+00  3.892e+01  
 DS6    -7.236209e+04  3.959379e+00   5.053e+00  6.187e+01  
 DS7    -7.235742e+04  4.675447e+00   1.038e+01  3.606e+01  
 DS8    -7.236794e+04  -1.051882e+01  3.450e-01  3.554e+01  
 DS9    -7.236771e+04  2.240386e-01   1.736e-01  6.159e+01  
 DS10   -7.236768e+04  3.448583e-02   9.153e-02  4.203e+01  
 DS11   -7.236758e+04  9.438140e-02   5.162e-02  4.472e+01  
 DS12   -7.236759e+04  -1.089709e-02  1.819e-02  5.054e+01  
 DS13   -7.236758e+04  1.661006e-02   1.051e-02  5.609e+01  
 DS14   -7.236758e+04  -5.248819e-03  3.079e-03  6.842e+01  
 DS15   -7.236759e+04  -4.818104e-03  1.609e-03  6.036e+01  
 DS16   -7.236759e+04  -1.080650e-03  1.404e-03  3.462e+01  
 DS17   -7.236759e+04  -5.889374e-04  7.965e-04  3.538e+01  
 DS18   -7.236759e+04  -8.717323e-04  4.672e-04  3.965e+01  
 DS19   -7.236759e+04  -1.071941e-03  2.425e-04  4.071e+01  
 DS20   -7.236759e+04  -9.163324e-04  1.238e-04  3.490e+01  
 DS21   -7.236759e+04  -1.038639e-03  7.681e-05  3.492e+01  
 DS22   -7.236759e+04  -6.888046e-04  3.288e-05  3.483e+01  
 DS23   -7.236759e+04  -6.230731e-04  1.480e-05  4.095e+01  
 DS24   -7.236759e+04  -2.886406e-04  4.820e-06  4.401e+01  
 DS25   -7.236760e+04  -1.243220e-04  2.169e-06  4.395e+01  
 DS26   -7.236760e+04  -1.387674e-04  9.855e-07  6.622e+01  
----------------------------------------------------------------
TOTAL-STRESS (KBAR)                                           
----------------------------------------------------------------
     -158.1040502706         0.3794187119         0.1579351324
        0.3794187119      -158.0085337633         0.4517325805
        0.1579351324         0.4517325805      -167.5477591445
----------------------------------------------------------------
 TOTAL-PRESSURE: -161.220114 KBAR

TIME STATISTICS
-------------------------------------------------------------------------------------
     CLASS_NAME                 NAME             TIME(Sec)  CALLS   AVG(Sec) PER(%)
-------------------------------------------------------------------------------------
                     total                       1307.59         17  76.92   100.00
Driver               reading                       0.01           1   0.01     0.00
Input                Init                          0.01           1   0.01     0.00
Input_Conv           Convert                       0.00           1   0.00     0.00
Driver               driver_line                 1307.58          1 1307.58  100.00
UnitCell             check_tau                     0.00           1   0.00     0.00
PW_Basis_Sup         setuptransform                0.01           1   0.01     0.00
PW_Basis_Sup         distributeg                   0.01           1   0.01     0.00
mymath               heapsort                      0.01         147   0.00     0.00
Symmetry             analy_sys                     0.00           1   0.00     0.00
PW_Basis_K           setuptransform                0.01           1   0.01     0.00
PW_Basis_K           distributeg                   0.01           1   0.01     0.00
PW_Basis             setup_struc_factor            0.27           1   0.27     0.02
ppcell_vnl           init                          0.21           1   0.21     0.02
ppcell_vl            init_vloc                     2.31           1   2.31     0.18
ppcell_vnl           init_vnl                      0.10           1   0.10     0.01
WF_atomic            init_at_1                     0.04           1   0.04     0.00
wavefunc             wfcinit                       0.00           1   0.00     0.00
Ions                 opt_ions                    1304.41          1 1304.41   99.76
ESolver_KS_PW        run                         1280.02          1 1280.02   97.89
H_Ewald_pw           compute_ewald                 0.01           1   0.01     0.00
Charge               set_rho_core                  0.00           1   0.00     0.00
Charge               atomic_rho                    2.57           1   2.57     0.20
PW_Basis_Sup         recip2real                    2.75         192   0.01     0.21
PW_Basis_Sup         gathers_scatterp              1.03         192   0.01     0.08
Potential            init_pot                      0.29           1   0.29     0.02
Potential            update_from_charge            7.08          27   0.26     0.54
Potential            cal_fixed_v                   0.02           1   0.02     0.00
PotLocal             cal_fixed_v                   0.02           1   0.02     0.00
Potential            cal_v_eff                     7.03          27   0.26     0.54
H_Hartree_pw         v_hartree                     0.96          27   0.04     0.07
PW_Basis_Sup         real2recip                    3.24         247   0.01     0.25
PW_Basis_Sup         gatherp_scatters              1.23         247   0.00     0.09
PotXC                cal_v_eff                     5.98          27   0.22     0.46
XC_Functional        v_xc                          5.94          27   0.22     0.45
Potential            interpolate_vrs               0.03          27   0.00     0.00
Symmetry             rhog_symmetry                12.33          27   0.46     0.94
Symmetry             group fft grids               7.41          27   0.27     0.57
Charge_Mixing        init_mixing                   0.00           1   0.00     0.00
ESolver_KS_PW        hamilt2density              1268.04         26  48.77    96.97
HSolverPW            solve                       1254.09         26  48.23    95.91
Nonlocal             getvnl                        5.08          26   0.20     0.39
pp_cell_vnl          getvnl                        5.45          28   0.19     0.42
Structure_Factor     get_sk                        0.68         578   0.00     0.05
WF_atomic            atomic_wfc                    0.10           1   0.10     0.01
DiagoIterAssist      diagH_subspace               12.32           1  12.32     0.94
Operator             hPsi                        676.02         307   2.20    51.70
Operator             EkineticPW                    5.54         307   0.02     0.42
Operator             VeffPW                      492.25         307   1.60    37.65
PW_Basis_K           recip2real                  348.28       47332   0.01    26.64
PW_Basis_K           gathers_scatterp             51.81       47332   0.00     3.96
PW_Basis_K           real2recip                  155.26       33708   0.00    11.87
PW_Basis_K           gatherp_scatters             23.32       33708   0.00     1.78
Operator             NonlocalPW                  178.22         307   0.58    13.63
Nonlocal             add_nonlocal_pp              88.01         307   0.29     6.73
DiagoIterAssist      diagH_LAPACK                  0.43           1   0.43     0.03
Diago_DavSubspace    diag_once                   1115.35         26  42.90    85.30
Diago_DavSubspace    first                       296.27          26  11.40    22.66
Diago_DavSubspace    cal_elem                    112.49         306   0.37     8.60
Diago_DavSubspace    diag_zhegvx                 183.49         306   0.60    14.03
Diago_DavSubspace    cal_grad                    475.12         280   1.70    36.34
Diago_DavSubspace    check_update                  0.00         280   0.00     0.00
Diago_DavSubspace    last                         81.74          60   1.36     6.25
Diago_DavSubspace    refresh                      34.62          34   1.02     2.65
ElecStatePW          psiToRho                    119.78          26   4.61     9.16
Charge_Mixing        get_drho                      0.76          26   0.03     0.06
Charge_Mixing        inner_product_recip_rho       0.03          26   0.00     0.00
Charge               mix_rho                       0.89          25   0.04     0.07
Charge               Pulay_mixing                  0.17          25   0.01     0.01
Charge_Mixing        inner_product_recip_hartree   0.17         172   0.00     0.01
Forces               cal_force_loc                 0.27           1   0.27     0.02
Forces               cal_force_ew                  0.25           1   0.25     0.02
Forces               cal_force_nl                  4.43           1   4.43     0.34
Forces               cal_force_cc                  0.00           1   0.00     0.00
Forces               cal_force_scc                 2.92           1   2.92     0.22
Stress_PW            cal_stress                   16.52           1  16.52     1.26
Stress_Func          stress_kin                    0.14           1   0.14     0.01
Stress_Func          stress_har                    0.02           1   0.02     0.00
Stress_Func          stress_ewa                    0.27           1   0.27     0.02
Stress_Func          stress_gga                    0.12           1   0.12     0.01
Stress_Func          stress_loc                    3.32           1   3.32     0.25
Stress_Func          stress_cc                     0.00           1   0.00     0.00
Stress_Func          stress_nl                    12.56           1  12.56     0.96
ModuleIO             write_istate_info             0.01           1   0.01     0.00
-------------------------------------------------------------------------------------

 START  Time  : Thu Apr 11 16:54:31 2024
 FINISH Time  : Thu Apr 11 17:16:19 2024
 TOTAL  Time  : 1308
 SEE INFORMATION IN : OUT.autotest/
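
For anyone who wants to rank the hot spots in a TIME STATISTICS table like the one above, here is a rough Python sketch. It assumes the last four whitespace-separated fields of each row are TIME, CALLS, AVG and PER(%), as they appear in this log; `parse_timing_rows` is a hypothetical helper, not part of ABACUS. The sample rows are copied from the table above.

```python
def parse_timing_rows(log_text):
    """Parse rows of a TIME STATISTICS table into (label, time_s, calls, percent).

    Splits from the right so that names containing spaces still parse.
    Header and separator lines are skipped.
    """
    rows = []
    for line in log_text.splitlines():
        parts = line.rsplit(None, 4)
        if len(parts) != 5:
            continue  # separator or blank line
        label, time_s, calls, _avg_s, percent = parts
        try:
            rows.append((" ".join(label.split()), float(time_s), int(calls), float(percent)))
        except ValueError:
            continue  # header line
    return rows


sample = """\
     CLASS_NAME                 NAME             TIME(Sec)  CALLS   AVG(Sec) PER(%)
Operator             hPsi                        676.02         307   2.20    51.70
Operator             VeffPW                      492.25         307   1.60    37.65
PW_Basis_K           recip2real                  348.28       47332   0.01    26.64
Operator             NonlocalPW                  178.22         307   0.58    13.63
Diago_DavSubspace    diag_zhegvx                 183.49         306   0.60    14.03
"""

rows = parse_timing_rows(sample)
# Print the most expensive timers first.
for label, time_s, calls, percent in sorted(rows, key=lambda r: r[1], reverse=True):
    print(f"{label:32s} {time_s:10.2f} s  {calls:6d} calls  {percent:6.2f} %")
```

The percentages in the table add up to well over 100, so the timers overlap and the rows should be read as a ranking rather than as disjoint shares of the run time.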

OMP_NUM_THREADS=1 mpirun -np 16 ~/abacus-develop/build/abacus

Does this performance look normal to you? For the pw basis, you can set ks_solver to dav_subspace. @Satinelamp
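
As a minimal sketch of applying the suggestion above, the snippet below sets `ks_solver` to `dav_subspace` in the text of an INPUT file. The helper `set_input_param` is hypothetical, and the sample INPUT content is an assumption for illustration only, since the attached input.zip is not reproduced in this thread. It assumes the usual one `key value` pair per line layout.

```python
def set_input_param(input_text, key, value):
    """Return INPUT text with `key` set to `value`.

    Replaces an existing `key ...` line, or appends one if the key is absent.
    Commented-out entries (e.g. '#ks_solver cg') are left untouched.
    """
    out, found = [], False
    for line in input_text.splitlines():
        tokens = line.split()
        if tokens and tokens[0] == key:
            out.append(f"{key} {value}")
            found = True
        else:
            out.append(line)
    if not found:
        out.append(f"{key} {value}")
    return "\n".join(out) + "\n"


# Hypothetical INPUT content, not the file from input.zip.
sample = """INPUT_PARAMETERS
basis_type pw
ks_solver cg
"""

patched = set_input_param(sample, "ks_solver", "dav_subspace")
print(patched)
```

To use it on a real file, read the INPUT file into a string, pass it through `set_input_param`, and write the result back after keeping a backup copy.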
