Open xdzhu opened 3 days ago
Even in the nspin=1 case, 3.7.5 is much slower than 3.6.5.
As you can see, 3.7.5 gives:
while 3.6.5 gives:
When I set ks_solver to scalapack_gvx instead of genelpa, the slowdown remains:
3.7.5
3.6.5
@xdzhu What're your ABACUS installation dependencies?
I compared the time cost of these two versions. The slowdown seems to arise from the ESolver_KS_LCAO - runner and HSolverLCAO - solve modules.
@xdzhu What're your ABACUS installation dependencies?
Both were built with Intel oneAPI 2023.1.0 and GCC 13.1.0.
3.6.5 uses LibRI_0.1.0_loop3; 3.7.5 uses LibRI_0.2.0.
I have noticed that for the 3.7.x versions I used the icpx and mpicxx compilers instead of icpc and mpiicpc, which I used to compile the 3.6.5 version.
When I change CXX and MPI_CXX to icpc and mpiicpc and recompile the 3.7.5 version, it runs faster than the icpx build, and the performance is nearly the same as the 3.6.5 version.
3.7.5 with icpc
3.7.5 with icpx
3.6.5 with icpc
@xdzhu What is your hardware setup?
@QuantumMisaka The compute node has an Intel(R) Xeon(R) Gold 6248 CPU @ 2.50GHz (2*20C), 40 cores in total, and I run ABACUS with the following command: mpirun -np 10 -genv OMP_THREADS_NUM=4 abacus
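The launch line above implies 10 MPI ranks with 4 OpenMP threads each on a 40-core node. The following is a minimal sketch of that arithmetic for anyone reproducing the run; the helper name check_layout is hypothetical, and the 4 threads per rank is an assumption taken from the value in the posted command.

```python
def check_layout(mpi_ranks: int, omp_threads: int, physical_cores: int) -> str:
    """Classify a hybrid MPI+OpenMP layout against the node's core count."""
    used = mpi_ranks * omp_threads
    if used > physical_cores:
        return "oversubscribed"
    if used < physical_cores:
        return "undersubscribed"
    return "fully subscribed"


# Values from the comment above: 2 sockets x 20 cores, mpirun -np 10.
physical_cores = 2 * 20
mpi_ranks = 10

# Assumption: 4 threads per rank, as intended by the posted command.
omp_threads = 4

layout = check_layout(mpi_ranks, omp_threads, physical_cores)
print(f"{mpi_ranks} ranks x {omp_threads} threads on {physical_cores} cores: {layout}")
```

One detail that may be worth double-checking: the OpenMP specification names the thread-count variable OMP_NUM_THREADS, while the command as posted sets OMP_THREADS_NUM, so the actual number of threads per rank could differ from 4 depending on the runtime's default.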
Details
Recently I performed an SOC + EXX calculation. You can check the INPUT and output files in hse-3.6vs3.7-lowerspeed.zip.
With version 3.6.5 the speed is OK: every PBE step takes 13 s and each EXX step takes 178 s, although the PBE steps between successive EXX steps are slower.
When I switch to 3.7.5, it is much slower: every PBE step takes 43 s and each EXX step takes 270 s, which is about 3.3 times and 1.5 times the 3.6.5 timings above, respectively.
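To put numbers on the slowdown, the per-step timings quoted here can be turned into ratios. This is a minimal sketch that uses only the four figures from this comment; the function name slowdown is made up for illustration.

```python
# Per-step wall times in seconds, copied from the comment above.
timings = {
    "PBE step": {"3.6.5": 13.0, "3.7.5": 43.0},
    "EXX step": {"3.6.5": 178.0, "3.7.5": 270.0},
}


def slowdown(old: float, new: float) -> float:
    """Return how many times longer `new` takes relative to `old`."""
    return new / old


ratios = {name: slowdown(t["3.6.5"], t["3.7.5"]) for name, t in timings.items()}

for name, ratio in ratios.items():
    print(f"{name}: 3.7.5 takes {ratio:.2f}x the time of 3.6.5")
```

With these figures the PBE step takes roughly 3.3 times as long in 3.7.5 and the EXX step roughly 1.5 times as long, so the regression is concentrated in the PBE part of the cycle.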