Open kirk0830 opened 1 month ago
I ran test case 601_NO_TDDFT_CO on my workstation, and it appears that hsolver causes another problem: AddressSanitizer reports a SEGV whose call stack passes through hsolver::DiagoScalapack.
@lyb9812, could you look into this issue?
AddressSanitizer:DEADLYSIGNAL
=================================================================
==2700559==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000010 (pc 0x7433824e138f bp 0x000000000001 sp 0x7ffea2907b00 T0)
==2700559==The signal is caused by a READ memory access.
==2700559==Hint: address points to the zero page.
#0 0x7433824e138f in zdotc_ (/opt/intel/oneapi/mkl/2022.2.0/lib/intel64/libmkl_intel_lp64.so.2+0x2e138f)
#1 0x5e0050c17066 in pzpotf2_ (/home/liuyu/github/abacus-develop/build/abacus+0x18c8066)
#2 0x5e0050ba798c in pzpotrf_ (/home/liuyu/github/abacus-develop/build/abacus+0x185898c)
#3 0x5e0050ba322d in pzhegvx_ (/home/liuyu/github/abacus-develop/build/abacus+0x185422d)
#4 0x5e004ffc9bd2 in hsolver::DiagoScalapack<std::complex<double> >::pzhegvx_once(int const*, int, int, std::complex<double> const*, std::complex<double> const*, double*, psi::Psi<std::complex<double>, base_device::DEVICE_CPU>&) const /home/liuyu/github/abacus-develop/source/module_hsolver/diago_scalapack.cpp:285
#5 0x5e004ffccb0d in hsolver::DiagoScalapack<std::complex<double> >::pzhegvx_diag(int const*, int, int, std::complex<double> const*, std::complex<double> const*, double*, psi::Psi<std::complex<double>, base_device::DEVICE_CPU>&) /home/liuyu/github/abacus-develop/source/module_hsolver/diago_scalapack.cpp:371
#6 0x5e004ffb45ba in hsolver::DiagoScalapack<std::complex<double> >::diag(hamilt::Hamilt<std::complex<double>, base_device::DEVICE_CPU>*, psi::Psi<std::complex<double>, base_device::DEVICE_CPU>&, double*) /home/liuyu/github/abacus-develop/source/module_hsolver/diago_scalapack.cpp:44
#7 0x5e004ffae0e9 in hsolver::HSolverLCAO<std::complex<double>, base_device::DEVICE_CPU>::hamiltSolvePsiK(hamilt::Hamilt<std::complex<double>, base_device::DEVICE_CPU>*, psi::Psi<std::complex<double>, base_device::DEVICE_CPU>&, double*) /home/liuyu/github/abacus-develop/source/module_hsolver/hsolver_lcao.cpp:135
#8 0x5e004ffb1ea6 in hsolver::HSolverLCAO<std::complex<double>, base_device::DEVICE_CPU>::solve(hamilt::Hamilt<std::complex<double>, base_device::DEVICE_CPU>*, psi::Psi<std::complex<double>, base_device::DEVICE_CPU>&, elecstate::ElecState*, bool) /home/liuyu/github/abacus-develop/source/module_hsolver/hsolver_lcao.cpp:107
#9 0x5e00503fdcf7 in ModuleESolver::ESolver_KS_LCAO_TDDFT::hamilt2density(int, int, double) /home/liuyu/github/abacus-develop/source/module_esolver/esolver_ks_lcao_tddft.cpp:172
#10 0x5e00502ee7ff in ModuleESolver::ESolver_KS<std::complex<double>, base_device::DEVICE_CPU>::runner(int, UnitCell&) /home/liuyu/github/abacus-develop/source/module_esolver/esolver_ks.cpp:474
#11 0x5e004f8cada4 in MD_func::force_virial(ModuleESolver::ESolver*, int const&, UnitCell&, double&, ModuleBase::Vector3<double>*, bool const&, ModuleBase::matrix&) /home/liuyu/github/abacus-develop/source/module_md/md_func.cpp:258
#12 0x5e004f8c3d06 in MD_base::setup(ModuleESolver::ESolver*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /home/liuyu/github/abacus-develop/source/module_md/md_base.cpp:65
#13 0x5e004f8f5752 in Verlet::setup(ModuleESolver::ESolver*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /home/liuyu/github/abacus-develop/source/module_md/verlet.cpp:20
#14 0x5e004f8f01d0 in Run_MD::md_line(UnitCell&, ModuleESolver::ESolver*, Parameter const&) /home/liuyu/github/abacus-develop/source/module_md/run_md.cpp:54
#15 0x5e004fe95d05 in Driver::driver_run() /home/liuyu/github/abacus-develop/source/driver_run.cpp:63
#16 0x5e004fe90030 in Driver::atomic_world() /home/liuyu/github/abacus-develop/source/driver.cpp:186
#17 0x5e004fe94998 in Driver::init() /home/liuyu/github/abacus-develop/source/driver.cpp:40
#18 0x5e004f5d7be9 in main /home/liuyu/github/abacus-develop/source/main.cpp:42
#19 0x743379629d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
#20 0x743379629e3f in __libc_start_main_impl ../csu/libc-start.c:392
#21 0x5e004f60b754 in _start (/home/liuyu/github/abacus-develop/build/abacus+0x2bc754)
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/opt/intel/oneapi/mkl/2022.2.0/lib/intel64/libmkl_intel_lp64.so.2+0x2e138f) in zdotc_
==2700559==ABORTING
Commit 15449beee4e78a44abf02b09b17344acf2994cb1 (PR #3681) works correctly.
Commit bfe2925877609f6c6e2c8f0b4c4020833ae8a9ba (PR #3623) introduced this bug.
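For anyone repeating this kind of search: once a known-good and a known-bad commit bracket a regression, a binary search over the commits between them finds the first bad one. The sketch below is illustrative only and is not how the commits above were necessarily identified. The function name `first_bad_commit`, the `is_bad` callback, and the placeholder commit names are hypothetical; in practice `is_bad` would check out the commit, build with AddressSanitizer, and run 601_NO_TDDFT_CO.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_bad() is true.

    Assumptions (hypothetical helper, not part of ABACUS):
    - commits are ordered oldest to newest,
    - commits[0] is known good and commits[-1] is known bad,
    - every commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return commits[hi]


if __name__ == "__main__":
    # Placeholder history; "c3" stands in for the commit that introduced the crash.
    history = ["c0", "c1", "c2", "c3", "c4", "c5"]
    print(first_bad_commit(history, lambda c: history.index(c) >= 3))
```

`git bisect` automates the same search directly on a repository, so a hand-written loop like this is mainly useful when each test needs a custom build-and-run step.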
I cannot reproduce this issue on my workstation. @kirk0830, could you please check which PR introduced this bug?
@YuLiu98 I will check it ASAP.
Describe the bug
Test case 601_NO_TDDFT_CO
On process id asan.101435
Test case 601_NO_TDDFT_CO_occ
On process id asan.101552
Test case 601_NO_TDDFT_graphene_kpoint
On process id asan.101669
Expected behavior
No response
To Reproduce
No response
Environment
No response
Additional Context
No response
Task list for Issue attackers (only for developers)