deepmodeling / abacus-develop

An electronic structure package based on either plane wave basis or numerical atomic orbitals.
http://abacus.ustc.edu.cn
GNU Lesser General Public License v3.0

Bug: heap-buffer-overflow detected by AddressSanitizer in integrated test cases 601_NO_TDDFT_CO, 601_NO_TDDFT_CO_occ, 601_NO_TDDFT_graphene_kpoint #5229

Open kirk0830 opened 1 month ago

kirk0830 commented 1 month ago

Describe the bug

Test case 601_NO_TDDFT_CO

On process id asan.101435

=================================================================
==101435==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x604000114740 at pc 0x5565d8400555 bp 0x7ffc6b2794a0 sp 0x7ffc6b279490
READ of size 8 at 0x604000114740 thread T0
    #0 0x5565d8400554 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::size() const /usr/include/c++/11/bits/basic_string.h:921
    #1 0x5565d8400554 in std::basic_ostream<char, std::char_traits<char> >& std::operator<< <char, std::char_traits<char>, std::allocator<char> >(std::basic_ostream<char, std::char_traits<char> >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /usr/include/c++/11/bits/basic_string.h:6536
    #2 0x5565d8400554 in MD_func::dump_info(int const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, UnitCell const&, Parameter const&, ModuleBase::matrix const&, ModuleBase::Vector3<double> const*, ModuleBase::Vector3<double> const*) /__w/abacus-develop/abacus-develop/source/module_md/md_func.cpp:385
    #3 0x5565d841ec9a in Run_MD::md_line(UnitCell&, ModuleESolver::ESolver*, Parameter const&) /__w/abacus-develop/abacus-develop/source/module_md/run_md.cpp:89
    #4 0x5565d89d84a7 in Driver::driver_run() /__w/abacus-develop/abacus-develop/source/driver_run.cpp:63
    #5 0x5565d89d1bbe in Driver::atomic_world() /__w/abacus-develop/abacus-develop/source/driver.cpp:186
    #6 0x5565d89d6f36 in Driver::init() /__w/abacus-develop/abacus-develop/source/driver.cpp:40
    #7 0x5565d7fe731a in main /__w/abacus-develop/abacus-develop/source/main.cpp:42
    #8 0x7fbf24b48d8f  (/lib/x86_64-linux-gnu/libc.so.6+0x29d8f)
    #9 0x7fbf24b48e3f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x29e3f)
    #10 0x5565d801e5a4 in _start (/usr/local/bin/abacus+0x29c5a4)

0x604000114740 is located 8 bytes to the right of 40-byte region [0x604000114710,0x604000114738)
allocated by thread T0 here:
    #0 0x7fbf3fb97357 in operator new[](unsigned long) ../../../../src/libsanitizer/asan/asan_new_delete.cpp:102
    #1 0x5565d82b0533 in UnitCell::UnitCell() /__w/abacus-develop/abacus-develop/source/module_cell/unitcell.cpp:64
    #2 0x5565d800e405 in __static_initialization_and_destruction_0 /__w/abacus-develop/abacus-develop/source/module_hamilt_pw/hamilt_pwdft/global.cpp:11
    #3 0x5565d800e405 in _GLOBAL__sub_I__ZN7GlobalC6ppcellE /__w/abacus-develop/abacus-develop/source/module_hamilt_pw/hamilt_pwdft/global.cpp:15
    #4 0x7fbf24b48eba in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x29eba)

SUMMARY: AddressSanitizer: heap-buffer-overflow /usr/include/c++/11/bits/basic_string.h:921 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::size() const
Shadow bytes around the buggy address:
  0x0c088001a890: fa fa 00 00 00 00 00 04 fa fa 00 00 00 00 00 04
  0x0c088001a8a0: fa fa 00 00 00 00 00 04 fa fa 00 00 00 00 00 04
  0x0c088001a8b0: fa fa 00 00 00 00 00 00 fa fa 00 00 00 00 00 00
  0x0c088001a8c0: fa fa 00 00 00 00 00 00 fa fa 00 00 00 00 00 00
  0x0c088001a8d0: fa fa 00 00 00 00 00 00 fa fa 00 00 00 00 00 00
=>0x0c088001a8e0: fa fa 00 00 00 00 00 fa[fa]fa 00 00 00 00 00 fa
  0x0c088001a8f0: fa fa 00 00 00 00 00 fa fa fa 00 00 00 00 00 fa
  0x0c088001a900: fa fa 00 00 00 00 00 fa fa fa 00 00 00 00 00 00
  0x0c088001a910: fa fa 00 00 00 00 00 fa fa fa 00 00 00 00 00 00
  0x0c088001a920: fa fa fd fd fd fd fd fa fa fa 00 00 00 00 06 fa
  0x0c088001a930: fa fa fd fd fd fd fd fa fa fa 00 00 00 00 00 02
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==101435==ABORTING
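
The addresses in the report above can be decoded with a little arithmetic. The sketch below is an inference, not a confirmed diagnosis: it assumes x86_64 with the Itanium C++ ABI and the libstdc++ C++11 `std::string` layout (32 bytes per string, length field at offset 8, and an 8-byte element-count cookie in front of an array allocated with `new[]`). The source of `unitcell.cpp:64` is not shown here, so the array element type is assumed to be `std::string` only because the faulting frame is `std::string::size()`. All constant and function names are hypothetical.

```cpp
#include <cstdint>

// Addresses copied from the AddressSanitizer report.
constexpr std::uint64_t kRegionBegin = 0x604000114710ULL;
constexpr std::uint64_t kRegionEnd   = 0x604000114738ULL;  // one past the last byte
constexpr std::uint64_t kBadAddress  = 0x604000114740ULL;

// Assumptions (not stated in the report): x86_64, Itanium C++ ABI,
// libstdc++ with the C++11 std::string layout.
constexpr std::uint64_t kArrayCookie  = 8;   // element count stored by new[]
constexpr std::uint64_t kStringSize   = 32;  // sizeof(std::string)
constexpr std::uint64_t kLengthOffset = 8;   // offset of the length field

// Number of std::string elements that fit in the allocated region.
constexpr std::uint64_t element_count() {
    return (kRegionEnd - kRegionBegin - kArrayCookie) / kStringSize;
}

// Index of the array element that the faulting read falls into.
constexpr std::uint64_t accessed_index() {
    return (kBadAddress - kRegionBegin - kArrayCookie) / kStringSize;
}

// Byte offset of the faulting read inside that element.
constexpr std::uint64_t accessed_field_offset() {
    return (kBadAddress - kRegionBegin - kArrayCookie) % kStringSize;
}

// Distance past the end of the region, as printed by AddressSanitizer.
constexpr std::uint64_t bytes_past_end() {
    return kBadAddress - kRegionEnd;
}
```

Under these assumptions the 40-byte region holds exactly one string, and the 8-byte read hits the length field of element `[1]`. That would be consistent with `MD_func::dump_info` (md_func.cpp:385) streaming the second entry of a one-element string array allocated in the `UnitCell` constructor.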

Test case 601_NO_TDDFT_CO_occ

On process id asan.101552

=================================================================
==101552==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x604000114740 at pc 0x555f78191555 bp 0x7ffcef8b4540 sp 0x7ffcef8b4530
READ of size 8 at 0x604000114740 thread T0
    #0 0x555f78191554 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::size() const /usr/include/c++/11/bits/basic_string.h:921
    #1 0x555f78191554 in std::basic_ostream<char, std::char_traits<char> >& std::operator<< <char, std::char_traits<char>, std::allocator<char> >(std::basic_ostream<char, std::char_traits<char> >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /usr/include/c++/11/bits/basic_string.h:6536
    #2 0x555f78191554 in MD_func::dump_info(int const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, UnitCell const&, Parameter const&, ModuleBase::matrix const&, ModuleBase::Vector3<double> const*, ModuleBase::Vector3<double> const*) /__w/abacus-develop/abacus-develop/source/module_md/md_func.cpp:385
    #3 0x555f781afc9a in Run_MD::md_line(UnitCell&, ModuleESolver::ESolver*, Parameter const&) /__w/abacus-develop/abacus-develop/source/module_md/run_md.cpp:89
    #4 0x555f787694a7 in Driver::driver_run() /__w/abacus-develop/abacus-develop/source/driver_run.cpp:63
    #5 0x555f78762bbe in Driver::atomic_world() /__w/abacus-develop/abacus-develop/source/driver.cpp:186
    #6 0x555f78767f36 in Driver::init() /__w/abacus-develop/abacus-develop/source/driver.cpp:40
    #7 0x555f77d7831a in main /__w/abacus-develop/abacus-develop/source/main.cpp:42
    #8 0x7f1b63800d8f  (/lib/x86_64-linux-gnu/libc.so.6+0x29d8f)
    #9 0x7f1b63800e3f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x29e3f)
    #10 0x555f77daf5a4 in _start (/usr/local/bin/abacus+0x29c5a4)

0x604000114740 is located 8 bytes to the right of 40-byte region [0x604000114710,0x604000114738)
allocated by thread T0 here:
    #0 0x7f1b7e84b357 in operator new[](unsigned long) ../../../../src/libsanitizer/asan/asan_new_delete.cpp:102
    #1 0x555f78041533 in UnitCell::UnitCell() /__w/abacus-develop/abacus-develop/source/module_cell/unitcell.cpp:64
    #2 0x555f77d9f405 in __static_initialization_and_destruction_0 /__w/abacus-develop/abacus-develop/source/module_hamilt_pw/hamilt_pwdft/global.cpp:11
    #3 0x555f77d9f405 in _GLOBAL__sub_I__ZN7GlobalC6ppcellE /__w/abacus-develop/abacus-develop/source/module_hamilt_pw/hamilt_pwdft/global.cpp:15
    #4 0x7f1b63800eba in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x29eba)

SUMMARY: AddressSanitizer: heap-buffer-overflow /usr/include/c++/11/bits/basic_string.h:921 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::size() const
Shadow bytes around the buggy address:
  0x0c088001a890: fa fa 00 00 00 00 00 04 fa fa 00 00 00 00 00 04
  0x0c088001a8a0: fa fa 00 00 00 00 00 04 fa fa 00 00 00 00 00 04
  0x0c088001a8b0: fa fa 00 00 00 00 00 00 fa fa 00 00 00 00 00 00
  0x0c088001a8c0: fa fa 00 00 00 00 00 00 fa fa 00 00 00 00 00 00
  0x0c088001a8d0: fa fa 00 00 00 00 00 00 fa fa 00 00 00 00 00 00
=>0x0c088001a8e0: fa fa 00 00 00 00 00 fa[fa]fa 00 00 00 00 00 fa
  0x0c088001a8f0: fa fa 00 00 00 00 00 fa fa fa 00 00 00 00 00 fa
  0x0c088001a900: fa fa 00 00 00 00 00 fa fa fa 00 00 00 00 00 00
  0x0c088001a910: fa fa 00 00 00 00 00 fa fa fa 00 00 00 00 00 00
  0x0c088001a920: fa fa fd fd fd fd fd fa fa fa 00 00 00 00 06 fa
  0x0c088001a930: fa fa fd fd fd fd fd fa fa fa 00 00 00 00 00 02
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==101552==ABORTING
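
The shadow-byte dump can be cross-checked against the faulting address, which is the same in every report. The sketch below assumes AddressSanitizer's default shadow mapping on x86_64 Linux, `shadow = (addr >> 3) + 0x7fff8000`. The scale of 8 application bytes per shadow byte comes from the legend in the report; the offset is an assumption about the build. Function names are hypothetical.

```cpp
#include <cstdint>

// One shadow byte covers 8 application bytes (from the report's legend).
constexpr std::uint64_t kShadowScale = 3;
// Assumption: default ASan shadow offset on x86_64 Linux.
constexpr std::uint64_t kShadowOffset = 0x7fff8000ULL;

// Shadow-memory address that describes the given application address.
constexpr std::uint64_t shadow_address(std::uint64_t app) {
    return (app >> kShadowScale) + kShadowOffset;
}

// Start address of the 16-byte row the dump prints this shadow byte in.
constexpr std::uint64_t shadow_row(std::uint64_t app) {
    return shadow_address(app) & ~0xFULL;
}

// Zero-based column of the shadow byte within its row.
constexpr std::uint64_t shadow_column(std::uint64_t app) {
    return shadow_address(app) & 0xFULL;
}
```

With this mapping, `0x604000114740` lands on row `0x0c088001a8e0`, column 8, which is the bracketed `[fa]` in the dump. The region `[0x604000114710,0x604000114738)` covers columns 2 to 6 of the same row, the five `00` bytes, so the read falls in the heap redzone (`fa`) one shadow byte past the first byte after the allocation.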

Test case 601_NO_TDDFT_graphene_kpoint

On process id asan.101669

=================================================================
==101669==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x604000114740 at pc 0x557e16625555 bp 0x7ffdee1d9f20 sp 0x7ffdee1d9f10
READ of size 8 at 0x604000114740 thread T0
    #0 0x557e16625554 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::size() const /usr/include/c++/11/bits/basic_string.h:921
    #1 0x557e16625554 in std::basic_ostream<char, std::char_traits<char> >& std::operator<< <char, std::char_traits<char>, std::allocator<char> >(std::basic_ostream<char, std::char_traits<char> >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /usr/include/c++/11/bits/basic_string.h:6536
    #2 0x557e16625554 in MD_func::dump_info(int const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, UnitCell const&, Parameter const&, ModuleBase::matrix const&, ModuleBase::Vector3<double> const*, ModuleBase::Vector3<double> const*) /__w/abacus-develop/abacus-develop/source/module_md/md_func.cpp:385
    #3 0x557e16643c9a in Run_MD::md_line(UnitCell&, ModuleESolver::ESolver*, Parameter const&) /__w/abacus-develop/abacus-develop/source/module_md/run_md.cpp:89
    #4 0x557e16bfd4a7 in Driver::driver_run() /__w/abacus-develop/abacus-develop/source/driver_run.cpp:63
    #5 0x557e16bf6bbe in Driver::atomic_world() /__w/abacus-develop/abacus-develop/source/driver.cpp:186
    #6 0x557e16bfbf36 in Driver::init() /__w/abacus-develop/abacus-develop/source/driver.cpp:40
    #7 0x557e1620c31a in main /__w/abacus-develop/abacus-develop/source/main.cpp:42
    #8 0x7f1e23338d8f  (/lib/x86_64-linux-gnu/libc.so.6+0x29d8f)
    #9 0x7f1e23338e3f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x29e3f)
    #10 0x557e162435a4 in _start (/usr/local/bin/abacus+0x29c5a4)

0x604000114740 is located 8 bytes to the right of 40-byte region [0x604000114710,0x604000114738)
allocated by thread T0 here:
    #0 0x7f1e3e385357 in operator new[](unsigned long) ../../../../src/libsanitizer/asan/asan_new_delete.cpp:102
    #1 0x557e164d5533 in UnitCell::UnitCell() /__w/abacus-develop/abacus-develop/source/module_cell/unitcell.cpp:64
    #2 0x557e16233405 in __static_initialization_and_destruction_0 /__w/abacus-develop/abacus-develop/source/module_hamilt_pw/hamilt_pwdft/global.cpp:11
    #3 0x557e16233405 in _GLOBAL__sub_I__ZN7GlobalC6ppcellE /__w/abacus-develop/abacus-develop/source/module_hamilt_pw/hamilt_pwdft/global.cpp:15
    #4 0x7f1e23338eba in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x29eba)

SUMMARY: AddressSanitizer: heap-buffer-overflow /usr/include/c++/11/bits/basic_string.h:921 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::size() const
Shadow bytes around the buggy address:
  0x0c088001a890: fa fa 00 00 00 00 00 04 fa fa 00 00 00 00 00 04
  0x0c088001a8a0: fa fa 00 00 00 00 00 04 fa fa 00 00 00 00 00 04
  0x0c088001a8b0: fa fa 00 00 00 00 00 00 fa fa 00 00 00 00 00 00
  0x0c088001a8c0: fa fa 00 00 00 00 00 00 fa fa 00 00 00 00 00 00
  0x0c088001a8d0: fa fa 00 00 00 00 00 00 fa fa 00 00 00 00 00 00
=>0x0c088001a8e0: fa fa 00 00 00 00 00 fa[fa]fa 00 00 00 00 00 fa
  0x0c088001a8f0: fa fa 00 00 00 00 00 fa fa fa 00 00 00 00 00 fa
  0x0c088001a900: fa fa 00 00 00 00 00 fa fa fa 00 00 00 00 00 00
  0x0c088001a910: fa fa 00 00 00 00 00 fa fa fa 00 00 00 00 00 00
  0x0c088001a920: fa fa fd fd fd fd fd fa fa fa 00 00 00 00 06 fa
  0x0c088001a930: fa fa fd fd fd fd fd fa fa fa 00 00 00 00 00 02
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==101669==ABORTING

Expected behavior

No response

To Reproduce

No response

Environment

No response

Additional Context

No response

Task list for Issue attackers (only for developers)

YuLiu98 commented 1 month ago

I tested 601_NO_TDDFT_CO on my workstation, and it seems there is another problem, this one caused by hsolver.

@lyb9812 can you look into this issue?

AddressSanitizer:DEADLYSIGNAL
=================================================================
==2700559==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000010 (pc 0x7433824e138f bp 0x000000000001 sp 0x7ffea2907b00 T0)
==2700559==The signal is caused by a READ memory access.
==2700559==Hint: address points to the zero page.
    #0 0x7433824e138f in zdotc_ (/opt/intel/oneapi/mkl/2022.2.0/lib/intel64/libmkl_intel_lp64.so.2+0x2e138f)
    #1 0x5e0050c17066 in pzpotf2_ (/home/liuyu/github/abacus-develop/build/abacus+0x18c8066)
    #2 0x5e0050ba798c in pzpotrf_ (/home/liuyu/github/abacus-develop/build/abacus+0x185898c)
    #3 0x5e0050ba322d in pzhegvx_ (/home/liuyu/github/abacus-develop/build/abacus+0x185422d)
    #4 0x5e004ffc9bd2 in hsolver::DiagoScalapack<std::complex<double> >::pzhegvx_once(int const*, int, int, std::complex<double> const*, std::complex<double> const*, double*, psi::Psi<std::complex<double>, base_device::DEVICE_CPU>&) const /home/liuyu/github/abacus-develop/source/module_hsolver/diago_scalapack.cpp:285
    #5 0x5e004ffccb0d in hsolver::DiagoScalapack<std::complex<double> >::pzhegvx_diag(int const*, int, int, std::complex<double> const*, std::complex<double> const*, double*, psi::Psi<std::complex<double>, base_device::DEVICE_CPU>&) /home/liuyu/github/abacus-develop/source/module_hsolver/diago_scalapack.cpp:371
    #6 0x5e004ffb45ba in hsolver::DiagoScalapack<std::complex<double> >::diag(hamilt::Hamilt<std::complex<double>, base_device::DEVICE_CPU>*, psi::Psi<std::complex<double>, base_device::DEVICE_CPU>&, double*) /home/liuyu/github/abacus-develop/source/module_hsolver/diago_scalapack.cpp:44
    #7 0x5e004ffae0e9 in hsolver::HSolverLCAO<std::complex<double>, base_device::DEVICE_CPU>::hamiltSolvePsiK(hamilt::Hamilt<std::complex<double>, base_device::DEVICE_CPU>*, psi::Psi<std::complex<double>, base_device::DEVICE_CPU>&, double*) /home/liuyu/github/abacus-develop/source/module_hsolver/hsolver_lcao.cpp:135
    #8 0x5e004ffb1ea6 in hsolver::HSolverLCAO<std::complex<double>, base_device::DEVICE_CPU>::solve(hamilt::Hamilt<std::complex<double>, base_device::DEVICE_CPU>*, psi::Psi<std::complex<double>, base_device::DEVICE_CPU>&, elecstate::ElecState*, bool) /home/liuyu/github/abacus-develop/source/module_hsolver/hsolver_lcao.cpp:107
    #9 0x5e00503fdcf7 in ModuleESolver::ESolver_KS_LCAO_TDDFT::hamilt2density(int, int, double) /home/liuyu/github/abacus-develop/source/module_esolver/esolver_ks_lcao_tddft.cpp:172
    #10 0x5e00502ee7ff in ModuleESolver::ESolver_KS<std::complex<double>, base_device::DEVICE_CPU>::runner(int, UnitCell&) /home/liuyu/github/abacus-develop/source/module_esolver/esolver_ks.cpp:474
    #11 0x5e004f8cada4 in MD_func::force_virial(ModuleESolver::ESolver*, int const&, UnitCell&, double&, ModuleBase::Vector3<double>*, bool const&, ModuleBase::matrix&) /home/liuyu/github/abacus-develop/source/module_md/md_func.cpp:258
    #12 0x5e004f8c3d06 in MD_base::setup(ModuleESolver::ESolver*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /home/liuyu/github/abacus-develop/source/module_md/md_base.cpp:65
    #13 0x5e004f8f5752 in Verlet::setup(ModuleESolver::ESolver*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /home/liuyu/github/abacus-develop/source/module_md/verlet.cpp:20
    #14 0x5e004f8f01d0 in Run_MD::md_line(UnitCell&, ModuleESolver::ESolver*, Parameter const&) /home/liuyu/github/abacus-develop/source/module_md/run_md.cpp:54
    #15 0x5e004fe95d05 in Driver::driver_run() /home/liuyu/github/abacus-develop/source/driver_run.cpp:63
    #16 0x5e004fe90030 in Driver::atomic_world() /home/liuyu/github/abacus-develop/source/driver.cpp:186
    #17 0x5e004fe94998 in Driver::init() /home/liuyu/github/abacus-develop/source/driver.cpp:40
    #18 0x5e004f5d7be9 in main /home/liuyu/github/abacus-develop/source/main.cpp:42
    #19 0x743379629d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #20 0x743379629e3f in __libc_start_main_impl ../csu/libc-start.c:392
    #21 0x5e004f60b754 in _start (/home/liuyu/github/abacus-develop/build/abacus+0x2bc754)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/opt/intel/oneapi/mkl/2022.2.0/lib/intel64/libmkl_intel_lp64.so.2+0x2e138f) in zdotc_
==2700559==ABORTING

Commit 15449beee4e78a44abf02b09b17344acf2994cb1 (PR #3681) is ok.

Commit bfe2925877609f6c6e2c8f0b4c4020833ae8a9ba caused this bug (PR #3623).

YuLiu98 commented 1 month ago

I cannot reproduce this issue on my workstation. @kirk0830, could you please check which PR introduced this bug?

kirk0830 commented 1 month ago

@YuLiu98 I will check it asap