t-sakashita / rokko

Integrated Interface for libraries of eigenvalue decomposition
Boost Software License 1.0

pdsyevx fails with a runtime error #404

Open t-sakashita opened 4 years ago

t-sakashita commented 4 years ago

Environment: Mac with OpenMPI, clang++, and gfortran.

mpirun -np 1 --oversubscribe xterm -e lldb -o run ./minij_mpi scalapack:pdsyevx

Debugger output:

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
  * frame #0: 0x00007fff52227b66 libsystem_kernel.dylib`__pthread_kill + 10
    frame #1: 0x00007fff523f2080 libsystem_pthread.dylib`pthread_kill + 333
    frame #2: 0x00007fff5218324d libsystem_c.dylib`__abort + 144
    frame #3: 0x00007fff52183af8 libsystem_c.dylib`__stack_chk_fail + 205
    frame #4: 0x00000001001bc99a librokko.dylib`cscalapack_pdsyevx(jobz='V', range='A', uplo='U', n=10, A=0x000000011142b9f0, ia=0, ja=0, descA=0x0000000111429380, vl=0, vu=0, il=0, iu=0, abstol=2.2250738585072014E-308, m=0x00007ffeefbfda14, nZ=0x00007ffeefbfda10, w=0x00000001114293b0, orfac=-1, Z=0x000000011142c030, iz=0, jz=0, descZ=0x0000000111429380, ifail=0x000000011121f180, iclustr=0x000000011121f5c0, gap=0x000000011121f5d0) at pdsyevx.c:0
    frame #5: 0x00000001001c508a librokko.dylib`rokko::parameters rokko::scalapack::diagonalize_pdsyevx<rokko::matrix_col_major, Eigen::Matrix<double, -1, 1, 0, -1, 1> >(mat=0x00007ffeefbfe888, eigvals=0x00007ffeefbfe860, eigvecs=0x00007ffeefbfe7b8, params=0x00007ffeefbfe7a0) at diagonalize_pdsyevx.hpp:45
    frame #6: 0x00000001001c3b1a librokko.dylib`rokko::parameters rokko::scalapack::solver::diagonalize<rokko::matrix_col_major, Eigen::Matrix<double, -1, 1, 0, -1, 1> >(this=0x000000011112a548, mat=0x00007ffeefbfe888, eigvals=0x00007ffeefbfe860, eigvecs=0x00007ffeefbfe7b8, params=0x00007ffeefbfe7a0) at core.hpp:71
    frame #7: 0x00000001001be5fb librokko.dylib`rokko::detail::pd_ev_wrapper<rokko::scalapack::solver>::diagonalize(this=0x000000011112a540, mat=0x00007ffeefbfe888, eigvals=0x00007ffeefbfe860, eigvecs=0x00007ffeefbfe7b8, params=0x00007ffeefbfe7a0) at parallel_dense_ev.hpp:88
    frame #8: 0x000000010000f9ab minij_mpi`rokko::parameters rokko::parallel_dense_ev::diagonalize<rokko::matrix_col_major, Eigen::Matrix<double, -1, 1, 0, -1, 1> >(this=0x00007ffeefbfe9d0, mat=0x00007ffeefbfe888, eigvals=0x00007ffeefbfe860, eigvecs=0x00007ffeefbfe7b8, params=0x00007ffeefbfe7a0) at parallel_dense_ev.hpp:159
    frame #9: 0x000000010000ea6d minij_mpi`main(argc=2, argv=0x00007ffeefbfecf8) at minij_mpi.cpp:62
    frame #10: 0x00007fff520d7015 libdyld.dylib`start + 1
    frame #11: 0x00007fff520d7015 libdyld.dylib`start + 1
t-sakashita commented 4 years ago

Why is the process aborted at line 0 of cscalapack_pdsyevx?

t-sakashita commented 4 years ago

example/scalapack/pdsyevx_f.f90 segfaults:

(lldb) run
 n =           8
 nprocs =           1
 nprow =           1
 npcol =           1
 eigenvalues:  0.25873593027213443       0.28752008977568855       0.34584404432670607       0.45776296243322334       0.68838568483467788        1.2582878272128752        3.3381655667727590        29.365297894371942     
Process 56298 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x30)
    frame #0: 0x0000000109768399 libmpi.40.dylib`coll_base_module_destruct + 57
libmpi.40.dylib`coll_base_module_destruct:
->  0x109768399 <+57>: movq   0x30(%rax), %rbx
    0x10976839d <+61>: movq   (%rbx), %rax
    0x1097683a0 <+64>: testq  %rax, %rax
    0x1097683a3 <+67>: je     0x1097683c1               ; <+97>
Target 0: (pdsyevx_f) stopped.

Process 56298 launched: './pdsyevx_f' (x86_64)
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x30)
  * frame #0: 0x0000000109768399 libmpi.40.dylib`coll_base_module_destruct + 57
    frame #1: 0x00000001097673f4 libmpi.40.dylib`mca_coll_base_comm_unselect + 10703
    frame #2: 0x00000001096d7bd7 libmpi.40.dylib`ompi_comm_destruct + 32
    frame #3: 0x00000001096d986b libmpi.40.dylib`ompi_comm_free + 469
    frame #4: 0x0000000109705209 libmpi.40.dylib`MPI_Comm_free + 153
    frame #5: 0x0000000100a0e3f3 libscalapack.dylib`blacs_gridexit_(ConTxt=0x00007ffeefbfeb2c) at blacs_grid_.c:26
    frame #6: 0x000000010000383a pdsyevx_f`MAIN__ at pdsyevx_f.f90:77
    frame #7: 0x0000000100003889 pdsyevx_f`main at pdsyevx_f.f90:2
    frame #8: 0x00007fff501a0015 libdyld.dylib`start + 1
    frame #9: 0x00007fff501a0015 libdyld.dylib`start + 1
t-sakashita commented 4 years ago

The error already occurs at the level of example/scalapack/pdsyevx.c, which does not use C++.

t-sakashita commented 4 years ago

If I comment out the call to cscalapack_pdsyevx_work, the error does not occur. So the cause appears to be inside cscalapack_pdsyevx_work.
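For reference, `__stack_chk_fail` in the first backtrace is the stack protector reporting that a stack canary was overwritten, which usually means something wrote past the end of a local variable or array. The body of cscalapack_pdsyevx_work is not shown in this thread, so the following is only a sketch of one pattern that avoids that class of bug: query the workspace sizes first (the LAPACK/ScaLAPACK convention of passing `lwork = -1`), then allocate the buffers on the heap with the reported sizes. `mock_solver`, `solve_with_query`, and the size formulas are hypothetical stand-ins for illustration, not the ScaLAPACK or rokko API, and this is not a confirmed diagnosis.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a Fortran-style solver such as pdsyevx.
 * NOT the real ScaLAPACK interface; the size formulas are made up.
 * With lwork == -1 or liwork == -1 it only reports the required
 * workspace sizes in work[0] and iwork[0] and returns. */
static void mock_solver(int n, double *w, double *work, int lwork,
                        int *iwork, int liwork, int *info)
{
    const int need_work = 5 * n + 8;   /* illustrative size only */
    const int need_iwork = 6 * n + 1;  /* illustrative size only */

    if (lwork == -1 || liwork == -1) {
        work[0] = (double)need_work;
        iwork[0] = need_iwork;
        *info = 0;
        return;
    }
    if (lwork < need_work || liwork < need_iwork) {
        *info = -1;                    /* workspace too small */
        return;
    }
    /* Dummy "computation" that actually uses the workspace. */
    for (int i = 0; i < n; ++i) {
        work[i] = (double)i;
        w[i] = work[i] + 1.0;
    }
    *info = 0;
}

/* Two-phase wrapper: query the sizes, allocate on the heap, then solve.
 * Returns 0 on success, a nonzero code otherwise. */
int solve_with_query(int n, double *w)
{
    int info = 0;
    double work_query = 0.0;  /* receives the required lwork */
    int iwork_query = 0;      /* receives the required liwork */

    /* Phase 1: workspace query; only one element of each buffer is touched. */
    mock_solver(n, w, &work_query, -1, &iwork_query, -1, &info);
    if (info != 0) return info;

    const int lwork = (int)work_query;
    const int liwork = iwork_query;
    assert(lwork >= 1 && liwork >= 1);

    /* Phase 2: heap buffers sized from the query, not fixed-size stack arrays. */
    double *work = malloc((size_t)lwork * sizeof *work);
    int *iwork = malloc((size_t)liwork * sizeof *iwork);
    if (work == NULL || iwork == NULL) {
        free(work);
        free(iwork);
        return -100;                   /* allocation failure */
    }

    mock_solver(n, w, work, lwork, iwork, liwork, &info);

    free(work);
    free(iwork);
    return info;
}
```

If the real wrapper passes a single scalar or a fixed-size local array where the routine expects a full workspace, the routine can overwrite neighbouring stack memory, which would be consistent with an abort in the function epilogue. Comparing the buffer sizes used in cscalapack_pdsyevx_work against the sizes reported by a workspace query would be one way to check this.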