SALMON-TDDFT / SALMON2

SALMON 2.0.0 Development Repository
https://salmon-tddft.jp/
Apache License 2.0

Tests 101-131 failed with `--disable-mpi` #421

Closed yhirokawa-ccs closed 4 years ago

yhirokawa-ccs commented 4 years ago

I found that many test cases crash when running an OpenMP-only build (configured with `--disable-mpi`).

Test environment: Intel compiler 2019u5 on Xeon Gold 6242.

$ CFLAGS="-traceback" FFLAGS="-fpe0 -fpe-all=0 -check all,noarg_temp_created -traceback" ~/salmon2/configure.py --arch=intel-avx512 --debug --disable-mpi && make -j8

1. testcase 101

   ```
   11: ============init_ps==============
   11: forrtl: severe (174): SIGSEGV, segmentation fault occurred
   11: Image              PC                Routine            Line     Source
   11: salmon.cpu         000000000123E833  Unknown            Unknown  Unknown
   11: libpthread-2.17.s  00007F80499865F0  Unknown            Unknown  Unknown
   11: salmon.cpu         0000000000D88C6F  prep_pp_sub_mp_ca  907      prep_pp.f90
   11: salmon.cpu         0000000000D5BDEF  prep_pp_sub_mp_in  112      prep_pp.f90
   11: salmon.cpu         0000000000952DB4  maindft            130      main_dft.f90
   11: salmon.cpu         0000000000407DF3  MAIN__             37       main.f90
   11: salmon.cpu         0000000000407BE2  Unknown            Unknown  Unknown
   11: libc-2.17.so       00007F80492C9505  __libc_start_main  Unknown  Unknown
   11: salmon.cpu         0000000000407AE9  Unknown            Unknown  Unknown
   ```

2. testcase 104, 105, 116, 117, 121, 131

   ```
   20:   Libxc: [disabled]
   20:  inumcpu_check error!
   20:  number of cpu is not correct!
   ```

3. testcase 122

   ```
   50:  ============init_ps==============
   50:  restart/Si_gs.bin
   50:
   50: forrtl: No such file or directory
   50: forrtl: severe (29): file not found, unit 96, file /fs01/homes/hirokawa/salmon2-build/single/testsuites/122_bulk_Si_rt_response_temperature_dp/restart/Si_gs.bin
   ```

4. testcase 111, 112, 113 (known error)

   ```
   27: ############################################################
   27: # Verification start
   27: # Checking the existance of outputfile
   27: # Checking calculated result
   27: Result eigen energy for io=1, ik=2 = -1.708282e-01 (Reference = -1.716932e-01)
   27: Mismatch |-1.708282e-01 - -1.716932e-01| > 4.000000e-05)
   3/3 Test #27: verify_111_bulk_Si_gs_dp .........***Failed 0.03 sec
   ...
   30: ############################################################
   30: # Verification start
   30: # Checking the existance of outputfile
   30: # Checking calculated result
   30: -0.00030664957
   30: Result Current = -3.066496e-04 (Reference = -2.970908e-04)
   30: Mismatch |-3.066496e-04 - -2.970908e-04| > 1.000000e-08)
   3/3 Test #30: verify_112_bulk_Si_rt_response_dp ...***Failed 0.01 sec
   ...
   33: ############################################################
   33: # Verification start
   33: # Checking the existance of outputfile
   33: # Checking calculated result
   33: 0.00064904193
   33: Result Current = 6.490419e-04 (Reference = 6.571660e-04)
   33: Mismatch |6.490419e-04 - 6.571660e-04| > 1.000000e-08)
   3/3 Test #33: verify_113_bulk_Si_rt_pulse_dp ...***Failed 0.03 sec
   ```
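The `Mismatch |result - reference| > tolerance` lines in the log for tests 111-113 suggest an absolute-tolerance comparison. The following is only a minimal Python sketch of such a check, not the verification script from the SALMON test suite; the function name `within_tolerance` and the `cases` list are made up for illustration, and the numbers are copied from the log.

```python
def within_tolerance(result: float, reference: float, tolerance: float) -> bool:
    """Return True when |result - reference| does not exceed the tolerance."""
    return abs(result - reference) <= tolerance


# (test name, result, reference, tolerance) as printed in the failing log lines.
cases = [
    ("verify_111_bulk_Si_gs_dp", -1.708282e-01, -1.716932e-01, 4.0e-05),
    ("verify_112_bulk_Si_rt_response_dp", -3.066496e-04, -2.970908e-04, 1.0e-08),
    ("verify_113_bulk_Si_rt_pulse_dp", 6.490419e-04, 6.571660e-04, 1.0e-08),
]

for name, result, reference, tolerance in cases:
    diff = abs(result - reference)
    status = "ok" if within_tolerance(result, reference, tolerance) else "Failed"
    print(f"{name}: |diff| = {diff:.6e} (tolerance {tolerance:.1e}) -> {status}")
```

With these values all three differences exceed their tolerances, which matches the `***Failed` status reported by the test runner.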

yhirokawa-ccs commented 4 years ago

The error in test 101 was my mistake, caused by my system environment.

yhirokawa-ccs commented 4 years ago

Tests 111-113 were fixed by PR #440.

yhirokawa-ccs commented 4 years ago

Tests 104, 105, 116, 117, 121, 122, and 131 require MPI parallelization, so we skip them when running the OpenMP-only build (PR #426).

All problems are fixed, so I am closing this issue.
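The skip rule described for PR #426 can be pictured as a filter over test numbers. The actual change in that PR is not shown in this thread, so the Python below is just a hypothetical sketch: `MPI_ONLY_TESTS`, `select_tests`, and `reported_tests` are invented names, and only the test numbers come from this issue.

```python
# Test numbers reported in this issue as requiring MPI parallelization.
MPI_ONLY_TESTS = frozenset({104, 105, 116, 117, 121, 122, 131})


def select_tests(test_ids, mpi_enabled):
    """Split test IDs into (to_run, skipped) for the given build type."""
    to_run, skipped = [], []
    for test_id in test_ids:
        # MPI-only tests are skipped when the build was configured without MPI.
        if not mpi_enabled and test_id in MPI_ONLY_TESTS:
            skipped.append(test_id)
        else:
            to_run.append(test_id)
    return to_run, skipped


# Only the test numbers mentioned in this issue; the full suite may contain more.
reported_tests = [101, 104, 105, 111, 112, 113, 116, 117, 121, 122, 131]

to_run, skipped = select_tests(reported_tests, mpi_enabled=False)
print("run:", to_run)
print("skip:", skipped)
```

Keeping the MPI-only list in one place means a build without MPI skips those tests and reports them as skipped, rather than running them and failing.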