geodynamics / sw4

SW4 (Seismic Waves, 4th order) implements substantial capabilities for 3-D seismic modeling, with a free surface condition on the top boundary, absorbing super-grid conditions on the far-field boundaries, and an arbitrary number of point force and/or point moment tensor source terms.

Failure running ./test_sw4.py at test 10 #211

Open batkillerz opened 7 months ago

batkillerz commented 7 months ago

Hi

I managed to install sw4 successfully using make; the build finished by printing its ASCII-art banner.

But when I run ./test_sw4.py -u 0 -d debug_mp/ -v, it fails at test 10, as follows:

    Test # 9 Input file: tw-att-2.in PASSED
    Starting test # 10 in directory: attenuation with input file: tw-topo-att-1.in
    Running sw4 from directory: /home/batkillerz/sw4_base/src/sw4-3.0/pytest/attenuation
    run_cmd= ['mpirun', '-np', '1', '/home/batkillerz/sw4_base/src/sw4-3.0/debug_mp//sw4', '/home/batkillerz/sw4_base/src/sw4-3.0/pytest/reference/attenuation/tw-topo-att-1.in']
    ERROR: Test tw-topo-att-1.in : sw4 returned non-zero exit status= 1 aborting test
    run_cmd= ['mpirun', '-np', '1', '/home/batkillerz/sw4_base/src/sw4-3.0/debug_mp//sw4', '/home/batkillerz/sw4_base/src/sw4-3.0/pytest/reference/attenuation/tw-topo-att-1.in']
    DID YOU USE THE CORRECT SW4 EXECUTABLE? (SPECIFY DIRECTORY WITH -d OPTION)
    test_sw4 was unsuccessful
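For anyone trying to reproduce this outside the test harness, the `run_cmd` list printed above can be rebuilt and run on its own while keeping the full output in a file. This is only a minimal sketch: the helper names `build_run_cmd` and `run_and_capture` and the log file name are hypothetical and are not part of test_sw4.py.

```python
import subprocess


def build_run_cmd(sw4_dir, input_file, nprocs=1, launcher="mpirun"):
    """Assemble a command list shaped like the run_cmd shown in the test log.

    Hypothetical helper for illustration; not taken from test_sw4.py.
    """
    exe = sw4_dir.rstrip("/") + "/sw4"
    return [launcher, "-np", str(nprocs), exe, input_file]


def run_and_capture(cmd, log_path="sw4_run.log"):
    """Run cmd, save combined stdout/stderr to log_path, return the exit status."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # interleave stderr so crash messages stay in order
        text=True,
    )
    with open(log_path, "w") as fh:
        fh.write(result.stdout)
    return result.returncode


# Example call, using the paths from the log above:
# cmd = build_run_cmd(
#     "/home/batkillerz/sw4_base/src/sw4-3.0/debug_mp/",
#     "/home/batkillerz/sw4_base/src/sw4-3.0/pytest/reference/attenuation/tw-topo-att-1.in",
#     nprocs=1,
# )
# print("exit status:", run_and_capture(cmd))
```

Running with the same `-np 1` that the harness uses keeps the reproduction as close as possible to the failing test.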

When I run the last command by hand (with -np 4 instead of -np 1), "mpirun -np 4 /home/batkillerz/sw4_base/src/sw4-3.0/debug_mp/sw4 /home/batkillerz/sw4_base/src/sw4-3.0/pytest/reference/attenuation/tw-topo-att-1.in", I get the following output:

        sw4 version 3.0

This program comes with ABSOLUTELY NO WARRANTY; released under GPL. This is free software, and you are welcome to redistribute it under certain conditions, see LICENSE.txt for more details

    Compiled on: Fri Apr 19 04:49:41 PM +08 2024
    By user:     batkillerz
    Machine:     homelab
    Compiler:    /storage/software/openmpi/5.0.1/bin/mpicxx
    3rd party include dir: /include, and library dir: /lib

    Input file: /home/batkillerz/sw4_base/src/sw4-3.0/pytest/reference/attenuation/tw-topo-att-1.in
    Default Supergrid thickness has been tuned; # grid points = 1 grid sizes
    Default Supergrid damping coefficient has been tuned; damping coefficient = 0.00000000e+00

    Rank=0, Grid #1 (curvilinear), iInterior=[1,26], jInterior=[1,26]
    Rank=0, Grid #0 (Cartesian), iInterior=[1,26], jInterior=[1,26], kInterior=[1,27]
    inside allocateCurvilinearArrays

***Topography grid: min z = -5.967331e-01, max z = -2.932192e-58, top Cartesian z = 3.000000e+00

    Global grid sizes (without ghost points)
    Grid         h        Nx        Ny        Nz       Points   Type
       0    0.1256        51        51        27        70227   Cartesian
       1    0.1256        51        51        27        70227   Curvilinear
    Total number of grid points (without ghost points): 140454

    Default Supergrid damping coefficient has been tuned; damping coefficient = 0.00000000e+00
    Default Supergrid thickness has been tuned; # grid points = 1 grid sizes

    Execution time, reading input file 6.77432830e-02 seconds
    Assuming a SERIAL file system.
    Detected at least one boundary with supergrid conditions

Making Directory: tw-topo-att-1/

... Done!

    Geographic and Cartesian coordinates of the corners of the computational grid:
    0: Lon= -1.180000e+02, Lat=3.700000e+01, x=0.000000e+00, y=0.000000e+00
    1: Lon= -1.180000e+02, Lat=3.700006e+01, x=6.280000e+00, y=0.000000e+00
    2: Lon= -1.179999e+02, Lat=3.700006e+01, x=6.280000e+00, y=6.280000e+00
    3: Lon= -1.179999e+02, Lat=3.700000e+01, x=0.000000e+00, y=6.280000e+00


ASSIGNING TWILIGHT MATERIALS


   ----------- Material properties ranges ---------------
   1.00118341e+00 kg/m^3 <=  Density <= 2.99885859e+00 kg/m^3
   1.63353903e+00 m/s    <=  Vp      <= 2.82632270e+00 m/s
   1.00033876e+00 m/s    <=  Vs      <= 1.73075388e+00 m/s
   1.52767088e+00        <=  Vp/Vs   <= 1.73199227e+00
   2.00118341e+00 Pa     <=  mu      <= 3.99885859e+00 Pa
   1.00157479e+00 Pa     <=  lambda  <= 2.99848185e+00 Pa
   ------------------------------------------------------

    * PPW = minVs/h/maxFrequency ****
    g=0, h=1.256000e-01, minVs/h=7.96448 (Cartesian)
    g=1, h=1.256000e-01, minVs/h=7.96466 (curvilinear)
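As a sanity check on the numbers above, the relation PPW = minVs/h/maxFrequency can be evaluated directly from the reported values. The sketch below takes the lower Vs bound from the material-properties table and the grid size h from the log; the function names are hypothetical, and the 1.0 Hz value passed as `max_frequency` is purely illustrative because the log does not report a source frequency.

```python
def min_vs_over_h(min_vs, h):
    """Ratio minVs/h, as printed by sw4 for each grid."""
    return min_vs / h


def points_per_wavelength(min_vs, h, max_frequency):
    """PPW = minVs / h / maxFrequency, following the formula in the log line."""
    return min_vs / h / max_frequency


MIN_VS = 1.00033876  # m/s, lower bound of Vs in the material-properties table
H = 0.1256           # grid size h reported for both grids

ratio = min_vs_over_h(MIN_VS, H)
print(f"minVs/h = {ratio:.5f}")

# max_frequency = 1.0 Hz is an illustrative assumption, not a value from the log.
print(f"PPW at 1.0 Hz = {points_per_wavelength(MIN_VS, H, 1.0):.2f}")
```

The first value agrees with the 7.96448 reported for the Cartesian grid. The curvilinear grid reports a slightly different ratio (7.96466), presumably because the minimum Vs sampled on that grid differs.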

*** Attenuation parameters calculated for 1 mechanisms, max freq=2.000000e+00 [Hz], min_freq=2.000000e-02 [Hz], velo_freq=1.000000e+00 [Hz]

    Assigned material properties
    computing the time step
    [homelab:576512] Process received signal
    [homelab:576512] Signal: Segmentation fault (11)
    [homelab:576512] Signal code: Address not mapped (1)
    [homelab:576512] Failing at address: 0x7ffc1189c000
    [homelab:576511] Process received signal
    [homelab:576511] Signal: Segmentation fault (11)
    [homelab:576511] Signal code: Address not mapped (1)
    [homelab:576511] Failing at address: 0x7ffedd464000
    Message from routine DSPEV in library SLATEC.
    Potentially recoverable error, Prog aborted, Traceback requested

[homelab:576512] [homelab:576511] [ 8] /lib64/libc.so.6(+0x3feb0)[0x7f99d543feb0]

prterun has exited due to process rank 3 with PID 576513 on node homelab exiting improperly. There are three reasons this could occur:

  1. this process did not call "init" before exiting, but others in the job did. This can cause a job to hang indefinitely while it waits for all processes to call "init". By rule, if one process calls "init", then ALL processes must call "init" prior to termination.

  2. this process called "init", but exited without calling "finalize". By rule, all processes that call "init" MUST call "finalize" prior to exiting or it will be considered an "abnormal termination"

  3. this process called "MPI_Abort" or "prte_abort" and the mca parameter prte_create_session_dirs is set to false. In this case, the run-time cannot detect that the abort call was an abnormal termination. Hence, the only error message you will receive is this one.

This may have caused other processes in the application to be terminated by signals sent by prterun (as reported here).

You can avoid this message by specifying -quiet on the prterun command line.
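When several ranks crash at once, the interleaved signal messages are easier to read once grouped by process. The following is a rough sketch that pulls the PID, signal, and failing address out of the Open MPI lines quoted above; the regular expressions are written against the message layout seen in this log only, and the function name `summarize_crashes` is hypothetical.

```python
import re

# Matches "[host:pid] Signal: <name> (<number>)" but not "Signal code: ...".
SIGNAL_RE = re.compile(
    r"\[(?P<host>[^\]:]+):(?P<pid>\d+)\]\s+Signal:\s+(?P<name>[^\[(]+?)\s*\((?P<num>\d+)\)"
)
# Matches "[host:pid] Failing at address: 0x...".
ADDRESS_RE = re.compile(
    r"\[(?P<host>[^\]:]+):(?P<pid>\d+)\]\s+Failing at address:\s+(?P<addr>0x[0-9a-fA-F]+)"
)


def summarize_crashes(log_text):
    """Return {pid: {"signal": ..., "address": ...}} for each crashing process."""
    crashes = {}
    for m in SIGNAL_RE.finditer(log_text):
        entry = crashes.setdefault(int(m.group("pid")), {})
        entry["signal"] = f"{m.group('name').strip()} ({m.group('num')})"
    for m in ADDRESS_RE.finditer(log_text):
        entry = crashes.setdefault(int(m.group("pid")), {})
        entry["address"] = m.group("addr")
    return crashes


# Excerpt copied from the output above.
LOG_EXCERPT = (
    "[homelab:576512] Process received signal "
    "[homelab:576512] Signal: Segmentation fault (11) "
    "[homelab:576512] Signal code: Address not mapped (1) "
    "[homelab:576512] Failing at address: 0x7ffc1189c000 "
    "[homelab:576511] Process received signal "
    "[homelab:576511] Signal: Segmentation fault (11) "
    "[homelab:576511] Signal code: Address not mapped (1) "
    "[homelab:576511] Failing at address: 0x7ffedd464000"
)

for pid, info in sorted(summarize_crashes(LOG_EXCERPT).items()):
    print(pid, info["signal"], info["address"])
```

On the excerpt, this reports segmentation faults for PIDs 576511 and 576512, while the prterun message names PID 576513 (rank 3), so the pasted log may not include the signal lines of every rank.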

Any idea how I can fix this? Thanks!