MPAS-Dev / MPAS

Repository for private MPAS development prior to the MPAS v6.0 release.

Test if pointer comm list is associated #1464

Closed mark-petersen closed 6 years ago

mark-petersen commented 6 years ago

Prevent use of commListPtr when it is not associated. This bug was introduced in #1459. It causes an error in MPAS-Ocean only in certain configurations, which is why previous testing did not catch it. It affects MPAS-Ocean init mode and some E3SM configurations. The fix in this merge only impacts grouped halo exchanges. In testing, the error occurred when exchangeGroup % sendList was not associated on the very first halo exchange after a file was read in, and only in certain cases.
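
For readers unfamiliar with the pattern, the kind of guard described above can be sketched roughly as follows. This is a minimal standalone illustration, not the MPAS source: the type names `comm_list_t` and `exchange_group_t`, the module name, and the routine `count_send_entries` are hypothetical, and only the names `commListPtr` and `sendList` are taken from this discussion. The point is that the pointer is tested with `associated()` before any component is dereferenced.

```fortran
! Minimal sketch, NOT the MPAS source. The type, module, and routine names
! (comm_list_t, exchange_group_t, halo_guard_sketch, count_send_entries)
! are hypothetical and exist only for this illustration.
module halo_guard_sketch
   implicit none

   ! A singly linked list node standing in for a communication list entry.
   type :: comm_list_t
      integer :: procID = -1
      type(comm_list_t), pointer :: next => null()
   end type comm_list_t

   ! A group whose send list may legitimately be unassociated.
   type :: exchange_group_t
      type(comm_list_t), pointer :: sendList => null()
   end type exchange_group_t

contains

   ! Count the entries in a group's send list. When sendList is not
   ! associated the loop body never runs, so nothing is dereferenced.
   function count_send_entries(group) result(n)
      type(exchange_group_t), intent(in) :: group
      integer :: n
      type(comm_list_t), pointer :: commListPtr

      n = 0
      commListPtr => group % sendList
      do while (associated(commListPtr))
         n = n + 1
         commListPtr => commListPtr % next
      end do
   end function count_send_entries

end module halo_guard_sketch

program demo
   use halo_guard_sketch
   implicit none
   type(exchange_group_t) :: group

   ! sendList starts out unassociated, mimicking the failing case.
   print *, 'entries before allocation:', count_send_entries(group)

   allocate(group % sendList)
   group % sendList % procID = 0
   print *, 'entries after allocation: ', count_send_entries(group)

   deallocate(group % sendList)
end program demo
```

One design note: Fortran does not guarantee short-circuit evaluation, so a combined test such as `associated(p) .and. p % procID == 0` may still dereference `p`. Putting the `associated()` test in its own `if` or loop condition, as in the sketch, avoids that.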

mark-petersen commented 6 years ago

@mgduda can I merge this in? It corrects an error made last week in #1459, and is holding other things up for us.

mgduda commented 6 years ago

@mark-petersen I'm testing now in a branch of atmosphere/develop that uses grouped halo exchanges, and I'll report back before lunch.

mark-petersen commented 6 years ago

To understand why exchangeGroup % sendList was not associated, I reran my failing test (the ocean model in init mode) with an Intel debug build and with stream debug output enabled:

diff --git a/src/framework/mpas_stream_manager.F b/src/framework/mpas_stream_manager.F
-#define STREAM_DEBUG_WRITE(M) ! call mpas_log_write(M)
+#define STREAM_DEBUG_WRITE(M) call mpas_log_write( M )

During initialization the model reads the files for all input streams. After it reads in a variable, it conducts a halo exchange. The error occurs on the first file and first variable. The traceback is below. Sorry, I could not figure out why this differs from our forward mode configuration, which does not cause an error. I spent 30 minutes on it and decided to stop, because this bug fix prevents the problem and restores the behavior we had before #1459.

wf106:init_step2$ mpirun  -n 1 /usr/projects/climate/mpeterse/repos/MPAS/ocean_develop/ocean_model
Reported: 1 (out of 1) daemons - 1 (out of 1) procs
 Note: MPAS has requested an MPI threading level of MPI_THREAD_MULTIPLE, but
       this is not supported by the MPI implementation; a threading level of
       MPI_THREAD_SINGLE will be used instead.
forrtl: severe (408): fort: (7): Attempt to use pointer COMMLISTPTR when it is not associated with a target

Image              PC                Routine            Line        Source
ocean_model        0000000003EAD220  Unknown               Unknown  Unknown
ocean_model        000000000376929F  mpas_dmpar_mp_mpa        8466  mpas_dmpar.F
ocean_model        0000000003757330  mpas_dmpar_mp_mpa        7760  mpas_dmpar.F
ocean_model        0000000003748A48  mpas_dmpar_mp_mpa        7168  mpas_dmpar.F
ocean_model        00000000037495D4  mpas_dmpar_mp_mpa        7236  mpas_dmpar.F
ocean_model        000000000398F744  mpas_stream_manag        4642  mpas_stream_manager.F
ocean_model        0000000003987890  mpas_stream_manag        3939  mpas_stream_manager.F
ocean_model        0000000003981BD3  mpas_stream_manag        3494  mpas_stream_manager.F
ocean_model        00000000024229E2  ocn_init_mode_mp_         121  mpas_ocn_init_mode.F
ocean_model        00000000028438CA  ocn_core_mp_ocn_c          80  mpas_ocn_core.F
ocean_model        000000000041581D  mpas_subdriver_mp         331  mpas_subdriver.F
ocean_model        000000000041066F  MAIN__                     14  mpas.F

wf106:init_step2$ tail log.ocean.0000.out
 -- Called MPAS_stream_mgr_read()
 -- Handling read of stream input_init
  -- Stream filename is: mesh.nc
 Is field 'latCell' active in stream 'input_init? **
 Is field 'lonCell' active in stream 'input_init? **
...
  Seeking time of 0001-01-01_00:00:00
WARNING: File mesh.nc does not contain a seekable xtime variable. Forcing a read of the first time record.
  -- Exchange halo for latCell

The error is coming from this stream:

<immutable_stream name="input_init"
                  filename_template="mesh.nc"
                  input_interval="initial_only"
                  type="input"/>