ESCOMP / CTSM

Community Terrestrial Systems Model (includes the Community Land Model of CESM)
http://www.cesm.ucar.edu/models/cesm2.0/land/

Use full MPI library for mpi-serial tests #2497

Open · ekluzek opened 5 months ago

ekluzek commented 5 months ago

As discussed in CSEG, we want to remove the use of mpi-serial in our tests and simulations. This is largely because modern MPI libraries allow you to link against the full MPI library but still run serially WITHOUT using a launcher (mpirun, mpiexec, mpibind, or any of the other flavors).
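
For example, a minimal sketch of the behavior this relies on (executable name is illustrative; assumes a full MPI library such as MPICH or Open MPI):

    # link the executable against the full MPI library as usual
    mpif90 -o cesm.exe ...
    # then run it directly as a single MPI task; no mpirun/mpiexec needed
    ./cesm.exe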

This depends on getting the mpirun update in cime here:

https://github.com/ESMCI/cime/issues/4619

Or doing this explicitly for Derecho and Izumi in ccs_config.

ekluzek commented 5 months ago

As pointed out by @wwieder, outside of the test lists we also have these settings (a possible replacement is sketched after this list):

NEON/FATES/defaults/shell_commands:# Explicitly set the MPI library to mpi-serial so won't have the build/run complexity of a full MPI library
NEON/FATES/defaults/shell_commands:./xmlchange MPILIB=mpi-serial
NEON/defaults/shell_commands:# Explicitly set the MPI library to mpi-serial so won't have the build/run complexity of a full MPI library
NEON/defaults/shell_commands:./xmlchange MPILIB=mpi-serial
PLUMBER2/defaults/shell_commands:# Explicitly set the MPI library to mpi-serial so won't have the build/run complexity of a full MPI library
PLUMBER2/defaults/shell_commands:./xmlchange MPILIB=mpi-serial
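
A rough sketch of what the replacement in these shell_commands files could look like (assuming we simply fall back to the machine's default full MPI library; the exact change is still to be decided):

    # drop the explicit mpi-serial setting, i.e. remove:
    #   ./xmlchange MPILIB=mpi-serial
    # and keep the case on one task so it still runs serially
    ./xmlchange NTASKS=1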

And in the python code (see the search sketch after these listings):

ctsm/site_and_regional/single_point_case.py:            self.write_to_file("./xmlchange MPILIB=mpi-serial", nl_file)
ctsm/test/testinputs/mksurfdata_esmf_bld/env_mach_specific.xml:    <modules mpilib="mpi-serial">
ctsm/test/testinputs/mksurfdata_esmf_bld/env_mach_specific.xml:      <command name="load">mpi-serial/2.3.0</command>
ctsm/test/testinputs/mksurfdata_esmf_bld/env_mach_specific.xml:    <modules DEBUG="TRUE" compiler="intel" mpilib="mpi-serial">
ctsm/test/testinputs/mksurfdata_esmf_bld/env_mach_specific.xml:    <modules DEBUG="FALSE" compiler="intel" mpilib="mpi-serial">
ctsm/test/testinputs/mksurfdata_esmf_bld/env_mach_specific.xml:    <modules DEBUG="TRUE" compiler="gnu" mpilib="mpi-serial">
ctsm/test/testinputs/mksurfdata_esmf_bld/env_mach_specific.xml:    <modules DEBUG="FALSE" compiler="gnu" mpilib="mpi-serial">
ctsm/test/testinputs/mksurfdata_esmf_bld/env_mach_specific.xml:    <modules DEBUG="TRUE" compiler="pgi" mpilib="mpi-serial">
ctsm/test/testinputs/mksurfdata_esmf_bld/env_mach_specific.xml:    <modules DEBUG="FALSE" compiler="pgi" mpilib="mpi-serial">
ctsm/test/testinputs/mksurfdata_esmf_bld/env_mach_specific.xml:    <modules DEBUG="TRUE" compiler="nvhpc" mpilib="mpi-serial">
ctsm/test/testinputs/mksurfdata_esmf_bld/env_mach_specific.xml:    <modules DEBUG="FALSE" compiler="nvhpc" mpilib="mpi-serial">
ctsm/test/testinputs/mksurfdata_esmf_bld/env_mach_specific.xml:    <modules compiler="gnu" mpilib="mpi-serial">
ctsm/test/testinputs/mksurfdata_esmf_bld/env_mach_specific.xml:    <modules compiler="intel" mpilib="mpi-serial">
ctsm/test/testinputs/mksurfdata_esmf_bld/env_mach_specific.xml:    <modules compiler="pgi" mpilib="mpi-serial">
ctsm/test/testinputs/mksurfdata_esmf_bld/env_mach_specific.xml:    <modules compiler="nvhpc" mpilib="mpi-serial">
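
For reference, a quick way to find any remaining hardwired settings to update (a sketch; paths assumed relative to the top of a CTSM checkout):

    # list every file that still forces mpi-serial
    grep -rn "mpi-serial" cime_config python doc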

And in the documentation under doc/source (a possible post-change workflow is sketched below):

lilac/specific-atm-models/wrf-tools.rst:     ../../../configure --macros-format Makefile --mpilib mpi-serial
users_guide/running-single-points/running-pts_mode-configurations.rst:Note that when running with ``PTS_MODE`` the number of processors is automatically set to one. When running a single grid point you can only use a single processor. You might also want to set the ``env_build.xml`` variable ``MPILIB`` to ``mpi-serial`` so that you can also run interactively without having to use MPI to start up your job.
users_guide/running-single-points/running-single-point-configurations.rst:   Just like ``PTS_MODE`` (Sect. :numref:`pts_mode`), by default these setups sometimes run with ``MPILIB=mpi-serial`` (set in the ``env_build.xml`` file), which allows you to run the model interactively. On some machines this mode is NOT supported and you may need to change ``MPILIB`` to a full MPI library before you are able to build.
users_guide/trouble-shooting/trouble-shooting.rst:Simplifying to one processor removes all multi-processing problems and makes the case as simple as possible. If you can use ``MPILIB=mpi-serial`` you will also be able to run interactively rather than having to submit to a job queue, which sometimes makes it easier to run and debug. With ``MPILIB=mpi-serial`` you can still use threading, so you can run interactively on more processors to speed things up if needed.
users_guide/trouble-shooting/trouble-shooting.rst:   # set MPILIB to mpi-serial so that you can run interactively
users_guide/trouble-shooting/trouble-shooting.rst:   > ./xmlchange MPILIB=mpi-serial
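
Once this change goes in, the trouble-shooting advice could reduce to something like the following sketch (assumes CIME's --no-batch option; not final wording):

    # keep the machine default (full) MPI library; just run on one task
    ./xmlchange NTASKS=1
    ./case.build
    # run in the foreground instead of submitting to the batch queue
    ./case.submit --no-batch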