NOAA-EMC / hpc-stack

Create a software stack for HPC's
GNU Lesser General Public License v2.1

[INSTALL] GNU hpc-stack using openmpi library #497

Open DusanJovic-NOAA opened 1 year ago

DusanJovic-NOAA commented 1 year ago

Which software in the stack would you like installed? All currently installed libraries in gnu/mpich stack.

What is the version/tag of the software? Same versions as in gnu/mpich stack

What compilation options would you like set? Same compilation options, same compiler, but use openmpi MPI library.

Which machines would you like to have the software installed? Hera and Cheyenne.

DusanJovic-NOAA commented 1 year ago

See: https://github.com/ufs-community/ufs-weather-model/pull/1147

ulmononian commented 1 year ago

@DusanJovic-NOAA just to confirm with you: which pair of gnu / openmpi would you like? @natalie-perlin has installed the stack with gnu-10.2 / openmpi 4.1.2 here: /scratch1/NCEPDEV/nems/role.epic/hpc-stack/libs/gnu-10.2_openmpi. the hpc-stack built with gnu-9.2.0 / openmpi-3.1.4 can be found here: /scratch1/NCEPDEV/nems/role.epic/hpc-stack/libs/gnu-9.2_openmpi-3.1.4.

Hang-Lei-NOAA commented 1 year ago

@DusanJovic-NOAA Please let me know if the EPIC installations have addressed your issue. Thanks.

DusanJovic-NOAA commented 1 year ago

Unfortunately not yet. The hpc-stack installation in this directory /scratch1/NCEPDEV/nems/role.epic/hpc-stack/libs/gnu-9.2_openmpi-3.1.4/modulefiles/stack is missing "fms/2022.04"
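One way to confirm which versions of a module a stack exposes is to list the modulefile tree directly. The sketch below assumes an Lmod-style layout in which each version is a file named `<version>.lua` (or a bare `<version>`) inside `<modulefiles root>/<name>/`. That layout is an assumption about this installation, and the helper name `available_versions` is hypothetical and not part of hpc-stack.

```python
from pathlib import Path


def available_versions(modulefiles_root, name):
    """Return the sorted versions found under <modulefiles_root>/<name>/.

    Assumes one file per version, optionally with a ".lua" suffix
    (Lmod style). Returns an empty list if the module directory is absent.
    """
    module_dir = Path(modulefiles_root) / name
    if not module_dir.is_dir():
        return []
    versions = set()
    for entry in module_dir.iterdir():
        if entry.is_file() and not entry.name.startswith("."):
            # Strip only a trailing ".lua"; versions such as "2022.04"
            # contain dots themselves, so Path.stem would be unsafe here.
            file_name = entry.name
            if file_name.endswith(".lua"):
                file_name = file_name[:-4]
            versions.add(file_name)
    return sorted(versions)


if __name__ == "__main__":
    # Path taken from the comment above; adjust for other stacks.
    root = ("/scratch1/NCEPDEV/nems/role.epic/hpc-stack/libs/"
            "gnu-9.2_openmpi-3.1.4/modulefiles/stack")
    found = available_versions(root, "fms")
    print("fms versions:", found)
    print("has fms/2022.04:", "2022.04" in found)
```

On a machine where that path does not exist, the script simply reports an empty list rather than failing.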

Hang-Lei-NOAA commented 1 year ago

I will work with epic to get this solved soon.

jkbk2004 commented 1 year ago

@natalie-perlin can you follow up on this issue?

natalie-perlin commented 1 year ago

@jkbk2004 - sure.

natalie-perlin commented 1 year ago

fms/2022.04 and fms/2022.03 have been added to the hpc-stack installed in /scratch1/NCEPDEV/nems/role.epic/hpc-stack/libs/gnu-9.2_openmpi-3.1.4/

Added a couple more packages/versions: gftl-shared/1.3.3 (in addition to the previously installed 1.5.0) and mapl/2.11.0 (in addition to 2.22.0).

DusanJovic-NOAA commented 1 year ago

Thanks @natalie-perlin. I compiled the code successfully using gnu-9.2_openmpi-3.1.4 stack, but the model crashes at runtime with this error:

124: --------------------------------------------------------------------------
124: The OSC pt2pt component does not support MPI_THREAD_MULTIPLE in this release.
124: Workarounds are to run on a single node, or to use a system with an RDMA
124: capable network such as Infiniband.
124: --------------------------------------------------------------------------
124: [h19c24:165101] *** An error occurred in MPI_Win_create
124: [h19c24:165101] *** reported by process [3428253696,124]
124: [h19c24:165101] *** on communicator MPI COMMUNICATOR 56 DUP FROM 55
124: [h19c24:165101] *** MPI_ERR_WIN: invalid window
124: [h19c24:165101] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
124: [h19c24:165101] ***    and potentially your MPI job)
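When many ranks write to the same log, it can help to isolate the lines from the rank that aborted and pull out the MPI error class it reported. The sketch below assumes every relevant line is prefixed with `<rank>: ` as in the output above. The helpers `lines_by_rank` and `mpi_error_classes` are hypothetical names for illustration only; they are not part of Open MPI or the model.

```python
import re
from collections import defaultdict

# Matches "<rank>: <message>" as seen in the job output above.
RANK_PREFIX = re.compile(r"^\s*(\d+):\s?(.*)$")


def lines_by_rank(log_text):
    """Group rank-prefixed log lines into {rank: [message, ...]}.

    Lines without a leading "<digits>:" prefix are ignored. Note that any
    line starting with digits and a colon is treated as rank output.
    """
    grouped = defaultdict(list)
    for line in log_text.splitlines():
        match = RANK_PREFIX.match(line)
        if match:
            grouped[int(match.group(1))].append(match.group(2))
    return dict(grouped)


def mpi_error_classes(messages):
    """Return the distinct MPI_ERR_* names mentioned, in order of appearance."""
    found = []
    for message in messages:
        for name in re.findall(r"\bMPI_ERR_[A-Z_]+\b", message):
            if name not in found:
                found.append(name)
    return found


if __name__ == "__main__":
    sample = "\n".join([
        "124: [h19c24:165101] *** An error occurred in MPI_Win_create",
        "124: [h19c24:165101] *** MPI_ERR_WIN: invalid window",
    ])
    for rank, messages in lines_by_rank(sample).items():
        print(rank, mpi_error_classes(messages))
```

The pattern deliberately requires the `MPI_ERR_` prefix, so the error-handler name `MPI_ERRORS_ARE_FATAL` is not mistaken for an error class.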