ufs-community / ufs-weather-model


Fix type mismatch compiler error when gfortran 10 is used without '-fallow-argument-mismatch' flag #1147

Closed DusanJovic-NOAA closed 2 months ago

DusanJovic-NOAA commented 2 years ago

Commit Queue Requirements:

WARNING: We are currently using the mpich MPI library with the gnu compilers on Hera, and SGI MPT on Cheyenne. The mpi_f08 module in mpich, when compiled with the current versions of the gnu compilers, has some issues, and MPT does not provide an mpi_f08 module at all. This means this PR will require us to switch to OpenMPI, which will require hpc-stack to be rebuilt on these two platforms. Do we want to do that? Do we want to make a (working) mpi_f08 module a requirement for ufs-weather-model?
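
For reference, a minimal hypothetical sketch (not code from this PR) of the construct gfortran 10 rejects: the same MPI routine called with buffers of different types, which the legacy mpif.h interface exposes as a single implicit external procedure.

! Hypothetical reproducer: gfortran 10 reports "Error: Type mismatch
! between actual argument at (1) and actual argument at (2)" on the
! second MPI_Bcast call unless -fallow-argument-mismatch is given,
! because mpif.h provides no explicit interface for the choice buffer.
program mismatch_demo
  implicit none
  include 'mpif.h'
  integer :: ierr, ibuf
  real    :: rbuf
  call MPI_Init(ierr)
  call MPI_Bcast(ibuf, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr)
  call MPI_Bcast(rbuf, 1, MPI_REAL,    0, MPI_COMM_WORLD, ierr)
  call MPI_Finalize(ierr)
end program mismatch_demo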

Commit Message:

* UFSWM - Fix type mismatch compiler error when gfortran 10 is used without '-fallow-argument-mismatch' flag.
  * FV3 - Fix type mismatch compiler error when gfortran 10 is used without '-fallow-argument-mismatch' flag.
    * ccpp-physics - Resolve various subroutine argument mismatches.
    * ccpp-framework - Add support for using the mpi_f08 MPI module (see the sketch after this list).
  * stochastic_physics - Fix type mismatch compiler error when gfortran 10 is used without '-fallow-argument-mismatch' flag.
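
The reason mpi_f08 resolves these errors: the MPI 3.0+ Fortran 2008 bindings declare choice buffers as assumed-type, assumed-rank (TYPE(*), DIMENSION(..)), so buffers of any type and rank match one explicit interface. A hypothetical counterpart to the mpif.h sketch above, which compiles cleanly without the flag:

! Hypothetical counterpart: both calls match the single explicit
! mpi_f08 interface whose buffer dummy is TYPE(*), DIMENSION(..),
! so gfortran 10 has nothing to flag.
program mpi_f08_demo
  use mpi_f08
  implicit none
  integer :: ierr, ibuf
  real    :: rbuf
  call MPI_Init(ierr)
  call MPI_Bcast(ibuf, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr)
  call MPI_Bcast(rbuf, 1, MPI_REAL,    0, MPI_COMM_WORLD, ierr)
  call MPI_Finalize(ierr)
end program mpi_f08_demo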

Priority:

Git Tracking

UFSWM:

Sub component Pull Requests:

UFSWM Blocking Dependencies:


Changes

Regression Test Changes (Please commit test_changes.list):

Input data Changes:

Library Changes/Upgrades:


Testing Log:

junwang-noaa commented 2 years ago

If mpi_f08 can fix the type mismatch issue, I think we can add the mpi_f08 module as a requirement for ufs-weather-model to support the gnu compiler. @jkbk2004 would you please check if we can switch to OpenMPI from SGI MPT on Cheyenne with the gnu compiler and reinstall hpc-stack? Thanks

jkbk2004 commented 2 years ago

@junwang-noaa I will try installing openmpi/gnu and see if it's doable.

climbfuji commented 2 years ago

@DusanJovic-NOAA @junwang-noaa @kgerheiser I captured this and related information in the spack-stack repo: https://github.com/NOAA-EMC/spack-stack/issues/109

We will try to rebuild spack-stack on the various platforms using gcc, (LLVM) clang or apple-clang with open-mpi. If that all works, then it would be good to merge the changes in this ufs-weather-model PR when the switch to spack-stack is made.

junwang-noaa commented 2 years ago

@climbfuji Have you tried rebuilding spack-stack with gcc and open-mpi? Does that work?

climbfuji commented 2 years ago

@climbfuji Have you tried rebuilding spack-stack with gcc and open-mpi? Does that work?

Your question is timely, as I was about to post an update here.

There are some issues with using openmpi on macOS that I started investigating last week. The problem has to do with flat namespaces versus two-level namespaces: mpich supports two-level namespaces through a configure option, but openmpi does not. I am trying to achieve the same effect by setting appropriate linker flags when building openmpi and when building apps with openmpi.

Without two-level namespaces, there is a problem with mixing the libc++ from macOS (part of the native clang, and similarly for LLVM clang) and the libstdc++ from gcc, which results in exceptions not being caught correctly, etc. I hope to have a good answer in the first week of May on whether switching to openmpi is a viable solution on macOS or not.

climbfuji commented 2 years ago

@junwang-noaa @DusanJovic-NOAA Good news. I was able to compile openmpi such that it mimics the two-level namespace option of mpich on macOS. We can therefore switch to openmpi for supporting mpi_f08.

DusanJovic-NOAA commented 1 year ago

Still waiting on gnu/openmpi hpc-stack or spack-stack to be installed on Hera and Cheyenne.

climbfuji commented 1 year ago

Still waiting on gnu/openmpi hpc-stack or spack-stack to be installed on Hera and Cheyenne.

I have a gnu-openmpi spack-stack that can be used for testing on Cheyenne, but it's probably not the final location and the responsibility for Cheyenne should probably be with EPIC, not JCSDA (but it's fine to do this in the transition period).

DusanJovic-NOAA commented 1 year ago

Still waiting on gnu/openmpi hpc-stack or spack-stack to be installed on Hera and Cheyenne.

jkbk2004 commented 1 year ago

On the EPIC side, a help desk ticket was issued to install gnu 10.1 on Hera. They are working on it; an update might be available early next week. Regarding spack-stack, there are some issues at the compile stage: see git discussion #1346. Can you take a look? Sounds like we need more discussion about the issue.

DusanJovic-NOAA commented 1 year ago

On the EPIC side, a help desk ticket was issued to install gnu 10.1 on Hera. They are working on it; an update might be available early next week. Regarding spack-stack, there are some issues at the compile stage: see git discussion #1346. Can you take a look? Sounds like we need more discussion about the issue.

I do not have access to Frontera. Is a spack-stack installation available on Hera?

climbfuji commented 1 year ago

I thought the problems on Frontera were sorted out, according to the discussion. @Hang-Lei-NOAA and I will install spack-stack 1.1.0 on Hera in the next few days; hopefully we can use this as a basis for the migration of the UFS to spack-stack.

DusanJovic-NOAA commented 1 year ago

Is there any progress in installing gnu/openmpi hpc-stack or spack-stack on Hera and Cheyenne?

Hang-Lei-NOAA commented 1 year ago

The GNU hpc-stack is there in the official hpc-stack installation.

There is a spack-stack installation available via:

module use /scratch1/NCEPDEV/jcsda/jedipara/spack-stack/modulefiles
module load miniconda/3.9.12
module load ecflow/5.5.3


climbfuji commented 1 year ago

The full answer is here: https://spack-stack.readthedocs.io/en/latest/Platforms.html#noaa-rdhpcs-hera

Note that there are also two other build environments that match the hpc-stack version of the libraries in /scratch1/NCEPDEV/global/spack-stack/spack-stack-v1/envs/hpc-stack-dev-gnu-9.2.0 and /scratch1/NCEPDEV/global/spack-stack/spack-stack-v1/envs/hpc-stack-dev-intel-2022.0.2/.

Note also that @mark-a-potts and @AlexanderRichert-NOAA are testing the ufs-weather-model with these software stacks.

DusanJovic-NOAA commented 1 year ago

The GNU hpc-stack is there in the official hpc-stack installation.

@Hang-Lei-NOAA Do you mean the installation here: /scratch2/NCEPDEV/nwprod/hpc-stack/libs/hpc-stack/gnu-9.2.0? I see only the mpich library, not openmpi.

DusanJovic-NOAA commented 1 year ago

The full answer is here: https://spack-stack.readthedocs.io/en/latest/Platforms.html#noaa-rdhpcs-hera

Note that there are also two other build environments that match the hpc-stack version of the libraries in /scratch1/NCEPDEV/global/spack-stack/spack-stack-v1/envs/hpc-stack-dev-gnu-9.2.0 and /scratch1/NCEPDEV/global/spack-stack/spack-stack-v1/envs/hpc-stack-dev-intel-2022.0.2/.

Note also that @mark-a-potts and @AlexanderRichert-NOAA are testing the ufs-weather-model with these software stacks.

I tried to build and run the control regression test using /scratch2/NCEPDEV/nwprod/hpc-stack/libs/hpc-stack/modulefiles/stack. In addition to using this stack and loading stack-gcc/9.2.0 and stack-openmpi/3.1.4, I had to rename two modules in ufs_common (netcdf to netcdf-c and pio to parallelio) and remove one (gftl-shared) which is apparently missing. The model then compiled successfully, but unfortunately it crashes with this error:

 85: /scratch1/NCEPDEV/stmp2/Dusan.Jovic/FV3_RT/rt_40700/control/./fv3.exe: /usr/lib64/libz.so.1: version `ZLIB_1.2.9' not found (required by /scratch1/NCEPDEV/global/spack-stack/spack-stack-v1/envs/hpc-stack-dev-gnu-9.2.0/install/gcc/9.2.0/libpng-1.6.37-7jlo63z/lib/libpng16.so.16)
Hang-Lei-NOAA commented 1 year ago

We do not have openmpi versions installed. Could you please create a ticket in hpc-stack if it is required on Hera? I can start to develop these sets of libs.


DusanJovic-NOAA commented 1 year ago

I also tried this stack: /glade/work/jedipara/cheyenne/spack-stack/spack-stack-v1/envs/skylab-2.0.0-intel-19.1.1.217/install/modulefiles/Core, as described in the readthedocs link in the comment above. In order to use this stack I had to update the versions of several libraries (zlib, jasper, hdf5, netcdf), in addition to renaming the two modules listed above. But then one module is missing: w3emc/2.9.2.

It will be difficult to transition from hpc-stack to spack-stack with so many inconsistencies between the two: module names, module versions, the content of the stacks, etc.

So in this PR I'm going to focus on getting hpc-stack working, if possible.

climbfuji commented 1 year ago

That’s why we created the “other” hpc-stack-dev environments (in addition to the skylab-2.0.0 envs) on hera for transitioning ...


DusanJovic-NOAA commented 1 year ago

@BrianCurtis-NOAA tried to run this branch on Cheyenne using the gnu compiler and mpt, and it failed to compile. I suspect we'll need to switch from mpt to openmpi.

@jkbk2004 Can somebody from EPIC build a gnu/openmpi stack on Cheyenne? Thanks.

climbfuji commented 1 year ago

@BrianCurtis-NOAA tried to run this branch on Cheyenne using the gnu compiler and mpt, and it failed to compile. I suspect we'll need to switch from mpt to openmpi.

@jkbk2004 Can somebody from EPIC build a gnu/openmpi stack on Cheyenne? Thanks.

It's really a pity that we didn't coordinate this PR with switching to spack-stack - spack-stack is on Cheyenne and should be able to support the ufs-weather-model. It uses Intel + Intel-MPI and GNU + OpenMPI.

DusanJovic-NOAA commented 1 year ago

I do not have access to Cheyenne, so I cannot run any tests. But it looks like the gnu/openmpi hpc-stack on Hera is finally ready (see https://github.com/ufs-community/ufs-weather-model/issues/1465), so I do not see why we cannot switch to the gnu/openmpi hpc-stack on Hera and the gnu/openmpi spack-stack on Cheyenne and finally get this PR merged.

BrianCurtis-NOAA commented 1 year ago

Command: CMAKE_FLAGS="-DAPP=S2SWA -DCCPP_SUITES=FV3_GFS_v17_coupled_p8,FV3_GFS_cpld_rasmgshocnsstnoahmp_ugwp" ./build.sh

module list:

Currently Loaded Modules:
  1) cmake/3.22.0        7) ncarcompilers/0.5.0  13) libpng/1.6.37  19) bacio/2.4.1    25) w3emc/2.9.2
  2) miniconda3/4.12.0   8) hpc/1.2.0            14) hdf5/1.10.6    20) crtm/2.4.0     26) gftl-shared/v1.5.0
  3) python/3.7.9        9) hpc-gnu/11.2.0       15) netcdf/4.7.4   21) g2/3.4.5       27) mapl/2.22.0-esmf-8.3.0b09
  4) ncarenv/1.3        10) hpc-mpt/2.25         16) pio/2.5.7      22) g2tmpl/1.10.0  28) ufs_common
  5) gnu/11.2.0         11) jasper/2.0.25        17) esmf/8.3.0b09  23) ip/3.3.3       29) ufs_cheyenne.gnu
  6) mpt/2.25           12) zlib/1.2.11          18) fms/2022.01    24) sp/2.3.3

Output at linking stage:

[100%] Linking Fortran executable ufs_model
/usr/bin/ld: /usr/bin/ld: DWARF error: can't find .debug_ranges section.
FV3/libfv3atm.a(module_fcst_grid_comp.F90.o): in function `__module_fcst_grid_comp_MOD_fcst_initialize':
module_fcst_grid_comp.F90:(.text+0x2711): undefined reference to `sgi_mpi_f08_integer'
/usr/bin/ld: module_fcst_grid_comp.F90:(.text+0x277d): undefined reference to `mpi_allgather_f08ts_'
/usr/bin/ld: module_fcst_grid_comp.F90:(.text+0x27c0): undefined reference to `sgi_mpi_f08_logical'
/usr/bin/ld: module_fcst_grid_comp.F90:(.text+0x27e1): undefined reference to `mpi_allgather_f08ts_'
/usr/bin/ld: module_fcst_grid_comp.F90:(.text+0x4675): undefined reference to `sgi_mpi_f08_logical'
/usr/bin/ld: module_fcst_grid_comp.F90:(.text+0x46b6): undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: /usr/bin/ld: DWARF error: can't find .debug_ranges section.
FV3/libfv3atm.a(module_wrt_grid_comp.F90.o): in function `__module_wrt_grid_comp_MOD_wrt_run':
module_wrt_grid_comp.F90:(.text+0x126b4): undefined reference to `sgi_mpi_f08_integer'
/usr/bin/ld: module_wrt_grid_comp.F90:(.text+0x126c1): undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: module_wrt_grid_comp.F90:(.text+0x12741): undefined reference to `sgi_mpi_f08_integer'
/usr/bin/ld: module_wrt_grid_comp.F90:(.text+0x1274e): undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: module_wrt_grid_comp.F90:(.text+0x127ce): undefined reference to `sgi_mpi_f08_integer'
/usr/bin/ld: module_wrt_grid_comp.F90:(.text+0x127db): undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: module_wrt_grid_comp.F90:(.text+0x1285b): undefined reference to `sgi_mpi_f08_integer'
/usr/bin/ld: module_wrt_grid_comp.F90:(.text+0x12868): undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: module_wrt_grid_comp.F90:(.text+0x12c52): undefined reference to `sgi_mpi_f08_integer'
/usr/bin/ld: module_wrt_grid_comp.F90:(.text+0x12c5f): undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: FV3/libfv3atm.a(module_wrt_grid_comp.F90.o): in function `__module_wrt_grid_comp_MOD_wrt_initialize_p1':
module_wrt_grid_comp.F90:(.text+0x13f9b): undefined reference to `mpi_comm_dup_f08_'
/usr/bin/ld: module_wrt_grid_comp.F90:(.text+0x172f3): undefined reference to `sgi_mpi_f08_integer'
/usr/bin/ld: module_wrt_grid_comp.F90:(.text+0x17314): undefined reference to `mpi_allgather_f08ts_'
/usr/bin/ld: module_wrt_grid_comp.F90:(.text+0x1737b): undefined reference to `sgi_mpi_f08_integer'
/usr/bin/ld: module_wrt_grid_comp.F90:(.text+0x1739c): undefined reference to `mpi_allgather_f08ts_'
/usr/bin/ld: module_wrt_grid_comp.F90:(.text+0x17403): undefined reference to `sgi_mpi_f08_integer'
/usr/bin/ld: module_wrt_grid_comp.F90:(.text+0x17424): undefined reference to `mpi_allgather_f08ts_'
/usr/bin/ld: module_wrt_grid_comp.F90:(.text+0x17489): undefined reference to `sgi_mpi_f08_integer'
/usr/bin/ld: module_wrt_grid_comp.F90:(.text+0x174aa): undefined reference to `mpi_allgather_f08ts_'
/usr/bin/ld: FV3/libfv3atm.a(post_fv3.F90.o): in function `__post_fv3_MOD_post_run_fv3':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/FV3/io/post_fv3.F90:209: undefined reference to `mpi_barrier_f08_'
/usr/bin/ld: /usr/bin/ld: DWARF error: can't find .debug_ranges section.
FV3/libfv3atm.a(atmos_model.F90.o): in function `__atmos_model_mod_MOD_update_atmos_radiation_physics':
atmos_model.F90:(.text+0x2695d): undefined reference to `sgi_mpi_f08_sum'
/usr/bin/ld: atmos_model.F90:(.text+0x2696a): undefined reference to `sgi_mpi_f08_double_precision'
/usr/bin/ld: atmos_model.F90:(.text+0x26a27): undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: atmos_model.F90:(.text+0x26a42): undefined reference to `sgi_mpi_f08_maxloc'
/usr/bin/ld: atmos_model.F90:(.text+0x26a5a): undefined reference to `sgi_mpi_f08_2double_precision'
/usr/bin/ld: atmos_model.F90:(.text+0x26b53): undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: atmos_model.F90:(.text+0x26bc4): undefined reference to `sgi_mpi_f08_double_precision'
/usr/bin/ld: atmos_model.F90:(.text+0x26bde): undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: /usr/bin/ld: DWARF error: can't find .debug_ranges section.
FV3/libfv3atm.a(module_write_netcdf.F90.o): in function `__module_write_netcdf_MOD_write_netcdf':
module_write_netcdf.F90:(.text+0x51e3): undefined reference to `sgi_mpi_f08_info_null'
/usr/bin/ld: module_write_netcdf.F90:(.text+0x8f6c): undefined reference to `sgi_mpi_f08_max'
/usr/bin/ld: module_write_netcdf.F90:(.text+0x8f7d): undefined reference to `sgi_mpi_f08_real4'
/usr/bin/ld: module_write_netcdf.F90:(.text+0x9005): undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: module_write_netcdf.F90:(.text+0x9012): undefined reference to `sgi_mpi_f08_real4'
/usr/bin/ld: module_write_netcdf.F90:(.text+0x9028): undefined reference to `sgi_mpi_f08_min'
/usr/bin/ld: module_write_netcdf.F90:(.text+0x90b1): undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: module_write_netcdf.F90:(.text+0x90fb): undefined reference to `sgi_mpi_f08_max'
/usr/bin/ld: module_write_netcdf.F90:(.text+0x9100): undefined reference to `sgi_mpi_f08_real4'
/usr/bin/ld: module_write_netcdf.F90:(.text+0x918e): undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: /usr/bin/ld: DWARF error: can't find .debug_ranges section.
FV3/ccpp/physics/libccpp_physics.a(module_mp_thompson.F90.o): in function `__module_mp_thompson_MOD_freezeh2o':
module_mp_thompson.F90:(.text+0x5d97): undefined reference to `mpi_barrier_f08_'
/usr/bin/ld: FV3/ccpp/physics/libccpp_physics.a(module_mp_thompson.F90.o): in function `__module_mp_thompson_MOD_qr_acr_qs':
module_mp_thompson.F90:(.text+0x6737): undefined reference to `mpi_barrier_f08_'
/usr/bin/ld: FV3/ccpp/physics/libccpp_physics.a(module_mp_thompson.F90.o): in function `__module_mp_thompson_MOD_qr_acr_qg':
module_mp_thompson.F90:(.text+0x87d7): undefined reference to `mpi_barrier_f08_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_alltoall_r4_1darr':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:710: undefined reference to `sgi_mpi_f08_real'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:710: undefined reference to `sgi_mpi_f08_real'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:710: undefined reference to `mpi_alltoallv_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_reduce_sum_i8':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:685: undefined reference to `sgi_mpi_f08_sum'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:685: undefined reference to `sgi_mpi_f08_integer8'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:685: undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_reduce_sum_i':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:660: undefined reference to `sgi_mpi_f08_sum'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:660: undefined reference to `sgi_mpi_f08_integer'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:660: undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_reduce_sum_r8_2darr':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:640: undefined reference to `sgi_mpi_f08_sum'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:640: undefined reference to `sgi_mpi_f08_double_precision'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:640: undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_reduce_sum_r8_1darr':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:616: undefined reference to `sgi_mpi_f08_double_precision'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:616: undefined reference to `sgi_mpi_f08_sum'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:616: undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_reduce_sum_r8_1d':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:594: undefined reference to `sgi_mpi_f08_sum'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:594: undefined reference to `sgi_mpi_f08_double_precision'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:594: undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_reduce_sum_r4_1d':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:566: undefined reference to `sgi_mpi_f08_sum'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:566: undefined reference to `sgi_mpi_f08_double_precision'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:566: undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_reduce_sum_r4_2darr':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:537: undefined reference to `sgi_mpi_f08_sum'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:537: undefined reference to `sgi_mpi_f08_real'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:537: undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_reduce_sum_r4_1darr':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:514: undefined reference to `sgi_mpi_f08_real'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:514: undefined reference to `sgi_mpi_f08_sum'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:514: undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_reduce_sum_r8':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:491: undefined reference to `sgi_mpi_f08_sum'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:491: undefined reference to `sgi_mpi_f08_double_precision'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:491: undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_reduce_sum_r4':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:471: undefined reference to `sgi_mpi_f08_sum'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:471: undefined reference to `sgi_mpi_f08_real'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:471: undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_reduce_max_i':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:451: undefined reference to `sgi_mpi_f08_max'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:451: undefined reference to `sgi_mpi_f08_integer'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:451: undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_reduce_min_r8':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:431: undefined reference to `sgi_mpi_f08_min'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:431: undefined reference to `sgi_mpi_f08_double_precision'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:431: undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_reduce_min_r4':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:419: undefined reference to `sgi_mpi_f08_min'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:419: undefined reference to `sgi_mpi_f08_real'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:419: undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_reduce_max_r8':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:407: undefined reference to `sgi_mpi_f08_max'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:407: undefined reference to `sgi_mpi_f08_double_precision'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:407: undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_reduce_max_r4':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:390: undefined reference to `sgi_mpi_f08_max'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:390: undefined reference to `sgi_mpi_f08_real'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:390: undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_reduce_max_r8_1d':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:370: undefined reference to `sgi_mpi_f08_max'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:370: undefined reference to `sgi_mpi_f08_double_precision'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:370: undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_reduce_max_r4_1d':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:349: undefined reference to `sgi_mpi_f08_max'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:349: undefined reference to `sgi_mpi_f08_real'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:349: undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_bcst_4d_i':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:330: undefined reference to `sgi_mpi_f08_integer'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:330: undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_bcst_2d_i':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:314: undefined reference to `sgi_mpi_f08_integer'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:314: undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_bcst_1d_i':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:298: undefined reference to `sgi_mpi_f08_integer'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:298: undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_bcst_3d_i':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:282: undefined reference to `sgi_mpi_f08_integer'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:282: undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_bcst_4d_r8':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:266: undefined reference to `sgi_mpi_f08_double_precision'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:266: undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_bcst_4d_r4':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:250: undefined reference to `sgi_mpi_f08_real'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:250: undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_bcst_3d_r8':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:234: undefined reference to `sgi_mpi_f08_double_precision'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:234: undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_bcst_3d_r4':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:218: undefined reference to `sgi_mpi_f08_real'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:218: undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_bcst_2d_r8':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:202: undefined reference to `sgi_mpi_f08_double_precision'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:202: undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_bcst_2d_r4':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:186: undefined reference to `sgi_mpi_f08_real'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:186: undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_bcst_1d_r8':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:170: undefined reference to `sgi_mpi_f08_double_precision'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:170: undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_bcst_1d_r4':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:154: undefined reference to `sgi_mpi_f08_real'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:154: undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_bcst_r8':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:138: undefined reference to `sgi_mpi_f08_double_precision'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:138: undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_bcst_r4':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:123: undefined reference to `sgi_mpi_f08_real'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:123: undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_bcst_i':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:108: undefined reference to `sgi_mpi_f08_integer'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:108: undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mpi_wrapper_initialize':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:86: undefined reference to `mpi_comm_rank_f08_'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:87: undefined reference to `mpi_comm_size_f08_'
collect2: error: ld returned 1 exit status
CMakeFiles/ufs_model.dir/build.make:155: recipe for target 'ufs_model' failed
make[2]: *** [ufs_model] Error 1
CMakeFiles/Makefile2:773: recipe for target 'CMakeFiles/ufs_model.dir/all' failed
make[1]: *** [CMakeFiles/ufs_model.dir/all] Error 2
Makefile:135: recipe for target 'all' failed
make: *** [all] Error 2
climbfuji commented 1 year ago

I do not have access to Cheyenne, so I cannot run any tests. But it looks like the gnu/openmpi hpc-stack on Hera is finally ready (see #1465), so I do not see why we cannot switch to the gnu/openmpi hpc-stack on Hera and the gnu/openmpi spack-stack on Cheyenne and finally get this PR merged.

Yes, I hear you ... not exactly my job anymore, but I am going to give it a shot on Cheyenne. I will need to extend the existing environment with a few missing packages for the fully-coupled UFS. Then change the build environment for Cheyenne with GNU and see what happens ...

ulmononian commented 1 year ago

@BrianCurtis-NOAA tried to run this branch on Cheyenne using the gnu compiler and mpt, and it failed to compile. I suspect we'll need to switch from mpt to openmpi.

@jkbk2004 Can somebody from EPIC build a gnu/openmpi stack on Cheyenne? Thanks.

i can install gnu/openmpi in the official space on cheyenne, unless @climbfuji has already done so. @DusanJovic-NOAA : can you confirm which gnu/openmpi you prefer? there is much more flexibility for combinations out-of-the-box on cheyenne.

DusanJovic-NOAA commented 1 year ago

Any supported gnu compiler version >= 9, and any supported openmpi. I do not have any preference.

climbfuji commented 1 year ago

I was able to compile the UFS on Cheyenne using spack-stack with gcc-10.1.0 and openmpi-4.1.1.

I ran rt.sh for the control test first, and it passed against the existing baseline:

...
+ cat /glade/scratch/heinzell/ufs-weather-model-dusan-no-arg-mismatch-spack-stack/tests/log_cheyenne.gnu/compile_001_time.log
+ cat /glade/scratch/heinzell/ufs-weather-model-dusan-no-arg-mismatch-spack-stack/tests/log_cheyenne.gnu/rt_001_control.log
+ FILES='fail_test_* fail_compile_*'
+ for f in '$FILES'
+ [[ -f fail_test_* ]]
+ for f in '$FILES'
+ [[ -f fail_compile_* ]]
+ [[ -e fail_test ]]
+ echo

+ echo REGRESSION TEST WAS SUCCESSFUL
REGRESSION TEST WAS SUCCESSFUL
+ echo
+ echo REGRESSION TEST WAS SUCCESSFUL
+ rm -f 'fv3_*.x' fv3_001.exe modules.fv3

It requires small changes in the ufs-weather-model, which I would be happy to contribute to this PR, but we need a clear path forward for the remaining applications and for how to manage this spack-stack installation.

junwang-noaa commented 1 year ago

@DusanJovic-NOAA Do we need to change the ufs_hera.gnu.lua and ufs_hera.gnu_debug.lua too?

DusanJovic-NOAA commented 1 year ago

@DusanJovic-NOAA Do we need to change the ufs_hera.gnu.lua and ufs_hera.gnu_debug.lua too?

Both the Hera and Cheyenne gnu module files will eventually need to be changed to use the updated gnu and openmpi modules, but I'm not sure that's ready at this moment. Ideally we should update to the new MPI library in a separate commit so that this PR only updates the code without needing new baselines.

DusanJovic-NOAA commented 1 year ago

Is there any progress in installing gnu/openmpi libraries on Hera?

ulmononian commented 1 year ago

Is there any progress in installing gnu/openmpi libraries on Hera?

spack-stack/1.3.1 is installed on hera and features gnu/9.2.0 and openmpi/4.1.5. it will hopefully be available in WM develop in the near future (see #1707). you can test it with this modulefile (or just take the paths), if you want: https://github.com/ulmononian/ufs-weather-model/blob/feature/spack_stack_ue/modulefiles/ufs_hera.gnu.lua. note, however, that you'll also need to use this updated ufs_common https://github.com/ulmononian/ufs-weather-model/blob/feature/spack_stack_ue/modulefiles/ufs_common.lua.

DusanJovic-NOAA commented 1 year ago

Is there any progress in installing gnu/openmpi libraries on Hera?

spack-stack/1.3.1 is installed on hera and features gnu/9.2.0 and openmpi/4.1.5. it will hopefully be available in WM develop in the near future (see #1707). you can test it with this modulefile (or just take the paths), if you want: https://github.com/ulmononian/ufs-weather-model/blob/feature/spack_stack_ue/modulefiles/ufs_hera.gnu.lua. note, however, that you'll also need to use this updated ufs_common https://github.com/ulmononian/ufs-weather-model/blob/feature/spack_stack_ue/modulefiles/ufs_common.lua.

Thanks. I tested this branch using the modulefiles from your PR on Hera, and all tests compiled successfully and finished without failure. But several tests failed the output comparison against the current baselines, which I think is expected.

$ cat fail_test
control_stochy 002 failed in check_result
control_stochy 002 failed in run_test
control_ras 003 failed in check_result
control_ras 003 failed in run_test
control_flake 005 failed in check_result
control_flake 005 failed in run_test
control_diag_debug 023 failed in check_result
control_diag_debug 023 failed in run_test
rap_noah_sfcdiff_cires_ugwp_debug 028 failed in check_result
rap_noah_sfcdiff_cires_ugwp_debug 028 failed in run_test
control_ras_debug 031 failed in check_result
control_ras_debug 031 failed in run_test
control_stochy_debug 032 failed in check_result
control_stochy_debug 032 failed in run_test
cpld_control_p8 051 failed in check_result
cpld_control_p8 051 failed in run_test
cpld_debug_p8 053 failed in check_result
cpld_debug_p8 053 failed in run_test

junwang-noaa commented 9 months ago

@DusanJovic-NOAA Can you provide an update on this?

DusanJovic-NOAA commented 9 months ago

Full regression test passed on Hera. RegressionTests_hera.log

climbfuji commented 9 months ago

Yay. This has been a long time coming!

jkbk2004 commented 9 months ago

@DusanJovic-NOAA We see progress on setting up the gnu baseline on Hercules with #1733. I wonder if we can combine the gnu type mismatch build option there? Please feel free to leave comments in #1733.

DusanJovic-NOAA commented 9 months ago

@DusanJovic-NOAA We see progress on setting up the gnu baseline on Hercules with #1733. I wonder if we can combine the gnu type mismatch build option there? Please feel free to leave comments in #1733.

I tried running one gnu regression test on Hercules using a branch from #1733:

./rt.sh -n control_p8 gnu

but it looks like the regression test on Hercules is still not set up correctly:

+ echo 'control_p8 does not exist or cannot be run on hercules'
control_p8 does not exist or cannot be run on hercules
+ exit 1
+ echo 'rt.sh finished'
rt.sh finished
ulmononian commented 9 months ago

@DusanJovic-NOAA We see progress on setting up the gnu baseline on Hercules with #1733. I wonder if we can combine the gnu type mismatch build option there? Please feel free to leave comments in #1733.

I tried running one gnu regression test on Hercules using a branch from #1733:

./rt.sh -n control_p8 gnu

but it looks like the regression test on Hercules is still not set up correctly:

+ echo 'control_p8 does not exist or cannot be run on hercules'
control_p8 does not exist or cannot be run on hercules
+ exit 1
+ echo 'rt.sh finished'
rt.sh finished

sorry about that @DusanJovic-NOAA -- i forgot to commit the updated rt.conf. please try again now.

DusanJovic-NOAA commented 9 months ago

I pulled the changes from #1733 and successfully compiled all tests; however, several tests failed to reproduce the current baselines. After I created new baselines, all tests succeeded.

Somebody will need to run this PR on Cheyenne to make sure the gnu compiler/MPI stack can build all configurations correctly.

DusanJovic-NOAA commented 7 months ago

When I run the regression test using this branch on Hercules with the GNU compiler, it fails in the control_p8 test with the following error:

140: At line 831 of file /work/noaa/fv3-cam/djovic/ufs/no_arg_mismatch/ufs-weather-model/FV3/module_fcst_grid_comp.F90
140: Fortran runtime error: Index '144' of dimension 1 of array 'grid_number_on_all_pets' above upper bound of 143
140: 
140: Error termination. Backtrace:
140: #0  0x14c76f16c860 in ???
140: #1  0x14c76f16d3b9 in ???
140: #2  0x14c76f16da2d in ???
140: #3  0x11f8085 in fcst_initialize
140:    at /work/noaa/fv3-cam/djovic/ufs/no_arg_mismatch/ufs-weather-model/FV3/module_fcst_grid_comp.F90:831
140: #4  0x988fd4 in ???
140: #5  0x989348 in ???

This should not happen. I found that the lower and upper bounds of the array grid_number_on_all_pets are changed after the mpi_allgather call.

Consider the following test program:

$ cat mpi_allgather_test.f90 
program mpi_allgather_test

  use mpi_f08
  implicit none

  integer :: ierr
  character(len=MPI_MAX_LIBRARY_VERSION_STRING) :: version
  integer :: resultlen
  integer :: mype, nproc
  integer, allocatable :: arr(:)

  call MPI_Init(ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nproc, ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, mype, ierr)

  call MPI_Get_library_version(version, resultlen, ierr)
  if (mype == 0) write(*,'(A)') version(1:resultlen)

  allocate (arr(nproc))
  if(mype==0) write(*,*)'nproc', nproc, ' size=', size(arr), ' lbound=', lbound(arr), ' ubound=', ubound(arr)
  arr = 0

  call mpi_allgather(mype, 1, MPI_INTEGER, arr, 1, MPI_INTEGER, MPI_COMM_WORLD, ierr)
  if(mype==0) write(*,*)'nproc', nproc, ' size=', size(arr), ' lbound=', lbound(arr), ' ubound=', ubound(arr)

  call MPI_Finalize(ierr)

end program mpi_allgather_test

Here the array arr is allocated with size equal to the number of MPI tasks. The lower bound should be 1 and the upper bound should be nproc.

When I compile this program using mvapich2/2.3.7 I get:

$ module purge
$ module use /work/noaa/epic/role-epic/spack-stack/hercules/modulefiles
$ module load mvapich2/2.3.7
$ ml

Currently Loaded Modules:
  1) slurm/22.05.8   2) mvapich2/2.3.7

$ mpif90 mpi_allgather_test.f90 
$ srun -n 4 ./a.out 
MVAPICH2 Version      : 2.3.7
MVAPICH2 Release date : Wed March 02 22:00:00 EST 2022
MVAPICH2 Device       : ch3:mrail
MVAPICH2 configure    : --prefix=/work/noaa/epic/role-epic/spack-stack/hercules/mvapich2-2.3.7/gcc-11.3.1 --with-pmi=pmi2 --with-pm=slurm --with-slurm-include=/opt/slurm-22.05.8/include --with-slurm-lib=/opt/slurm-22.05.8/lib
MVAPICH2 CC           : gcc    -DNDEBUG -DNVALGRIND -O2
MVAPICH2 CXX          : g++   -DNDEBUG -DNVALGRIND -O2
MVAPICH2 F77          : gfortran -fallow-argument-mismatch  -O2
MVAPICH2 FC           : gfortran   -O2

 nproc           4  size=           4  lbound=           1  ubound=           4
 nproc           4  size=           4  lbound=           0  ubound=           3

which is clearly incorrect: the lower and upper bounds of arr should be 1 and 4 after the mpi_allgather call.

Using openmpi/4.1.4 and gcc/12.2.0, I see the correct lower and upper bounds both before and after the mpi_allgather call:

$ module purge
$ module load gcc/12.2.0
$ module load openmpi/4.1.4

$ ml

Currently Loaded Modules:
  1) zlib/1.2.13   2) gcc/12.2.0   3) openmpi/4.1.4

$ mpif90 mpi_allgather_test.f90 
$ srun -n 4 ./a.out 
Open MPI v4.1.4, package: Open MPI jhrogers@hercules-devel-1.hpc.msstate.edu Distribution, ident: 4.1.4, repo rev: v4.1.4, May 26, 2022 
 nproc           4  size=           4  lbound=           1  ubound=           4
 nproc           4  size=           4  lbound=           1  ubound=           4
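
For what it's worth, a hypothetical defensive check (not something this PR adds), placed right after the mpi_allgather call in the test program above, would catch this descriptor corruption at the call site instead of via an out-of-bounds index much later:

  ! Hypothetical guard: arr was allocated as arr(nproc), so its bounds
  ! must still be 1..nproc after the call; stop loudly if the mpi_f08
  ! wrapper handed back a rebuilt descriptor with shifted bounds.
  if (lbound(arr, 1) /= 1 .or. ubound(arr, 1) /= nproc) then
    write(*,*) 'corrupted array bounds:', lbound(arr, 1), ubound(arr, 1)
    call MPI_Abort(MPI_COMM_WORLD, 1, ierr)
  end if
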
climbfuji commented 7 months ago

When I run the regression test using this branch on Hercules with the GNU compiler, it fails in the control_p8 test with the following error: ... which is clearly incorrect: the lower and upper bounds of arr should be 1 and 4 after the mpi_allgather call.

Pah, this is very annoying. I guess the fastest way forward is to move up from gcc@11 to gcc@12 and use the (hopefully) working openmpi.

climbfuji commented 7 months ago

@DusanJovic-NOAA @BrianCurtis-NOAA I am building spack-stack@1.5.1 with gcc@12 and openmpi@4.1.4 now. spack-stack 1.5.1 comes with the new ESMF and MAPL; therefore this PR would have to wait for the esmf/mapl update (the move to spack-stack 1.5.1), at least on Hercules. Is that acceptable?

junwang-noaa commented 7 months ago

@climbfuji please build with fms 2023.02.01 too, thank you!

DusanJovic-NOAA commented 7 months ago

@DusanJovic-NOAA @BrianCurtis-NOAA I am building spack-stack@1.5.1 with gcc@12 and openmpi@4.1.4 now. spack-stack 1.5.1 comes with the new ESMF and MAPL; therefore this PR would have to wait for the esmf/mapl update (the move to spack-stack 1.5.1), at least on Hercules. Is that acceptable?

Sure. Hopefully gcc@12 and openmpi@4.1.4 will finally allow us to move forward with this PR.

climbfuji commented 7 months ago

fwiw, I am getting a compile error for cdo (part of the unified environment) on Hercules with gcc@12 (I didn't get that on Derecho with the same compiler). I'll see how best to get around this.

stream_gribapi.c: In function 'gribapiVarCompare.isra':
stream_gribapi.c:852:1: internal compiler error: in classify_argument, at config/i386/i386.cc:2388
  852 | gribapiVarCompare(compvar2_t compVar, record_t record, int flag)
      | ^~~~~~~~~~~~~~~~~
mv -f .deps/iterator_grib.Tpo .deps/iterator_grib.Plo
0x67e958 classify_argument
        /tmp/leahb/spack-stage/spack-stage-gcc-12.2.0-7cu3qahzhsxpauy4jlnsbcqmlbkxbbbo/spack-src/gcc/config/i386/i386.cc:2388
0xef9c56 classify_argument
        /tmp/leahb/spack-stage/spack-stage-gcc-12.2.0-7cu3qahzhsxpauy4jlnsbcqmlbkxbbbo/spack-src/gcc/config/i386/i386.cc:2528
0xefa63c construct_container
        /tmp/leahb/spack-stage/spack-stage-gcc-12.2.0-7cu3qahzhsxpauy4jlnsbcqmlbkxbbbo/spack-src/gcc/config/i386/i386.cc:2625
0xefaee8 function_arg_64
        /tmp/leahb/spack-stage/spack-stage-gcc-12.2.0-7cu3qahzhsxpauy4jlnsbcqmlbkxbbbo/spack-src/gcc/config/i386/i386.cc:3283
0xefaee8 ix86_function_arg
        /tmp/leahb/spack-stage/spack-stage-gcc-12.2.0-7cu3qahzhsxpauy4jlnsbcqmlbkxbbbo/spack-src/gcc/config/i386/i386.cc:3397
0x92d5fd assign_parm_find_entry_rtl
        /tmp/leahb/spack-stage/spack-stage-gcc-12.2.0-7cu3qahzhsxpauy4jlnsbcqmlbkxbbbo/spack-src/gcc/function.cc:2535
0x92d9a8 assign_parms
        /tmp/leahb/spack-stage/spack-stage-gcc-12.2.0-7cu3qahzhsxpauy4jlnsbcqmlbkxbbbo/spack-src/gcc/function.cc:3673
0x930677 expand_function_start(tree_node*)
        /tmp/leahb/spack-stage/spack-stage-gcc-12.2.0-7cu3qahzhsxpauy4jlnsbcqmlbkxbbbo/spack-src/gcc/function.cc:5161
0x7dabf1 execute
        /tmp/leahb/spack-stage/spack-stage-gcc-12.2.0-7cu3qahzhsxpauy4jlnsbcqmlbkxbbbo/spack-src/gcc/cfgexpand.cc:6690
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://github.com/spack/spack/issues> for instructions.
make[3]: *** [Makefile:944: stream_gribapi.lo] Error 1
make[3]: *** Waiting for unfinished jobs....
mv -f .deps/gribapi_utilities.Tpo .deps/gribapi_utilities.Plo
make[3]: Leaving directory '/work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.5.1/cache/build_stage/spack-stage-cdo-2.0.5-hxibemg5qlkizhtzsez2apwjtmtjyvfk/spack-src/libcdi/src'
make[2]: *** [Makefile:713: all] Error 2
make[2]: Leaving directory '/work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.5.1/cache/build_stage/spack-stage-cdo-2.0.5-hxibemg5qlkizhtzsez2apwjtmtjyvfk/spack-src/libcdi/src'
make[1]: *** [Makefile:535: all-recursive] Error 1
make[1]: Leaving directory '/work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.5.1/cache/build_stage/spack-stage-cdo-2.0.5-hxibemg5qlkizhtzsez2apwjtmtjyvfk/spack-src/libcdi'
make: *** [Makefile:492: all-recursive] Error 1
==> Error: ProcessError: Command exited with status 2:
    'make' '-j6' 'V=1'

5 errors found in build log:
     1024    mv -f .deps/zaxis.Tpo .deps/zaxis.Plo
     1025    libtool: link: ar cru .libs/libcdiresunpack.a .libs/resource_unpack.o
     1026    libtool: link: ranlib .libs/libcdiresunpack.a
     1027    libtool: link: ( cd ".libs" && rm -f "libcdiresunpack.la" && ln -s "../libcdiresunpack.la" "libcdiresunpack.la" )
     1028    during RTL pass: expand
     1029    stream_gribapi.c: In function 'gribapiVarCompare.isra':
  >> 1030    stream_gribapi.c:852:1: internal compiler error: in classify_argument, at config/i386/i386.cc:2388
     1031      852 | gribapiVarCompare(compvar2_t compVar, record_t record, int flag)
     1032          | ^~~~~~~~~~~~~~~~~
     1033    mv -f .deps/iterator_grib.Tpo .deps/iterator_grib.Plo
     1034    0x67e958 classify_argument
     1035       /tmp/leahb/spack-stage/spack-stage-gcc-12.2.0-7cu3qahzhsxpauy4jlnsbcqmlbkxbbbo/spack-src/gcc/config/i386/i386.cc:2388
     1036    0xef9c56 classify_argument

     ...

     1049       /tmp/leahb/spack-stage/spack-stage-gcc-12.2.0-7cu3qahzhsxpauy4jlnsbcqmlbkxbbbo/spack-src/gcc/function.cc:5161
     1050    0x7dabf1 execute
     1051       /tmp/leahb/spack-stage/spack-stage-gcc-12.2.0-7cu3qahzhsxpauy4jlnsbcqmlbkxbbbo/spack-src/gcc/cfgexpand.cc:6690
     1052    Please submit a full bug report, with preprocessed source (by using -freport-bug).
     1053    Please include the complete backtrace with any bug report.
     1054    See <https://github.com/spack/spack/issues> for instructions.
  >> 1055    make[3]: *** [Makefile:944: stream_gribapi.lo] Error 1
     1056    make[3]: *** Waiting for unfinished jobs....
     1057    mv -f .deps/gribapi_utilities.Tpo .deps/gribapi_utilities.Plo
     1058    make[3]: Leaving directory '/work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.5.1/cache/build_stage/spack-stage-cdo-2.0.5-hxibemg5qlkizhtzsez2apwjtmtjyvfk/spack-src/libcdi/src'
  >> 1059    make[2]: *** [Makefile:713: all] Error 2
     1060    make[2]: Leaving directory '/work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.5.1/cache/build_stage/spack-stage-cdo-2.0.5-hxibemg5qlkizhtzsez2apwjtmtjyvfk/spack-src/libcdi/src'
  >> 1061    make[1]: *** [Makefile:535: all-recursive] Error 1
     1062    make[1]: Leaving directory '/work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.5.1/cache/build_stage/spack-stage-cdo-2.0.5-hxibemg5qlkizhtzsez2apwjtmtjyvfk/spack-src/libcdi'
  >> 1063    make: *** [Makefile:492: all-recursive] Error 1
climbfuji commented 7 months ago

@junwang-noaa Do I need to use a branch for testing with esmf 8.5.0, mapl 2.40.3, and fms 2023.02.01, or will develop work, do you know?

climbfuji commented 7 months ago

I built ufs-weather-model develop on Hercules against new spack-stack@1.5.1 with gcc@12.2.0, openmpi@4.1.4, fms@2023.02.01, esmf@8.5.0, mapl@2.40.3. I then tried to run cpld_control_p8 and it segfaulted. The modified ufs-weather-model code is in

/work2/noaa/jcsda/dheinzel/ufs-wm-151

and the run directory is

/work2/noaa/stmp/dheinzel/stmp/dheinzel/FV3_RT/rt_2555689/cpld_control_p8_gnu

The rt.sh command was

./rt.sh -n cpld_control_p8 gnu -e -a gsd-hpcs -c -k 2>&1 | tee log.rt_cpld_control_p8
junwang-noaa commented 7 months ago

It failed in MOM.

DusanJovic-NOAA commented 7 months ago

The error is:

160: 
160: Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
160: 
160: Backtrace for this error:
160: #0  0x14b26e688d8f in ???
160: #1  0x14b26e6fe64d in ???
160: #2  0x3d2fb69 in ???
160: #3  0x3ba012d in ???
160: #4  0x3334fa2 in __mom_io_infra_MOD_read_field_2d
160:    at /work2/noaa/jcsda/dheinzel/ufs-wm-151/MOM6-interface/MOM6/config_src/infra/FMS2/MOM_io_infra.F90:905
160: #5  0x303950d in __mom_io_MOD_mom_read_data_2d
160:    at /work2/noaa/jcsda/dheinzel/ufs-wm-151/MOM6-interface/MOM6/src/framework/MOM_io.F90:2172

Line 905 in MOM_io_infra.F90 is:

 905     call fms2_read_data(fileobj, var_to_read, data)