DusanJovic-NOAA closed this pull request 2 months ago.
If mpi_f08 fixes the type-mismatch issue, I think we can add the mpi_f08 module as a requirement for ufs-weather-model to support the GNU compiler. @jkbk2004 would you please check whether we can switch from SGI MPT to OpenMPI on Cheyenne with the GNU compiler and reinstall hpc-stack? Thanks
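For context, a minimal sketch of what mpi_f08 changes (illustrative only, not code from this PR): with the older `use mpi` interface, MPI handles are plain INTEGERs and choice buffers are effectively untyped, which is what trips the stricter argument checking in GNU 10+; with `use mpi_f08`, handles become distinct derived types and the interfaces are explicit, so mismatches are caught at compile time.

```fortran
! Illustrative sketch only (not code from this PR).
program mpi_f08_demo
  use mpi_f08            ! instead of: use mpi
  implicit none
  type(MPI_Comm) :: comm ! a distinct derived type, not a bare INTEGER
  integer :: rank, ierr
  call MPI_Init(ierr)
  comm = MPI_COMM_WORLD
  call MPI_Comm_rank(comm, rank, ierr)
  ! Passing, say, a REAL where a TYPE(MPI_Comm) is expected is now a
  ! compile-time error rather than a silent mismatch.
  call MPI_Finalize(ierr)
end program mpi_f08_demo
```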
@junwang-noaa I will give installing openmpi/gnu a try and see if it's doable.
@DusanJovic-NOAA @junwang-noaa @kgerheiser I captured this and related information in the spack-stack repo: https://github.com/NOAA-EMC/spack-stack/issues/109
We will try to rebuild spack-stack on the various platforms using gcc, (LLVM) clang, or apple-clang with openmpi. If that all works, then it would be good to merge the changes in this ufs-weather-model PR when the switch to spack-stack is made.
@climbfuji Have you tried rebuilding spack-stack with gcc and openmpi? Does that work?
Your question is timely, as I was about to post an update here.
There are some issues with using openmpi on macOS that I started investigating last week. The problem has to do with flat namespaces versus two-level namespaces: mpich supports two-level namespaces through a configure option, openmpi does not. I am trying to achieve the same by setting appropriate linker flags when building openmpi and when building apps with openmpi.
Without two-level namespaces, there is a problem with mixing the libc++ from macOS (part of the native clang, and similarly for LLVM clang) and the libstdc++ from gcc, which results in exceptions not being caught correctly, etc. I hope to have a good answer in the first week of May on whether switching to openmpi is a viable solution on macOS.
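For reference, a rough sketch of the knobs involved (illustrative; the exact flags used in these builds may differ):

```shell
# mpich exposes this directly at configure time:
./configure --enable-two-level-namespace ...

# Open MPI's wrapper compilers historically inject -Wl,-flat_namespace
# on macOS. Since two-level namespaces are the macOS linker default,
# the idea is to keep that flag out of both the Open MPI build itself
# and the installed wrapper data files, so that applications built
# with mpifort/mpicc link with the default two-level namespace.
```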
@junwang-noaa @DusanJovic-NOAA Good news. I was able to compile openmpi such that it mimics the two-level namespace option of mpich on macOS. We can therefore switch to openmpi for supporting mpi_f08.
Still waiting on gnu/openmpi hpc-stack or spack-stack to be installed on Hera and Cheyenne.
I have a gnu-openmpi spack-stack that can be used for testing on Cheyenne, but it's probably not the final location and the responsibility for Cheyenne should probably be with EPIC, not JCSDA (but it's fine to do this in the transition period).
On the EPIC side, a help desk ticket was issued to install GNU 10.1 on Hera. They are working on it; some update might be available early next week. Regarding spack-stack, there are some issues at the compile stage: see git discussion #1346. Can you take a look? Sounds like we need more discussion about the issue.
I do not have access to Frontera. Is spack-stack installation available on Hera?
I thought the problems on Frontera were sorted out, according to the discussion. @Hang-Lei-NOAA and I will install spack-stack 1.1.0 on Hera in the next few days; hopefully we can use this as a basis for the migration of the UFS to spack-stack.
Is there any progress in installing gnu/openmpi hpc-stack or spack-stack on Hera and Cheyenne?
A GNU hpc-stack is available in the official hpc-stack installation.
There is a spack-stack installation available via:
module use /scratch1/NCEPDEV/jcsda/jedipara/spack-stack/modulefiles
module load miniconda/3.9.12
module load ecflow/5.5.3
The full answer is here: https://spack-stack.readthedocs.io/en/latest/Platforms.html#noaa-rdhpcs-hera
Note that there are also two other build environments that match the hpc-stack versions of the libraries, in /scratch1/NCEPDEV/global/spack-stack/spack-stack-v1/envs/hpc-stack-dev-gnu-9.2.0 and /scratch1/NCEPDEV/global/spack-stack/spack-stack-v1/envs/hpc-stack-dev-intel-2022.0.2/.
Note also that @mark-a-potts and @AlexanderRichert-NOAA are testing the ufs-weather-model with these software stacks.
@Hang-Lei-NOAA Do you mean the installation here: /scratch2/NCEPDEV/nwprod/hpc-stack/libs/hpc-stack/gnu-9.2.0? I see only the mpich library there, not openmpi.
I tried to build and run the control regression test using /scratch2/NCEPDEV/nwprod/hpc-stack/libs/hpc-stack/modulefiles/stack. In addition to using this stack and loading stack-gcc/9.2.0 and stack-openmpi/3.1.4, I had to rename two modules in ufs_common (netcdf to netcdf-c and pio to parallelio) and remove one (gftl-shared), which apparently is missing. The model then compiled successfully, but unfortunately it crashes with this error:
85: /scratch1/NCEPDEV/stmp2/Dusan.Jovic/FV3_RT/rt_40700/control/./fv3.exe: /usr/lib64/libz.so.1: version `ZLIB_1.2.9' not found (required by /scratch1/NCEPDEV/global/spack-stack/spack-stack-v1/envs/hpc-stack-dev-gnu-9.2.0/install/gcc/9.2.0/libpng-1.6.37-7jlo63z/lib/libpng16.so.16)
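A note on this class of failure: the runtime loader resolved libz from the OS (on CentOS/RHEL 7-era systems, /usr/lib64 ships zlib 1.2.7, which lacks the ZLIB_1.2.9 version tag) instead of the newer zlib that the stack's libpng was linked against. A typical workaround looks like the following (the path is a placeholder, not the actual stack location):

```shell
# Put the stack's zlib ahead of /usr/lib64 at run time:
export LD_LIBRARY_PATH=/path/to/stack/zlib/lib:$LD_LIBRARY_PATH
# (or rebuild/relink so the stack's zlib directory is in the RPATH)
```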
We do not have openmpi versions installed. Could you please create a ticket in hpc-stack if it is required on Hera? I can then start building these sets of libs.
I also tried the stack /glade/work/jedipara/cheyenne/spack-stack/spack-stack-v1/envs/skylab-2.0.0-intel-19.1.1.217/install/modulefiles/Core, as described in the readthedocs link in the above comment. In order to use this stack I had to update the versions of several libraries (zlib, jasper, hdf5, netcdf), in addition to renaming the two modules listed above. But then one module is missing: w3emc/2.9.2.
It will be difficult to transition from hpc-stack to spack-stack with so many inconsistencies between the two: in module names, module versions, the content of the stacks, etc.
So in this PR I'm going to focus on getting hpc-stack working, if possible.
That’s why we created the “other” hpc-stack-dev environments (in addition to the skylab-2.0.0 envs) on Hera for transitioning ...
@BrianCurtis-NOAA tried to run this branch on Cheyenne using the GNU compiler and MPT, and it failed to compile. I suspect we'll need to switch from MPT to openmpi.
@jkbk2004 Can somebody from EPIC build a gnu/openmpi stack on Cheyenne? Thanks.
It's really a pity that we didn't coordinate this PR with switching to spack-stack - spack-stack is on Cheyenne and should be able to support the ufs-weather-model. It uses Intel + Intel-MPI and GNU + OpenMPI.
I do not have access to Cheyenne, so I cannot run any tests. But it looks like the gnu/openmpi hpc-stack on Hera is finally ready (see https://github.com/ufs-community/ufs-weather-model/issues/1465), so I do not see why we cannot switch to the gnu/openmpi hpc-stack on Hera and the gnu/openmpi spack-stack on Cheyenne and finally get this PR merged.
Command:
CMAKE_FLAGS="-DAPP=S2SWA -DCCPP_SUITES=FV3_GFS_v17_coupled_p8,FV3_GFS_cpld_rasmgshocnsstnoahmp_ugwp" ./build.sh
Output of module list:
Currently Loaded Modules:
1) cmake/3.22.0 7) ncarcompilers/0.5.0 13) libpng/1.6.37 19) bacio/2.4.1 25) w3emc/2.9.2
2) miniconda3/4.12.0 8) hpc/1.2.0 14) hdf5/1.10.6 20) crtm/2.4.0 26) gftl-shared/v1.5.0
3) python/3.7.9 9) hpc-gnu/11.2.0 15) netcdf/4.7.4 21) g2/3.4.5 27) mapl/2.22.0-esmf-8.3.0b09
4) ncarenv/1.3 10) hpc-mpt/2.25 16) pio/2.5.7 22) g2tmpl/1.10.0 28) ufs_common
5) gnu/11.2.0 11) jasper/2.0.25 17) esmf/8.3.0b09 23) ip/3.3.3 29) ufs_cheyenne.gnu
6) mpt/2.25 12) zlib/1.2.11 18) fms/2022.01 24) sp/2.3.3
Output at linking stage:
[100%] Linking Fortran executable ufs_model
/usr/bin/ld: /usr/bin/ld: DWARF error: can't find .debug_ranges section.
FV3/libfv3atm.a(module_fcst_grid_comp.F90.o): in function `__module_fcst_grid_comp_MOD_fcst_initialize':
module_fcst_grid_comp.F90:(.text+0x2711): undefined reference to `sgi_mpi_f08_integer'
/usr/bin/ld: module_fcst_grid_comp.F90:(.text+0x277d): undefined reference to `mpi_allgather_f08ts_'
/usr/bin/ld: module_fcst_grid_comp.F90:(.text+0x27c0): undefined reference to `sgi_mpi_f08_logical'
/usr/bin/ld: module_fcst_grid_comp.F90:(.text+0x27e1): undefined reference to `mpi_allgather_f08ts_'
/usr/bin/ld: module_fcst_grid_comp.F90:(.text+0x4675): undefined reference to `sgi_mpi_f08_logical'
/usr/bin/ld: module_fcst_grid_comp.F90:(.text+0x46b6): undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: /usr/bin/ld: DWARF error: can't find .debug_ranges section.
FV3/libfv3atm.a(module_wrt_grid_comp.F90.o): in function `__module_wrt_grid_comp_MOD_wrt_run':
module_wrt_grid_comp.F90:(.text+0x126b4): undefined reference to `sgi_mpi_f08_integer'
/usr/bin/ld: module_wrt_grid_comp.F90:(.text+0x126c1): undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: module_wrt_grid_comp.F90:(.text+0x12741): undefined reference to `sgi_mpi_f08_integer'
/usr/bin/ld: module_wrt_grid_comp.F90:(.text+0x1274e): undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: module_wrt_grid_comp.F90:(.text+0x127ce): undefined reference to `sgi_mpi_f08_integer'
/usr/bin/ld: module_wrt_grid_comp.F90:(.text+0x127db): undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: module_wrt_grid_comp.F90:(.text+0x1285b): undefined reference to `sgi_mpi_f08_integer'
/usr/bin/ld: module_wrt_grid_comp.F90:(.text+0x12868): undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: module_wrt_grid_comp.F90:(.text+0x12c52): undefined reference to `sgi_mpi_f08_integer'
/usr/bin/ld: module_wrt_grid_comp.F90:(.text+0x12c5f): undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: FV3/libfv3atm.a(module_wrt_grid_comp.F90.o): in function `__module_wrt_grid_comp_MOD_wrt_initialize_p1':
module_wrt_grid_comp.F90:(.text+0x13f9b): undefined reference to `mpi_comm_dup_f08_'
/usr/bin/ld: module_wrt_grid_comp.F90:(.text+0x172f3): undefined reference to `sgi_mpi_f08_integer'
/usr/bin/ld: module_wrt_grid_comp.F90:(.text+0x17314): undefined reference to `mpi_allgather_f08ts_'
/usr/bin/ld: module_wrt_grid_comp.F90:(.text+0x1737b): undefined reference to `sgi_mpi_f08_integer'
/usr/bin/ld: module_wrt_grid_comp.F90:(.text+0x1739c): undefined reference to `mpi_allgather_f08ts_'
/usr/bin/ld: module_wrt_grid_comp.F90:(.text+0x17403): undefined reference to `sgi_mpi_f08_integer'
/usr/bin/ld: module_wrt_grid_comp.F90:(.text+0x17424): undefined reference to `mpi_allgather_f08ts_'
/usr/bin/ld: module_wrt_grid_comp.F90:(.text+0x17489): undefined reference to `sgi_mpi_f08_integer'
/usr/bin/ld: module_wrt_grid_comp.F90:(.text+0x174aa): undefined reference to `mpi_allgather_f08ts_'
/usr/bin/ld: FV3/libfv3atm.a(post_fv3.F90.o): in function `__post_fv3_MOD_post_run_fv3':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/FV3/io/post_fv3.F90:209: undefined reference to `mpi_barrier_f08_'
/usr/bin/ld: /usr/bin/ld: DWARF error: can't find .debug_ranges section.
FV3/libfv3atm.a(atmos_model.F90.o): in function `__atmos_model_mod_MOD_update_atmos_radiation_physics':
atmos_model.F90:(.text+0x2695d): undefined reference to `sgi_mpi_f08_sum'
/usr/bin/ld: atmos_model.F90:(.text+0x2696a): undefined reference to `sgi_mpi_f08_double_precision'
/usr/bin/ld: atmos_model.F90:(.text+0x26a27): undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: atmos_model.F90:(.text+0x26a42): undefined reference to `sgi_mpi_f08_maxloc'
/usr/bin/ld: atmos_model.F90:(.text+0x26a5a): undefined reference to `sgi_mpi_f08_2double_precision'
/usr/bin/ld: atmos_model.F90:(.text+0x26b53): undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: atmos_model.F90:(.text+0x26bc4): undefined reference to `sgi_mpi_f08_double_precision'
/usr/bin/ld: atmos_model.F90:(.text+0x26bde): undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: /usr/bin/ld: DWARF error: can't find .debug_ranges section.
FV3/libfv3atm.a(module_write_netcdf.F90.o): in function `__module_write_netcdf_MOD_write_netcdf':
module_write_netcdf.F90:(.text+0x51e3): undefined reference to `sgi_mpi_f08_info_null'
/usr/bin/ld: module_write_netcdf.F90:(.text+0x8f6c): undefined reference to `sgi_mpi_f08_max'
/usr/bin/ld: module_write_netcdf.F90:(.text+0x8f7d): undefined reference to `sgi_mpi_f08_real4'
/usr/bin/ld: module_write_netcdf.F90:(.text+0x9005): undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: module_write_netcdf.F90:(.text+0x9012): undefined reference to `sgi_mpi_f08_real4'
/usr/bin/ld: module_write_netcdf.F90:(.text+0x9028): undefined reference to `sgi_mpi_f08_min'
/usr/bin/ld: module_write_netcdf.F90:(.text+0x90b1): undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: module_write_netcdf.F90:(.text+0x90fb): undefined reference to `sgi_mpi_f08_max'
/usr/bin/ld: module_write_netcdf.F90:(.text+0x9100): undefined reference to `sgi_mpi_f08_real4'
/usr/bin/ld: module_write_netcdf.F90:(.text+0x918e): undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: /usr/bin/ld: DWARF error: can't find .debug_ranges section.
FV3/ccpp/physics/libccpp_physics.a(module_mp_thompson.F90.o): in function `__module_mp_thompson_MOD_freezeh2o':
module_mp_thompson.F90:(.text+0x5d97): undefined reference to `mpi_barrier_f08_'
/usr/bin/ld: FV3/ccpp/physics/libccpp_physics.a(module_mp_thompson.F90.o): in function `__module_mp_thompson_MOD_qr_acr_qs':
module_mp_thompson.F90:(.text+0x6737): undefined reference to `mpi_barrier_f08_'
/usr/bin/ld: FV3/ccpp/physics/libccpp_physics.a(module_mp_thompson.F90.o): in function `__module_mp_thompson_MOD_qr_acr_qg':
module_mp_thompson.F90:(.text+0x87d7): undefined reference to `mpi_barrier_f08_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_alltoall_r4_1darr':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:710: undefined reference to `sgi_mpi_f08_real'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:710: undefined reference to `sgi_mpi_f08_real'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:710: undefined reference to `mpi_alltoallv_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_reduce_sum_i8':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:685: undefined reference to `sgi_mpi_f08_sum'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:685: undefined reference to `sgi_mpi_f08_integer8'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:685: undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_reduce_sum_i':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:660: undefined reference to `sgi_mpi_f08_sum'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:660: undefined reference to `sgi_mpi_f08_integer'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:660: undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_reduce_sum_r8_2darr':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:640: undefined reference to `sgi_mpi_f08_sum'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:640: undefined reference to `sgi_mpi_f08_double_precision'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:640: undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_reduce_sum_r8_1darr':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:616: undefined reference to `sgi_mpi_f08_double_precision'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:616: undefined reference to `sgi_mpi_f08_sum'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:616: undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_reduce_sum_r8_1d':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:594: undefined reference to `sgi_mpi_f08_sum'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:594: undefined reference to `sgi_mpi_f08_double_precision'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:594: undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_reduce_sum_r4_1d':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:566: undefined reference to `sgi_mpi_f08_sum'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:566: undefined reference to `sgi_mpi_f08_double_precision'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:566: undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_reduce_sum_r4_2darr':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:537: undefined reference to `sgi_mpi_f08_sum'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:537: undefined reference to `sgi_mpi_f08_real'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:537: undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_reduce_sum_r4_1darr':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:514: undefined reference to `sgi_mpi_f08_real'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:514: undefined reference to `sgi_mpi_f08_sum'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:514: undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_reduce_sum_r8':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:491: undefined reference to `sgi_mpi_f08_sum'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:491: undefined reference to `sgi_mpi_f08_double_precision'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:491: undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_reduce_sum_r4':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:471: undefined reference to `sgi_mpi_f08_sum'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:471: undefined reference to `sgi_mpi_f08_real'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:471: undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_reduce_max_i':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:451: undefined reference to `sgi_mpi_f08_max'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:451: undefined reference to `sgi_mpi_f08_integer'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:451: undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_reduce_min_r8':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:431: undefined reference to `sgi_mpi_f08_min'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:431: undefined reference to `sgi_mpi_f08_double_precision'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:431: undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_reduce_min_r4':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:419: undefined reference to `sgi_mpi_f08_min'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:419: undefined reference to `sgi_mpi_f08_real'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:419: undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_reduce_max_r8':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:407: undefined reference to `sgi_mpi_f08_max'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:407: undefined reference to `sgi_mpi_f08_double_precision'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:407: undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_reduce_max_r4':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:390: undefined reference to `sgi_mpi_f08_max'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:390: undefined reference to `sgi_mpi_f08_real'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:390: undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_reduce_max_r8_1d':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:370: undefined reference to `sgi_mpi_f08_max'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:370: undefined reference to `sgi_mpi_f08_double_precision'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:370: undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_reduce_max_r4_1d':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:349: undefined reference to `sgi_mpi_f08_max'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:349: undefined reference to `sgi_mpi_f08_real'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:349: undefined reference to `mpi_allreduce_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_bcst_4d_i':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:330: undefined reference to `sgi_mpi_f08_integer'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:330: undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_bcst_2d_i':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:314: undefined reference to `sgi_mpi_f08_integer'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:314: undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_bcst_1d_i':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:298: undefined reference to `sgi_mpi_f08_integer'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:298: undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_bcst_3d_i':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:282: undefined reference to `sgi_mpi_f08_integer'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:282: undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_bcst_4d_r8':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:266: undefined reference to `sgi_mpi_f08_double_precision'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:266: undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_bcst_4d_r4':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:250: undefined reference to `sgi_mpi_f08_real'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:250: undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_bcst_3d_r8':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:234: undefined reference to `sgi_mpi_f08_double_precision'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:234: undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_bcst_3d_r4':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:218: undefined reference to `sgi_mpi_f08_real'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:218: undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_bcst_2d_r8':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:202: undefined reference to `sgi_mpi_f08_double_precision'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:202: undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_bcst_2d_r4':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:186: undefined reference to `sgi_mpi_f08_real'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:186: undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_bcst_1d_r8':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:170: undefined reference to `sgi_mpi_f08_double_precision'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:170: undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_bcst_1d_r4':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:154: undefined reference to `sgi_mpi_f08_real'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:154: undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_bcst_r8':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:138: undefined reference to `sgi_mpi_f08_double_precision'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:138: undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_bcst_r4':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:123: undefined reference to `sgi_mpi_f08_real'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:123: undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mp_bcst_i':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:108: undefined reference to `sgi_mpi_f08_integer'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:108: undefined reference to `mpi_bcast_f08ts_'
/usr/bin/ld: stochastic_physics/libstochastic_physics.a(mpi_wrapper.F90.o): in function `__mpi_wrapper_MOD_mpi_wrapper_initialize':
/glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:86: undefined reference to `mpi_comm_rank_f08_'
/usr/bin/ld: /glade/work/briancurtis/git/DusanJovic-NOAA/ufs-weather-model/stochastic_physics/mpi_wrapper.F90:87: undefined reference to `mpi_comm_size_f08_'
collect2: error: ld returned 1 exit status
CMakeFiles/ufs_model.dir/build.make:155: recipe for target 'ufs_model' failed
make[2]: *** [ufs_model] Error 1
CMakeFiles/Makefile2:773: recipe for target 'CMakeFiles/ufs_model.dir/all' failed
make[1]: *** [CMakeFiles/ufs_model.dir/all] Error 2
Makefile:135: recipe for target 'all' failed
make: *** [all] Error 2
I do not have access to Cheyenne, so I cannot run any tests. But it looks like the gnu/openmpi hpc-stack on Hera is finally ready (see #1465), so I do not see why we cannot switch to the gnu/openmpi hpc-stack on Hera and the gnu/openmpi spack-stack on Cheyenne and finally get this PR merged.
Yes, I hear you ... not exactly my job anymore, but I am going to give it a shot on Cheyenne. I will need to extend the existing environment with a few missing packages for the fully-coupled UFS. Then change the build environment for Cheyenne with GNU and see what happens ...
@BrianCurtis-NOAA tried to run this branch on Cheyenne using gnu compiler and mpt, and it failed to compile. I suspect we'll need to switch from mpt to openmpi.
@jkbk2004 Can somebody from EPIC build a gnu/openmpi stack on Cheyenne? Thanks.
i can install gnu/openmpi in the official space on cheyenne, unless @climbfuji has already done so. @DusanJovic-NOAA : can you confirm which gnu/openmpi you prefer? there is much more flexibility for combinations out-of-the-box on cheyenne.
Any supported gnu compiler version >= 9, and any supported openmpi. I do not have any preference.
I was able to compile the UFS on Cheyenne using spack-stack with gcc-10.1.0 and openmpi-4.1.1.
I ran rt.sh for the control test first, and it passed against the existing baseline:
...
+ cat /glade/scratch/heinzell/ufs-weather-model-dusan-no-arg-mismatch-spack-stack/tests/log_cheyenne.gnu/compile_001_time.log
+ cat /glade/scratch/heinzell/ufs-weather-model-dusan-no-arg-mismatch-spack-stack/tests/log_cheyenne.gnu/rt_001_control.log
+ FILES='fail_test_* fail_compile_*'
+ for f in '$FILES'
+ [[ -f fail_test_* ]]
+ for f in '$FILES'
+ [[ -f fail_compile_* ]]
+ [[ -e fail_test ]]
+ echo
+ echo REGRESSION TEST WAS SUCCESSFUL
REGRESSION TEST WAS SUCCESSFUL
+ echo
+ echo REGRESSION TEST WAS SUCCESSFUL
+ rm -f 'fv3_*.x' fv3_001.exe modules.fv3
It requires small changes in the ufs-weather-model which I would be happy to contribute to this PR, but we need a clear path forward with the remaining applications and how to manage this spack-stack installation.
@DusanJovic-NOAA Do we need to change the ufs_hera.gnu.lua and ufs_hera.gnu_debug.lua too?
Both the Hera and Cheyenne gnu module files will eventually need to be changed to use the updated gnu and openmpi modules, but I'm not sure that's ready at this moment. Ideally we should update to the new MPI library in a separate commit, so that this PR only updates the code without needing new baselines.
Is there any progress in installing gnu/openmpi libraries on Hera?
spack-stack/1.3.1 is installed on hera and features gnu/9.2.0 and openmpi/4.1.5. it will hopefully be available in WM develop in the near future (see #1707). you can test it with this modulefile (or just take the paths), if you want: https://github.com/ulmononian/ufs-weather-model/blob/feature/spack_stack_ue/modulefiles/ufs_hera.gnu.lua. note, however, that you'll also need to use this updated ufs_common: https://github.com/ulmononian/ufs-weather-model/blob/feature/spack_stack_ue/modulefiles/ufs_common.lua.
Thanks. I tested this branch using the modulefiles from your PR on Hera, and all tests compiled successfully and finished without failure. But I see several tests failed output comparison against the current baselines which I think is expected.
$ cat fail_test
control_stochy 002 failed in check_result
control_stochy 002 failed in run_test
control_ras 003 failed in check_result
control_ras 003 failed in run_test
control_flake 005 failed in check_result
control_flake 005 failed in run_test
control_diag_debug 023 failed in check_result
control_diag_debug 023 failed in run_test
rap_noah_sfcdiff_cires_ugwp_debug 028 failed in check_result
rap_noah_sfcdiff_cires_ugwp_debug 028 failed in run_test
control_ras_debug 031 failed in check_result
control_ras_debug 031 failed in run_test
control_stochy_debug 032 failed in check_result
control_stochy_debug 032 failed in run_test
cpld_control_p8 051 failed in check_result
cpld_control_p8 051 failed in run_test
cpld_debug_p8 053 failed in check_result
cpld_debug_p8 053 failed in run_test
@DusanJovic-NOAA Can you provide an update on this?
Full regression test passed on Hera. RegressionTests_hera.log
Yay. This has been a long time coming!
@DusanJovic-NOAA We see a progress to set gnu baseline on hercules with #1733. I wonder if we can combine gnu type mismatch build option there? Please, feel free to leave comments in #1733.
I tried running one gnu regression test on Hercules using a branch from #1733:
./rt.sh -n control_p8 gnu
but it looks like the regression test on Hercules is still not setup correctly:
+ echo 'control_p8 does not exist or cannot be run on hercules'
control_p8 does not exist or cannot be run on hercules
+ exit 1
+ echo 'rt.sh finished'
rt.sh finished
sorry about that @DusanJovic-NOAA -- i forgot to commit the updated rt.conf
. please try again now.
I pulled the changes from #1733, and I successfully compiled all tests; however, several tests failed to reproduce the current baselines. After I created new baselines, all tests succeeded.
Somebody will need to run this PR on Cheyenne to make sure gnu compiler/mpi can build all configurations correctly.
When I run the regression test using this branch on Hercules with the GNU compiler, it fails in the control_p8 test with the following error:
140: At line 831 of file /work/noaa/fv3-cam/djovic/ufs/no_arg_mismatch/ufs-weather-model/FV3/module_fcst_grid_comp.F90
140: Fortran runtime error: Index '144' of dimension 1 of array 'grid_number_on_all_pets' above upper bound of 143
140:
140: Error termination. Backtrace:
140: #0 0x14c76f16c860 in ???
140: #1 0x14c76f16d3b9 in ???
140: #2 0x14c76f16da2d in ???
140: #3 0x11f8085 in fcst_initialize
140: at /work/noaa/fv3-cam/djovic/ufs/no_arg_mismatch/ufs-weather-model/FV3/module_fcst_grid_comp.F90:831
140: #4 0x988fd4 in ???
140: #5 0x989348 in ???
This should not happen. I found that the lower and upper bounds of the array grid_number_on_all_pets are changed by the mpi_allgather call.
Consider the following test program:
$ cat mpi_allgather_test.f90
program mpi_allgather_test
use mpi_f08
implicit none
integer :: ierr
character(len=MPI_MAX_LIBRARY_VERSION_STRING) :: version
integer :: resultlen
integer :: mype, nproc
integer, allocatable :: arr(:)
call MPI_Init(ierr)
call MPI_Comm_size(MPI_COMM_WORLD, nproc, ierr)
call MPI_Comm_rank(MPI_COMM_WORLD, mype, ierr)
call MPI_Get_library_version(version, resultlen, ierr)
if (mype == 0) write(*,'(A)') version(1:resultlen)
allocate (arr(nproc))
if(mype==0) write(*,*)'nproc', nproc, ' size=', size(arr), ' lbound=', lbound(arr), ' ubound=', ubound(arr)
arr = 0
call mpi_allgather(mype, 1, MPI_INTEGER, arr, 1, MPI_INTEGER, MPI_COMM_WORLD, ierr)
if(mype==0) write(*,*)'nproc', nproc, ' size=', size(arr), ' lbound=', lbound(arr), ' ubound=', ubound(arr)
call MPI_Finalize(ierr)
end program mpi_allgather_test
Here the array arr is allocated with size equal to the number of MPI tasks. The lower bound should be 1 and the upper bound should be nproc.
When I compile this program using mvapich2/2.3.7 I get:
$ module purge
$ module use /work/noaa/epic/role-epic/spack-stack/hercules/modulefiles
$ module load mvapich2/2.3.7
$ ml
Currently Loaded Modules:
1) slurm/22.05.8 2) mvapich2/2.3.7
$ mpif90 mpi_allgather_test.f90
$ srun -n 4 ./a.out
MVAPICH2 Version : 2.3.7
MVAPICH2 Release date : Wed March 02 22:00:00 EST 2022
MVAPICH2 Device : ch3:mrail
MVAPICH2 configure : --prefix=/work/noaa/epic/role-epic/spack-stack/hercules/mvapich2-2.3.7/gcc-11.3.1 --with-pmi=pmi2 --with-pm=slurm --with-slurm-include=/opt/slurm-22.05.8/include --with-slurm-lib=/opt/slurm-22.05.8/lib
MVAPICH2 CC : gcc -DNDEBUG -DNVALGRIND -O2
MVAPICH2 CXX : g++ -DNDEBUG -DNVALGRIND -O2
MVAPICH2 F77 : gfortran -fallow-argument-mismatch -O2
MVAPICH2 FC : gfortran -O2
nproc 4 size= 4 lbound= 1 ubound= 4
nproc 4 size= 4 lbound= 0 ubound= 3
which is clearly incorrect: the lower and upper bounds of arr should still be 1 and 4 after the mpi_allgather call.
Using openmpi/4.1.4 and gcc/12.2.0 I see the correct lower and upper bounds before and after the mpi_allgather call:
$ module purge
$ module load gcc/12.2.0
$ module load openmpi/4.1.4
$ ml
Currently Loaded Modules:
1) zlib/1.2.13 2) gcc/12.2.0 3) openmpi/4.1.4
$ mpif90 mpi_allgather_test.f90
$ srun -n 4 ./a.out
Open MPI v4.1.4, package: Open MPI jhrogers@hercules-devel-1.hpc.msstate.edu Distribution, ident: 4.1.4, repo rev: v4.1.4, May 26, 2022
nproc 4 size= 4 lbound= 1 ubound= 4
nproc 4 size= 4 lbound= 1 ubound= 4
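Until the mvapich2 descriptor bug is fixed, this class of corruption can at least be made to fail fast instead of surfacing later as an out-of-bounds access (as it did in module_fcst_grid_comp.F90). The sketch below is purely illustrative and is not part of this PR; it reuses the arr/nproc names from the test program above:

```fortran
! Hypothetical defensive check, placed right after the collective call in the
! test program above. If the mpi_f08 implementation rewrites the array
! descriptor, we abort immediately with a clear message rather than crashing
! later with a confusing bounds error.
if (lbound(arr, 1) /= 1 .or. ubound(arr, 1) /= nproc) then
  write(*,*) 'mpi_allgather corrupted array bounds: lbound=', lbound(arr), &
             ' ubound=', ubound(arr)
  call MPI_Abort(MPI_COMM_WORLD, 1, ierr)
end if
```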
Pah, this is very annoying. I guess the fastest way forward is to move up from gcc@11 to gcc@12 and use the (hopefully) working openmpi.
@DusanJovic-NOAA @BrianCurtis-NOAA I am building spack-stack@1.5.1 with gcc@12 and openmpi@4.1.4 now. spack-stack 1.5.1 comes with the new ESMF and MAPL, therefore this PR here would have to wait for the esmf-mapl update (move to spack-stack 1.5.1), at least on Hercules. Is that acceptable?
@climbfuji please build with fms 2023.02.01 too, thank you!
Sure. Hopefully gcc@12 and openmpi@4.1.4 will finally allow us to move forward with this PR.
fwiw, I am getting a compile error for cdo (part of the unified environment) on Hercules with gcc@12 (I didn't get that on Derecho with the same compiler). I'll see how best to work around this.
stream_gribapi.c: In function 'gribapiVarCompare.isra':
stream_gribapi.c:852:1: internal compiler error: in classify_argument, at config/i386/i386.cc:2388
852 | gribapiVarCompare(compvar2_t compVar, record_t record, int flag)
| ^~~~~~~~~~~~~~~~~
mv -f .deps/iterator_grib.Tpo .deps/iterator_grib.Plo
0x67e958 classify_argument
/tmp/leahb/spack-stage/spack-stage-gcc-12.2.0-7cu3qahzhsxpauy4jlnsbcqmlbkxbbbo/spack-src/gcc/config/i386/i386.cc:2388
0xef9c56 classify_argument
/tmp/leahb/spack-stage/spack-stage-gcc-12.2.0-7cu3qahzhsxpauy4jlnsbcqmlbkxbbbo/spack-src/gcc/config/i386/i386.cc:2528
0xefa63c construct_container
/tmp/leahb/spack-stage/spack-stage-gcc-12.2.0-7cu3qahzhsxpauy4jlnsbcqmlbkxbbbo/spack-src/gcc/config/i386/i386.cc:2625
0xefaee8 function_arg_64
/tmp/leahb/spack-stage/spack-stage-gcc-12.2.0-7cu3qahzhsxpauy4jlnsbcqmlbkxbbbo/spack-src/gcc/config/i386/i386.cc:3283
0xefaee8 ix86_function_arg
/tmp/leahb/spack-stage/spack-stage-gcc-12.2.0-7cu3qahzhsxpauy4jlnsbcqmlbkxbbbo/spack-src/gcc/config/i386/i386.cc:3397
0x92d5fd assign_parm_find_entry_rtl
/tmp/leahb/spack-stage/spack-stage-gcc-12.2.0-7cu3qahzhsxpauy4jlnsbcqmlbkxbbbo/spack-src/gcc/function.cc:2535
0x92d9a8 assign_parms
/tmp/leahb/spack-stage/spack-stage-gcc-12.2.0-7cu3qahzhsxpauy4jlnsbcqmlbkxbbbo/spack-src/gcc/function.cc:3673
0x930677 expand_function_start(tree_node*)
/tmp/leahb/spack-stage/spack-stage-gcc-12.2.0-7cu3qahzhsxpauy4jlnsbcqmlbkxbbbo/spack-src/gcc/function.cc:5161
0x7dabf1 execute
/tmp/leahb/spack-stage/spack-stage-gcc-12.2.0-7cu3qahzhsxpauy4jlnsbcqmlbkxbbbo/spack-src/gcc/cfgexpand.cc:6690
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://github.com/spack/spack/issues> for instructions.
make[3]: *** [Makefile:944: stream_gribapi.lo] Error 1
make[3]: *** Waiting for unfinished jobs....
mv -f .deps/gribapi_utilities.Tpo .deps/gribapi_utilities.Plo
make[3]: Leaving directory '/work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.5.1/cache/build_stage/spack-stage-cdo-2.0.5-hxibemg5qlkizhtzsez2apwjtmtjyvfk/spack-src/libcdi/src'
make[2]: *** [Makefile:713: all] Error 2
make[2]: Leaving directory '/work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.5.1/cache/build_stage/spack-stage-cdo-2.0.5-hxibemg5qlkizhtzsez2apwjtmtjyvfk/spack-src/libcdi/src'
make[1]: *** [Makefile:535: all-recursive] Error 1
make[1]: Leaving directory '/work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.5.1/cache/build_stage/spack-stage-cdo-2.0.5-hxibemg5qlkizhtzsez2apwjtmtjyvfk/spack-src/libcdi'
make: *** [Makefile:492: all-recursive] Error 1
==> Error: ProcessError: Command exited with status 2:
'make' '-j6' 'V=1'
5 errors found in build log:
1024 mv -f .deps/zaxis.Tpo .deps/zaxis.Plo
1025 libtool: link: ar cru .libs/libcdiresunpack.a .libs/resource_unpack.o
1026 libtool: link: ranlib .libs/libcdiresunpack.a
1027 libtool: link: ( cd ".libs" && rm -f "libcdiresunpack.la" && ln -s "../libcdiresunpack.la" "libcdiresunpack.la" )
1028 during RTL pass: expand
1029 stream_gribapi.c: In function 'gribapiVarCompare.isra':
>> 1030 stream_gribapi.c:852:1: internal compiler error: in classify_argument, at config/i386/i386.cc:2388
1031 852 | gribapiVarCompare(compvar2_t compVar, record_t record, int flag)
1032 | ^~~~~~~~~~~~~~~~~
1033 mv -f .deps/iterator_grib.Tpo .deps/iterator_grib.Plo
1034 0x67e958 classify_argument
1035 /tmp/leahb/spack-stage/spack-stage-gcc-12.2.0-7cu3qahzhsxpauy4jlnsbcqmlbkxbbbo/spack-src/gcc/config/i386/i386.cc:2388
1036 0xef9c56 classify_argument
...
1049 /tmp/leahb/spack-stage/spack-stage-gcc-12.2.0-7cu3qahzhsxpauy4jlnsbcqmlbkxbbbo/spack-src/gcc/function.cc:5161
1050 0x7dabf1 execute
1051 /tmp/leahb/spack-stage/spack-stage-gcc-12.2.0-7cu3qahzhsxpauy4jlnsbcqmlbkxbbbo/spack-src/gcc/cfgexpand.cc:6690
1052 Please submit a full bug report, with preprocessed source (by using -freport-bug).
1053 Please include the complete backtrace with any bug report.
1054 See <https://github.com/spack/spack/issues> for instructions.
>> 1055 make[3]: *** [Makefile:944: stream_gribapi.lo] Error 1
1056 make[3]: *** Waiting for unfinished jobs....
1057 mv -f .deps/gribapi_utilities.Tpo .deps/gribapi_utilities.Plo
1058 make[3]: Leaving directory '/work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.5.1/cache/build_stage/spack-stage-cdo-2.0.5-hxibemg5qlkizhtzsez2apwjtmtjyvfk/spack-src/libcdi/src'
>> 1059 make[2]: *** [Makefile:713: all] Error 2
1060 make[2]: Leaving directory '/work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.5.1/cache/build_stage/spack-stage-cdo-2.0.5-hxibemg5qlkizhtzsez2apwjtmtjyvfk/spack-src/libcdi/src'
>> 1061 make[1]: *** [Makefile:535: all-recursive] Error 1
1062 make[1]: Leaving directory '/work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.5.1/cache/build_stage/spack-stage-cdo-2.0.5-hxibemg5qlkizhtzsez2apwjtmtjyvfk/spack-src/libcdi'
>> 1063 make: *** [Makefile:492: all-recursive] Error 1
@junwang-noaa Do I need to use a branch for testing with esmf 8.5.0, mapl 2.40.3, fms 2023.02.01 or will develop work, do you know?
I built ufs-weather-model develop on Hercules against the new spack-stack@1.5.1 with gcc@12.2.0, openmpi@4.1.4, fms@2023.02.01, esmf@8.5.0, mapl@2.40.3. I then tried to run cpld_control_p8 and it segfaulted. The modified ufs-weather-model code is in /work2/noaa/jcsda/dheinzel/ufs-wm-151 and the run directory is /work2/noaa/stmp/dheinzel/stmp/dheinzel/FV3_RT/rt_2555689/cpld_control_p8_gnu. The rt.sh command was:
./rt.sh -n cpld_control_p8 gnu -e -a gsd-hpcs -c -k 2>&1 | tee log.rt_cpld_control_p8
It failed in MOM. The error is:
160:
160: Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
160:
160: Backtrace for this error:
160: #0 0x14b26e688d8f in ???
160: #1 0x14b26e6fe64d in ???
160: #2 0x3d2fb69 in ???
160: #3 0x3ba012d in ???
160: #4 0x3334fa2 in __mom_io_infra_MOD_read_field_2d
160: at /work2/noaa/jcsda/dheinzel/ufs-wm-151/MOM6-interface/MOM6/config_src/infra/FMS2/MOM_io_infra.F90:905
160: #5 0x303950d in __mom_io_MOD_mom_read_data_2d
160: at /work2/noaa/jcsda/dheinzel/ufs-wm-151/MOM6-interface/MOM6/src/framework/MOM_io.F90:2172
Line 905 in MOM_io_infra.F90 is:
905 call fms2_read_data(fileobj, var_to_read, data)
Commit Queue Requirements:
[ ] Commit 'test_changes.list' from previous step
Description:
Starting with version 10, gfortran treats mismatches between actual and dummy argument lists as errors. This error can be turned into a warning (and silenced) by using the '-fallow-argument-mismatch' flag. In this PR all such mismatches are fixed instead. Most of the errors are resolved by using the mpi_f08 module, which provides generic interfaces.
WARNING: We are currently using the mpich MPI library with the gnu compilers on Hera and SGI MPT on Cheyenne. The mpi_f08 module in mpich, when compiled with the current versions of the gnu compilers, has some issues, and MPT does not provide an mpi_f08 module at all. This means this PR will require us to switch to OpenMPI, which will require hpc-stack to be rebuilt on these two platforms. Do we want to do that? Do we want to make a (working) mpi_f08 module a requirement for ufs-weather-model?
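As a minimal sketch (not the actual mpi_wrapper.F90 code; subroutine names only echo those in the linker log above): with the old 'use mpi' module, MPI_Bcast has an F77-style implicit interface, so calling it once with a real array and once with an integer scalar is exactly the actual/dummy argument mismatch that gfortran >= 10 rejects. The mpi_f08 module declares choice buffers as TYPE(*), DIMENSION(..), so any type, kind, and rank is accepted and no mismatch is reported:

```fortran
! Illustrative only: how mpi_f08 generic interfaces avoid the gfortran >= 10
! argument-mismatch error that the F77-style 'use mpi' interfaces trigger.
module bcast_sketch
  use mpi_f08
  implicit none
contains
  subroutine mp_bcst_r4(field, root, comm)
    real,           intent(inout) :: field(:)
    integer,        intent(in)    :: root
    type(MPI_Comm), intent(in)    :: comm
    integer :: ierr
    ! Choice buffer is TYPE(*), DIMENSION(..) in mpi_f08: a real array is fine.
    call MPI_Bcast(field, size(field), MPI_REAL, root, comm, ierr)
  end subroutine mp_bcst_r4

  subroutine mp_bcst_i(ival, root, comm)
    integer,        intent(inout) :: ival
    integer,        intent(in)    :: root
    type(MPI_Comm), intent(in)    :: comm
    integer :: ierr
    ! An integer scalar through the same generic interface: no mismatch error.
    call MPI_Bcast(ival, 1, MPI_INTEGER, root, comm, ierr)
  end subroutine mp_bcst_i
end module bcast_sketch
```

Note also that mpi_f08 uses derived types such as type(MPI_Comm) instead of plain integers, which is what makes the switch an interface change rather than a drop-in replacement.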