Closed uturuncoglu closed 4 months ago
@junwang-noaa I created this issue to track the installation of a new ESMF tag, which is required for the external land component and for the next PR related to it. I am not expecting the next external land component PR soon, but it would be nice to start thinking about it since installing a new ESMF tag could take time. I think this will be handled by the EPIC team, but I am not sure. Let me know what you think.
@uturuncoglu Thanks for creating the issue. We are currently getting the ESMF 8.4.0 release version installed and used in the UFS WM, as the operational code freeze is coming soon and only release versions are accepted in operations. We can ask the EPIC team to install a test version, ESMF v8.5.0b10, on the R&D platforms for this external land component work.
@junwang-noaa Thanks. I think that once the operational code freeze has passed, the UFS model could start using beta snapshots again. Right?
I think so.
@junwang-noaa Is there any update on this? What about the operational code freeze; is it done? Once this is available, I am planning to replace the I/O layer in the land component.
No, not yet. We are waiting for HR1 testing before we create a tag. @jkbk2004 can your team install ESMF v8.5.0b10 library on hera? Thanks
@junwang-noaa Thanks for the update. I think NCAR's Cheyenne will be better since I have no access to Hera.
@uturuncoglu We can coordinate thru EPIC on cheyenne.
I will try to install 8.5.0b10 on Cheyenne over the weekend.
@jkbk2004 is there any progress on this? thanks.
@uturuncoglu Please give 8.5.0b10 a try; it is installed at /glade/work/epicufsrt/GMTB/tools/intel/2022.1/hpc-stack-v1.2.0_6eb6/modulefiles/stack. Last week was a busy one with program increment planning. Let me know.
@jkbk2004 Thanks for your help. I tried to compile the model with the new version of ESMF and I am getting the following error at the link step:
/usr/lib64/gcc/x86_64-suse-linux/4.8/../../../../x86_64-suse-linux/bin/ld: /glade/work/epicufsrt/GMTB/tools/intel/2022.1/hpc-stack-v1.2.0_6eb6/intel-2022.1/mpt-2.25/esmf/8.5.0b10/lib/libpioc.a(pioc.c.o): in function `PIOc_iosystem_is_active':
/glade/work/jongkim/stacks/hash/hpc-stack-6eb6/pkg/v8.5.0b10/src/Infrastructure/IO/PIO/ParallelIO/src/clib/pioc.c:97: multiple definition of `PIOc_iosystem_is_active'; /glade/work/epicufsrt/GMTB/tools/intel/2022.1/hpc-stack-v1.2.0_6eb6/intel-2022.1/mpt-2.25/pio/2.5.7/lib/libpioc.a(pioc.c.o):/glade/work/jongkim/stacks/hash/hpc-stack-6eb6/pkg/pio-2.5.7/src/clib/pioc.c:97: first defined here
I think ESMF 8.5.0b10 is using its own internal PIO library, and this conflicts with the external installation, maybe due to the version difference. Is it possible to install ESMF by pointing it to the external PIO, so that it does not cause a conflict? It seems that ESMF is using 2.5.10. You could set the following variables for it:
export ESMF_PIO="external"
export ESMF_PIO_LIBPATH=$PIO_LIBDIR
export ESMF_PIO_INCLUDE=$PIO_INCDIR
I wonder whether UFS has been tested with an ESMF version newer than 8.3.0b09 before.
@jkbk2004 Hi, I just want to check the current status of this installation. Thanks.
@uturuncoglu I will take a look. I will get back to you tomorrow.
@jkbk2004 Thank you. It is not super urgent, but it would be nice to have it soon since I am planning to add restructured I/O code to Noah-MP that leverages ESMF multi-tile support.
I tried ... but it sounds like an issue:
make chkdir_apps
make[5]: Entering directory '/glade/work/jongkim/stacks/hash/hpc-stack-6eb6/pkg/v8.5.0b10/src/apps/ESMF_PrintInfo'
make[5]: Leaving directory '/glade/work/jongkim/stacks/hash/hpc-stack-6eb6/pkg/v8.5.0b10/src/apps/ESMF_PrintInfo'
mpif90 -m64 -mcmodel=small -pthread -threads -cxxlib -Wl,--no-as-needed -qopenmp -L/glade/work/epicufsrt/GMTB/tools/intel/2022.1/hpc-stack-v1.2.0_6eb6/intel-2022.1/mpt-2.25/hdf5/1.10.6/lib -L/glade/work/epicufsrt/GMTB/tools/intel/2022.1/hpc-stack-v1.2.0_6eb6/intel-2022.1/zlib/1.2.11/lib -L/glade/work/epicufsrt/GMTB/tools/intel/2022.1/hpc-stack-v1.2.0_6eb6/intel-2022.1/mpt-2.25/esmf/8.5.0b10/lib -L/glade/work/epicufsrt/GMTB/tools/intel/2022.1/hpc-stack-v1.2.0_6eb6/intel-2022.1/mpt-2.25/netcdf/4.7.4/lib -L/glade/work/epicufsrt/GMTB/tools/intel/2022.1/hpc-stack-v1.2.0_6eb6/intel-2022.1/mpt-2.25/pio/2.5.7/lib -Wl,-rpath,/glade/work/epicufsrt/GMTB/tools/intel/2022.1/hpc-stack-v1.2.0_6eb6/intel-2022.1/mpt-2.25/esmf/8.5.0b10/lib -Wl,-rpath,/glade/work/epicufsrt/GMTB/tools/intel/2022.1/hpc-stack-v1.2.0_6eb6/intel-2022.1/mpt-2.25/netcdf/4.7.4/lib -Wl,-rpath,/glade/work/epicufsrt/GMTB/tools/intel/2022.1/hpc-stack-v1.2.0_6eb6/intel-2022.1/mpt-2.25/pio/2.5.7/lib -o /glade/work/epicufsrt/GMTB/tools/intel/2022.1/hpc-stack-v1.2.0_6eb6/intel-2022.1/mpt-2.25/esmf/8.5.0b10/bin/ESMF_PrintInfo /glade/work/jongkim/stacks/hash/hpc-stack-6eb6/pkg/v8.5.0b10/obj/objO/Linux.intel.64.mpt.default/src/apps/ESMF_PrintInfo/ESMF_PrintInfo.o -lesmf -lmpi++ -lrt -ldl -lnetcdff -lnetcdf -lhdf5_hl -lhdf5 -lz -ldl -lm -lpioc
/usr/lib64/gcc/x86_64-suse-linux/4.8/../../../../x86_64-suse-linux/bin/ld: /glade/work/epicufsrt/GMTB/tools/intel/2022.1/hpc-stack-v1.2.0_6eb6/intel-2022.1/mpt-2.25/esmf/8.5.0b10/lib/libesmf.so: undefined reference to `PIOc_InitDecomp_ReadOnly'
/glade/work/jongkim/stacks/hash/hpc-stack-6eb6/pkg/v8.5.0b10/build/common.mk:2583: recipe for target '/glade/work/epicufsrt/GMTB/tools/intel/2022.1/hpc-stack-v1.2.0_6eb6/intel-2022.1/mpt-2.25/esmf/8.5.0b10/bin/ESMF_PrintInfo' failed
make[4]: *** [/glade/work/epicufsrt/GMTB/tools/intel/2022.1/hpc-stack-v1.2.0_6eb6/intel-2022.1/mpt-2.25/esmf/8.5.0b10/bin/ESMF_PrintInfo] Error 1
I was using pio/2.5.7 installed already at /glade/work/epicufsrt/GMTB/tools/intel/2022.1/hpc-stack-v1.2.0_6eb6/modulefiles/stack
Yeah, we need pio-2.5.8 that has PIOc_InitDecomp_ReadOnly
let me try again with pio-2.5.8
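A version guard along the following lines could catch this kind of mismatch before the ESMF build starts. This is only a minimal sketch: the function name version_ge and the PIO_VERSION variable are hypothetical and not part of hpc-stack or ESMF, and the 2.5.8 minimum is taken from the comment above. It relies on the GNU coreutils `sort -V` option.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when version $1 is greater than or equal to $2.
# Uses GNU sort -V (version sort), so that 2.5.10 is ordered after 2.5.8.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# PIO_VERSION is an assumed variable name used only for illustration.
pio_ver="${PIO_VERSION:-2.5.7}"
if version_ge "$pio_ver" "2.5.8"; then
  echo "pio $pio_ver should provide PIOc_InitDecomp_ReadOnly"
else
  echo "pio $pio_ver is older than 2.5.8; the ESMF link step may fail"
fi
```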
@uturuncoglu It did go through with pio-2.5.8. Please give it one more try with the module path from https://github.com/ufs-community/ufs-weather-model/blob/develop/modulefiles/ufs_cheyenne.intel.lua
@jkbk2004 Thanks for your help. I was able to compile the model with pio 2.5.8 and esmf 8.5.0b10. I'll try to update my fork with the new I/O layer that uses ESMF multi-tile support to see what happens. I'll update you soon.
@jkbk2004 I confirm that it is working without any issue. BTW, do we also have a GNU version on Cheyenne? It would be nice to test the new I/O implementation under GNU too, to catch any possible issues.
@uturuncoglu In CMEPS, there is a routine in med.F90 called med_grid_write, which is limited right now to tileCount=1. Will the new I/O features allow tileCount>1 in this routine?
@uturuncoglu Sure! I will install them for GNU as well. I will keep you posted; maybe sometime this afternoon.
@DeniseWorthen I think we could try to remove that restriction with the recent update on the ESMF side. I am currently working on restructuring the I/O layer in the Noah-MP component model. Once I have finalized that, I could try to test it on CMEPS.
@uturuncoglu We migrated the Cheyenne hpc-stack locations yesterday. The old ones are still available. I want to follow up again with the new locations. @natalie-perlin Can you install esmf-8.5.0b10 on Cheyenne? It needs pio-2.5.8 (read the conversation above). Please give it priority. The installation itself goes through quickly.
@jkbk2004 @natalie-perlin You mean the module locations have changed? BTW, the latest tag is v8.5.0b14, which also has a couple of fixes related to I/O, but I think it requires pio-2.5.10. Anyway, we could also stick with esmf-8.5.0b10 and pio-2.5.8 for both Intel and GNU.
@uturuncoglu Yeah, we made the location changes on the weather model develop branch yesterday, but you can stay with the old one. Let me install esmf-8.5.0b10 and pio-2.5.8 for GNU in the old location now. I will let you know in an hour or so.
@uturuncoglu Please give GNU a try at /glade/work/epicufsrt/GMTB/tools/gnu/10.1.0/hpc-stack-v1.2.0/modulefiles/stack. I installed esmf-8.5.0b10 there.
@jkbk2004 okay. thanks for your help.
@jkbk2004 @uturuncoglu - Currently installing pio-2.5.8 and esmf-8.5.0b10 in the standard (updated yesterday) locations on Cheyenne, for intel/2022.1 and gnu/10.1.
@jkbk2004 @uturuncoglu - Done for Cheyenne: installed pio-2.5.8 and esmf-8.5.0b10 in /glade/work/epicufsrt/contrib/hpc-stack/gnu10.1.0/ and /glade/work/epicufsrt/contrib/hpc-stack/intel2022.1/
@natalie-perlin Thank you very much. I'll try gnu later today.
@natalie-perlin @jkbk2004 It turns out that there is a memory corruption bug in esmf-8.5.0b10, so it would be nice to have esmf-8.5.0b17 with pio-2.5.10 on Cheyenne. Would it be possible to install it? Thanks.
BTW, I am also getting an error from the previous GNU installation, as follows:
Lmod is automatically replacing "intel/19.1.1" with "gnu/10.1.0".
Lmod has detected the following error: The following module(s) are unknown:
"hpc-mpt/2.22"
Please check the spelling or version number. Also try "module spider ..."
It is also possible your cache file is out-of-date; it may help to try:
$ module --ignore_cache load "hpc-mpt/2.22"
Also make sure that all modulefiles written in TCL start with the string
#%Module
Executing this command requires loading "hpc-mpt/2.22" which failed while
processing the following module(s):
Module fullname Module Filename
--------------- ---------------
ufs_cheyenne.gnu /glade/work/turuncu/NOAHMP/ufs-weather-model_dev/modulefiles/ufs_cheyenne.gnu.lua
Executing this command requires loading "pio/2.5.8" which failed while
processing the following module(s):
Module fullname Module Filename
--------------- ---------------
ufs_common /glade/work/turuncu/NOAHMP/ufs-weather-model_dev/modulefiles/ufs_common.lua
ufs_cheyenne.gnu /glade/work/turuncu/NOAHMP/ufs-weather-model_dev/modulefiles/ufs_cheyenne.gnu.lua
@uturuncoglu - sure, will take care of esmf-8.5.0b17 with pio-2.5.10.
I have some idea why there could be gnu/10.1.0 complaints. When do you get these error reports?
The default for cheyenne is the intel/19.x.x compiler. After the Lmod initialization, all the default modules are loaded, and then replaced by those needed for a particular stack.
The GNU-related error appears when I try to build UFS.
@natalie-perlin BTW, thanks for your help.
@uturuncoglu - I was able to compile the UFS-WM on Cheyenne with gnu/10.1.0 with no issues. These were my steps:
git clone https://github.com/ufs-community/ufs-weather-model.git ufs-wm-dev-gnu10.1
cd ufs-wm-dev-gnu10.1
git submodule update --init --recursive
module use modulefiles
module load ufs_cheyenne.gnu
export CMAKE_FLAGS="-DAPP=ATM -DCCPP_SUITES=FV3_GFS_v16"
export BUILD_VERBOSE=1
./build.sh 2>&1 | tee build.log1
You could view a log file on Cheyenne in /glade/scratch/nperlin/UFS-WM/ufs-wm-dev-gnu10.1/build.log1
This hpc-stack, /glade/work/epicufsrt/contrib/hpc-stack/gnu10.1.0/modulefiles/stack, is built with mpt/2.25. The error that shows up in your snippet refers to mpt/2.22. Was it maybe from some earlier build? When did you receive this error?
@uturuncoglu - Could it be because modulefiles/ufs_cheyenne.gnu.lua has not been updated? In that modulefile, one line that needs updating is the location of the stack, and another line needs to load the hpc-mpt/2.25 module instead of hpc-mpt/2.22.
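One quick way to see which stack location and MPT version a local modulefile still points at is to grep for them. The snippet below is only a sketch: the function name show_stack_lines is hypothetical, and the search pattern is an assumption about how those two lines are spelled in the modulefile.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (with line numbers) the lines of a modulefile
# that mention the hpc-stack path or the hpc-mpt module or version variable.
show_stack_lines() {
  grep -nE 'hpc-stack|hpc[-_]mpt' "$1"
}

# Run from the top of a ufs-weather-model checkout; skipped if the file is absent.
modfile="modulefiles/ufs_cheyenne.gnu.lua"
if [ -f "$modfile" ]; then
  show_stack_lines "$modfile"
fi
```

If the output still shows 2.22 or the old stack path, the checkout probably predates the location change mentioned above.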
@natalie-perlin That could be the reason. My UFS version is not the most recent, but I could try to sync again and test.
@natalie-perlin I moved to mpt/2.25 for GNU and now I am getting the following error:
Lmod is automatically replacing "intel/19.1.1" with "gnu/10.1.0".
Lmod has detected the following error: Cannot load module
"mapl/2.22.0-esmf-8.3.0b09". At least one of these module(s) must be loaded:
esmf/8.3.0b09 esmf/8.3.0b09-debug
While processing the following module(s):
Module fullname Module Filename
--------------- ---------------
mapl/2.22.0-esmf-8.3.0b09 /glade/work/epicufsrt/contrib/hpc-stack/gnu10.1.0/modulefiles/mpi/gnu/10.1.0/mpt/2.25/mapl/2.22.0-esmf-8.3.0b09.lua
ufs_common /glade/work/turuncu/NOAHMP/ufs-weather-model_dev/modulefiles/ufs_common.lua
ufs_cheyenne.gnu /glade/work/turuncu/NOAHMP/ufs-weather-model_dev/modulefiles/ufs_cheyenne.gnu.lua
I think MAPL also depends on the ESMF version used.
@natalie-perlin I am not sure how Intel is working with
mapl_ver=os.getenv("mapl_ver") or "2.22.0-esmf-8.3.0b09"
load(pathJoin("mapl", mapl_ver))
entry in modulefiles/ufs_common.lua.
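The Lua entry above reads mapl_ver from the environment and falls back to the hard-coded default when the variable is unset. The shell sketch below mirrors that lookup so it is easy to see which MAPL version a given session would select. The function name resolve_mapl_ver is hypothetical, and whether exporting mapl_ver before `module load` is the intended workflow is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring: os.getenv("mapl_ver") or "2.22.0-esmf-8.3.0b09"
# ${var-default} falls back only when the variable is unset, which matches Lua,
# where an empty string returned by os.getenv is still truthy.
resolve_mapl_ver() {
  printf '%s\n' "${mapl_ver-2.22.0-esmf-8.3.0b09}"
}

echo "default lookup:  $(resolve_mapl_ver)"
export mapl_ver="2.22-esmf-8.5.0b17"
echo "with override:   $(resolve_mapl_ver)"
```

With the default in place, a stack that only provides esmf/8.5.0b17 cannot satisfy the mapl/2.22.0-esmf-8.3.0b09 dependency shown in the error above.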
@uturuncoglu - looking into installation of esmf-8.5.0b17 and pio-2.5.10. Is there a requirement of the higher hdf5 and netcdf version?
@uturuncoglu
pio/2.5.10 + esmf/8.5.0b17 + mapl/2.22-esmf-8.5.0b17 are ready on Cheyenne for gnu/10.1.0 stack, in
/glade/work/epicufsrt/contrib/hpc-stack/gnu10.1.0
We have hdf5/1.14.0 and netcdf-c/4.9.1 + netcdf-fortran/4.6.0 installed successfully in other locations and on Hera. Let me know if you need esmf and mapl built with these higher hdf5 + netcdf versions. (Just keep in mind that I would then need to clear the current installations of esmf/8.5.0b17 + mapl/2.22-esmf-8.5.0b17, which are built with hdf5/1.10.6 and netcdf/4.7.4.)
@natalie-perlin Sorry for the late response. I was sick the whole week and I am slowly starting to work again. I'll test the GNU esmf-8.5.0b17. I don't need newer versions of those libraries. If I could also have the Intel version, that would be great and sufficient for me. Thanks again for your kind help.
@uturuncoglu - Installed for hpc-stack with intel/2022.1 on Cheyenne:
pio/2.5.10 + esmf/8.5.0b17 + mapl/2.22-esmf-8.5.0b17
in /glade/work/epicufsrt/contrib/hpc-stack/intel2022.1/
Load with:
module use /glade/work/epicufsrt/contrib/hpc-stack/intel2022.1/modulefiles/stack
module load hpc/1.2.0
@natalie-perlin Thanks for your help. I am getting the following error from the mapl module:
Lmod has detected the following error: The following module(s) are unknown: "mapl/2.22-esmf-8.5.0b17"
Please check the spelling or version number. Also try "module spider ..."
It is also possible your cache file is out-of-date; it may help to try:
$ module --ignore_cache load "mapl/2.22-esmf-8.5.0b17"
Also make sure that all modulefiles written in TCL start with the string #%Module
Executing this command requires loading "mapl/2.22-esmf-8.5.0b17" which failed while processing the following module(s):
Module fullname Module Filename
--------------- ---------------
ufs_common /glade/work/turuncu/NOAHMP/ufs-weather-model_dev/modulefiles/ufs_common.lua
ufs_cheyenne.intel /glade/work/turuncu/NOAHMP/ufs-weather-model_dev/modulefiles/ufs_cheyenne.intel.lua
@natalie-perlin same also for GNU
Lmod has detected the following error: The following module(s) are unknown: "hpc-mpt/2.22" "mapl/2.22-esmf-8.5.0b17"
Please check the spelling or version number. Also try "module spider ..."
It is also possible your cache file is out-of-date; it may help to try:
$ module --ignore_cache load "hpc-mpt/2.22" "mapl/2.22-esmf-8.5.0b17"
Also make sure that all modulefiles written in TCL start with the string #%Module
Executing this command requires loading "hpc-mpt/2.22" which failed while processing the following module(s):
Module fullname Module Filename
--------------- ---------------
ufs_cheyenne.gnu /glade/work/turuncu/NOAHMP/ufs-weather-model_dev/modulefiles/ufs_cheyenne.gnu.lua
Executing this command requires loading "mapl/2.22-esmf-8.5.0b17" which failed while processing the following module(s):
Module fullname Module Filename
--------------- ---------------
ufs_common /glade/work/turuncu/NOAHMP/ufs-weather-model_dev/modulefiles/ufs_common.lua
ufs_cheyenne.gnu /glade/work/turuncu/NOAHMP/ufs-weather-model_dev/modulefiles/ufs_cheyenne.gnu.lua
Any idea? Thanks.
@natalie-perlin we are using hpc_mpt_ver=os.getenv("hpc_mpt_ver") or "2.25"
Why does the error complain about mpt 22 ?
Description
We are transitioning from FMS to ESMF for multi-tile file access (read/write) in the new external land component (NOAHMP), and ESMF tag v8.5.0b10 contains all the development for multi-tile file I/O through PIO.
Solution
Install v8.5.0b10 on supported platforms and update UFS to use this version.
Alternatives
N/A
Related to
Directly reference any issues or PRs in this or other repositories that this is related to, and describe how they are related. N/A