Open junwang-noaa opened 5 months ago
@Hang-Lei-NOAA is 8.6.1 on Acorn/WCOSS2 already?
@Brian Curtis This esmf and mapl have been added on Acorn; please test /lfs/h1/emc/nceplibs/noscrub/hpc-stack/libs/hpc-stack/modulefiles/mpi/intel/19.1.3.304/cray-mpich/8.1.9/esmf/8.6.1
do you want this installed in spack-stack/1.6.0? and on which machine?
@jkbk2004 @junwang-noaa i installed a chained env based on 1.6.0 but with esmf/8.6.1 and mapl/2.46.2 here: /work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.6.0/envs/ue-esmf-8.6.1-mapl-2.46.2/install/modulefiles/Core. it is intel only for now. please give it a try and let us know how it works with the ufs-wm.
@ulmononian Thanks, those libraries need to be tested in ufs-weather-model
@jkbk2004 @junwang-noaa: thanks to @climbfuji, the gcc/mvapich2 stack w/ mapl/2.46.2 and esmf/8.6.1 is now available in the same chained env i mentioned in my previous comment, i.e.: /work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.6.0/envs/ue-esmf-8.6.1-mapl-2.46.2/install/modulefiles/Core (compiler/mpi the same as in the ufs hercules gnu modulefile).
@ulmononian @RatkoVasic-NOAA @junwang-noaa
CMake Error at /work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.6.0/envs/ue-esmf-8.6.1-mapl-2.46.2/install/intel/2021.9.0/mapl-2.46.2-uiwt3at/lib64/cmake/MAPL/MAPL-targets.cmake:73 (set_target_properties):
The link interface of target "MAPL_cfio_r4" contains:
ESMF::ESMF
but the target was not found. Possible reasons include:
* There is a typo in the target name.
* A find_package call is missing for an IMPORTED target.
* An ALIAS target is missing.
With ESMF 8.6.1 and MAPL 2.46.2 the UFSWM fails to compile for: s2swa_32bit_intel s2swa_32bit_pdlib_intel s2swa_32bit_pdlib_sfs_intel s2swa_32bit_pdlib_debug_intel s2swa_intel s2swa_faster_intel atmaero_intel
@Hang-Lei-NOAA Can you please install esmf 8.6.1 and mapl 2.46.3 on acorn using spack-stack-1.6.0
@Dusan Jovic we will get this coordinated and done.
The MAPL team released MAPL v2.40.3.1, which works with ESMF 8.6.1 in the UFS weather model and has a fix for issue #2162. I think we can move on to updating ESMF to v8.6.1 in the UFS weather model. Additional work needs to be done to update MAPL and GOCART to the latest versions.
@JacobCarley-NOAA @edwardhartnett @AlexanderRichert-NOAA @Hang-Lei-NOAA @RatkoVasic-NOAA FYI.
Great, I will add this to Dogwood for testing.
@junwang-noaa @DusanJovic-NOAA @lipan-NOAA The new set has been installed on Dogwood for testing:
module load PrgEnv-intel
module load intel
module load craype
module load cray-mpich
module use /lfs/h2/emc/eib/save/Hang.Lei/forgdit/nco_wcoss2/install/modulefiles/mpi/intel/19.1.3.304/cray-mpich/8.1.9
module load hdf5/1.14.0
module load netcdf/4.9.2
module load pnetcdf/1.12.2
module load pio/2.5.10
module load fms/2024.01
module load esmf/8.6.1
module load mapl/2.40.3.1-esmf-8.6.1
@junwang-noaa @jkbk2004 just want to make sure i understand: is the proposed path using mapl/2.40.3.1 w/ esmf/8.6.1 in spack-stack/1.6.0?
I'm now installing ue-esmf-8.6.1-mapl-2.40.3.1 on Hercules and Hera...
@junwang-noaa @DusanJovic-NOAA you can start testing.
- Hercules (Intel and GNU): /work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.6.0/envs/ue-esmf-8.6.1-mapl-2.40.3.1/install/modulefiles/Core/ (load stack-gcc/12.2.0 or stack-intel/2021.9.0)
- Hera (Intel): /scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.6.0/envs/ue-esmf-8.6.1-mapl-2.40.3.1/install/modulefiles/Core/ (load stack-intel/2021.5.0)
- Hera (GNU): /scratch2/NCEPDEV/stmp1/role.epic/spack-stack/spack-stack-1.6.0_gnu13/envs/esmf-8.6.1-mapl-2.40.3.1-addon/install/modulefiles/Core (load stack-gcc/13.3.0)
- Orion (Intel): /work/noaa/epic/role-epic/spack-stack/orion/spack-stack-1.6.0/envs/esmf-8.6.1-mapl-2.40.3.1-addon/install/modulefiles/Core/ (load stack-intel/2021.9.0)
If all OK, I can start porting to other machines (Jet, Gaea5/6, Orion, Derecho and NOAA cloud x3)
On Hercules regression test passed. However, on Hera, I get this error:
$ module load ufs_hera.intel
Lmod has detected the following error: These module(s) or extension(s) exist but cannot be loaded as requested: "mapl/2.40.3.1-esmf-8.6.1"
Try: "module spider mapl/2.40.3.1-esmf-8.6.1" to see how to load the module(s).
Same thing on Orion, 'mapl/2.40.3.1-esmf-8.6.1' is missing.
@DusanJovic-NOAA send me your modulefiles on Hera, I'll check it out.
$ cat /scratch2/NCEPDEV/fv3-cam/Dusan.Jovic/ufs/remove_findesmf/ufs-weather-model/modulefiles/ufs_hera.intel.lua
help([[
loads UFS Model prerequisites for Hera/Intel
]])
prepend_path("MODULEPATH", "/scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.6.0/envs/ue-esmf-8.6.1-mapl-2.40.3.1/install/modulefiles/Core")
stack_intel_ver=os.getenv("stack_intel_ver") or "2021.5.0"
load(pathJoin("stack-intel", stack_intel_ver))
stack_impi_ver=os.getenv("stack_impi_ver") or "2021.5.1"
load(pathJoin("stack-intel-oneapi-mpi", stack_impi_ver))
cmake_ver=os.getenv("cmake_ver") or "3.23.1"
load(pathJoin("cmake", cmake_ver))
load("ufs_common")
nccmp_ver=os.getenv("nccmp_ver") or "1.9.0.1"
load(pathJoin("nccmp", nccmp_ver))
setenv("CC", "mpiicc")
setenv("CXX", "mpiicpc")
setenv("FC", "mpiifort")
setenv("CMAKE_Platform", "hera.intel")
whatis("Description: UFS build environment")
@DusanJovic-NOAA for some reason the modulefile name didn't get the suffix from the esmf version. For now, can you try loading just mapl/2.40.3.1 in ufs_common.lua:
-- {["mapl"] = "2.40.3.1-esmf-8.6.1"},
{["mapl"] = "2.40.3.1"},
We can look later at why the suffix is missing from the modulefile name.
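To see which mapl modulefile names an install tree actually provides, one option is to list them directly rather than going through `module load`. This is only a rough sketch: the function names `list_mapl_modules` and `has_mapl_module` are hypothetical, and it assumes the modulefiles are `.lua` files sitting in a directory named `mapl` somewhere under the root you pass in.

```shell
# List the version names of all mapl modulefiles found under a modulefiles tree.
# Usage: list_mapl_modules <modulefiles-root>
list_mapl_modules() {
  local root="$1"
  # Assumption: a mapl modulefile is any .lua file whose parent directory is "mapl".
  find "$root" -type f -path '*/mapl/*.lua' | while IFS= read -r f; do
    basename "$f" .lua
  done | sort
}

# Succeed only if the exact version name (e.g. the one ufs_common.lua asks for) exists.
# Usage: has_mapl_module <modulefiles-root> <version-name>
has_mapl_module() {
  list_mapl_modules "$1" | grep -Fxq -- "$2"
}
```

For example, `has_mapl_module <modulefiles-root> 2.40.3.1-esmf-8.6.1 || echo "suffixed name not found"` would flag the mismatch without depending on how Lmod words the error.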
I'm pretty sure the MAPL entry in modules.yaml has to be updated for each ESMF version (sorry if I forgot to update...).
I see @AlexanderRichert-NOAA it should be here:
mapl:
  suffixes:
    # Keeping this as a reminder how to do snapshots
    #^esmf@8.3.0b09~debug snapshot=b09: 'esmf-8.3.0b09'
    #^esmf@8.3.0b09+debug snapshot=b09: 'esmf-8.3.0b09-debug'
    ^esmf@8.4.2~debug snapshot=none: 'esmf-8.4.2'
    ^esmf@8.4.2+debug snapshot=none: 'esmf-8.4.2-debug'
    ^esmf@8.5.0~debug snapshot=none: 'esmf-8.5.0'
    ^esmf@8.5.0+debug snapshot=none: 'esmf-8.5.0-debug'
    ^esmf@8.6.0~debug snapshot=none: 'esmf-8.6.0'
    ^esmf@8.6.0+debug snapshot=none: 'esmf-8.6.0-debug'
I'll fix it later, after @DusanJovic-NOAA tests it.
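The suffix list quoted above stops at esmf@8.6.0, which would be consistent with the 8.6.1 build getting no `-esmf-8.6.1` suffix. Before building a new env, a check along these lines could catch a missing entry. It is a sketch only: `has_esmf_suffix` is a hypothetical helper, it looks at the `~debug` variant alone, and it does a plain text search instead of parsing the YAML.

```shell
# Check whether a modules.yaml has an active (uncommented) mapl suffix entry
# for a given ESMF version.
# Usage: has_esmf_suffix <modules.yaml> <esmf-version>
has_esmf_suffix() {
  local file="$1" ver="$2"
  # Drop commented lines, then search for the literal spec prefix,
  # e.g. "^esmf@8.6.1~debug" (-F keeps "^" and "." literal).
  grep -v '^[[:space:]]*#' "$file" | grep -Fq -- "^esmf@${ver}~debug"
}
```

Run against the fragment above, `has_esmf_suffix modules.yaml 8.6.0` should succeed and `has_esmf_suffix modules.yaml 8.6.1` should fail.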
Just tested hang's library on WCOSS2 and it works fine for me @Hang-Lei-NOAA
Tests passed on Hera.
The regression test on Dogwood failed, with these tests failing at output comparison:
SYNOPSIS:
Starting Date/Time: 20241119 21:37:08
Ending Date/Time: 20241119 23:15:06
Total Time: 01h:38m:37s
Compiles Completed: 33/33
Tests Completed: 139/156
Failed Tests:
* TEST cpld_control_p8_mixedmode_intel: FAILED: UNABLE TO COMPLETE COMPARISON
-- LOG: /lfs/h2/emc/eib/noscrub/dusan.jovic/ufs/remove_findesmf/ufs-weather-model/tests/logs/log_wcoss2/run_cpld_control_p8_mixedmode_intel.log
* TEST cpld_control_p8_intel: FAILED: UNABLE TO COMPLETE COMPARISON
-- LOG: /lfs/h2/emc/eib/noscrub/dusan.jovic/ufs/remove_findesmf/ufs-weather-model/tests/logs/log_wcoss2/run_cpld_control_p8_intel.log
* TEST cpld_control_p8.v2.sfc_intel: FAILED: UNABLE TO COMPLETE COMPARISON
-- LOG: /lfs/h2/emc/eib/noscrub/dusan.jovic/ufs/remove_findesmf/ufs-weather-model/tests/logs/log_wcoss2/run_cpld_control_p8.v2.sfc_intel.log
* TEST cpld_restart_p8_intel: FAILED: UNABLE TO START TEST
-- LOG: N/A
* TEST cpld_control_qr_p8_intel: FAILED: UNABLE TO COMPLETE COMPARISON
-- LOG: /lfs/h2/emc/eib/noscrub/dusan.jovic/ufs/remove_findesmf/ufs-weather-model/tests/logs/log_wcoss2/run_cpld_control_qr_p8_intel.log
* TEST cpld_restart_qr_p8_intel: FAILED: UNABLE TO START TEST
-- LOG: N/A
* TEST cpld_2threads_p8_intel: FAILED: UNABLE TO COMPLETE COMPARISON
-- LOG: /lfs/h2/emc/eib/noscrub/dusan.jovic/ufs/remove_findesmf/ufs-weather-model/tests/logs/log_wcoss2/run_cpld_2threads_p8_intel.log
* TEST cpld_decomp_p8_intel: FAILED: UNABLE TO COMPLETE COMPARISON
-- LOG: /lfs/h2/emc/eib/noscrub/dusan.jovic/ufs/remove_findesmf/ufs-weather-model/tests/logs/log_wcoss2/run_cpld_decomp_p8_intel.log
* TEST cpld_mpi_p8_intel: FAILED: UNABLE TO COMPLETE COMPARISON
-- LOG: /lfs/h2/emc/eib/noscrub/dusan.jovic/ufs/remove_findesmf/ufs-weather-model/tests/logs/log_wcoss2/run_cpld_mpi_p8_intel.log
* TEST cpld_control_ciceC_p8_intel: FAILED: UNABLE TO COMPLETE COMPARISON
-- LOG: /lfs/h2/emc/eib/noscrub/dusan.jovic/ufs/remove_findesmf/ufs-weather-model/tests/logs/log_wcoss2/run_cpld_control_ciceC_p8_intel.log
* TEST cpld_bmark_p8_intel: FAILED: UNABLE TO COMPLETE COMPARISON
-- LOG: /lfs/h2/emc/eib/noscrub/dusan.jovic/ufs/remove_findesmf/ufs-weather-model/tests/logs/log_wcoss2/run_cpld_bmark_p8_intel.log
* TEST cpld_restart_bmark_p8_intel: FAILED: UNABLE TO START TEST
-- LOG: N/A
* TEST cpld_s2sa_p8_intel: FAILED: UNABLE TO COMPLETE COMPARISON
-- LOG: /lfs/h2/emc/eib/noscrub/dusan.jovic/ufs/remove_findesmf/ufs-weather-model/tests/logs/log_wcoss2/run_cpld_s2sa_p8_intel.log
* TEST cpld_control_p8_faster_intel: FAILED: UNABLE TO COMPLETE COMPARISON
-- LOG: /lfs/h2/emc/eib/noscrub/dusan.jovic/ufs/remove_findesmf/ufs-weather-model/tests/logs/log_wcoss2/run_cpld_control_p8_faster_intel.log
* TEST atmaero_control_p8_intel: FAILED: UNABLE TO COMPLETE COMPARISON
-- LOG: /lfs/h2/emc/eib/noscrub/dusan.jovic/ufs/remove_findesmf/ufs-weather-model/tests/logs/log_wcoss2/run_atmaero_control_p8_intel.log
* TEST atmaero_control_p8_rad_intel: FAILED: UNABLE TO COMPLETE COMPARISON
-- LOG: /lfs/h2/emc/eib/noscrub/dusan.jovic/ufs/remove_findesmf/ufs-weather-model/tests/logs/log_wcoss2/run_atmaero_control_p8_rad_intel.log
* TEST atmaero_control_p8_rad_micro_intel: FAILED: UNABLE TO COMPLETE COMPARISON
-- LOG: /lfs/h2/emc/eib/noscrub/dusan.jovic/ufs/remove_findesmf/ufs-weather-model/tests/logs/log_wcoss2/run_atmaero_control_p8_rad_micro_intel.log
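The synopsis above lists 17 failures: 14 with UNABLE TO COMPLETE COMPARISON and 3 restart tests with UNABLE TO START TEST. A small filter like the one below can pull that breakdown out of a saved synopsis. The function names `failed_tests` and `count_failed` are hypothetical, and the sketch assumes the failure lines keep the exact `* TEST <name>: FAILED: <REASON>` layout shown here.

```shell
# Print "<test name>|<reason>" for every failed test in a synopsis read from stdin.
failed_tests() {
  # Fields split on ": " are: "* TEST <name>", "FAILED", "<REASON>".
  awk -F': ' '/^\* TEST .*: FAILED: / { sub(/^\* TEST /, "", $1); print $1 "|" $3 }'
}

# Count the failures that carry one specific reason string.
# Usage: count_failed "UNABLE TO START TEST" < synopsis.txt
count_failed() {
  failed_tests | awk -F'|' -v r="$1" '$2 == r { n++ } END { print n + 0 }'
}
```

Separating the two reasons makes it easier to tell tests that ran but differ from the baseline from tests that never started.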
@lipan-NOAA Did you run full tests on WCOSS2, or just a subset of tests?
@DusanJovic-NOAA On cpld_bmark_p8 only, my focus is whether the run completed successfully, not whether the results are the same as the baseline. Interestingly, the run is slower than before, and I had to increase the wall clock time to complete this test. I'm not sure if this is caused by Dogwood.
My test using Dusan's ufs gets the same result on these regression tests. In these cases some files are identical to the baseline, but some are not.
Tier 1 machines that have the esmf-8.6.1-mapl-2.40.3.1-addon addons installed:
In order to have the same chained-environment name across platforms, the names on Hera and Hercules were changed from ue-esmf-8.6.1-mapl-2.40.3.1 to esmf-8.6.1-mapl-2.40.3.1-addon.
(I'll still keep a copy under the old name for some time, so test scripts should keep working.)
On Gaea:
$ module load ufs_gaea.intel
Lmod is automatically replacing "intel/2023.2.0" with "intel-classic/2023.2.0".
Lmod has detected the following error: The following module(s) are unknown:
"mapl/2.40.3.1-esmf-8.6.1"
The module is not named mapl/2.40.3.1-esmf-8.6.1:
$ ls -l /ncrc/proj/epic/spack-stack/spack-stack-1.6.0/envs/esmf-8.6.1-mapl-2.40.3.1-addon/install/modulefiles/cray-mpich/8.1.28/intel/2023.2.0/mapl/
total 4
-rw-r--r-- 1 role.epic wpo 3889 Nov 20 12:22 2.40.3.1.lua
Fixed.
Description
ESMF 8.6.1 has been released. In this version, ESMF Config was enhanced to remove the 1024-character limit on a single line. This feature is needed to run long forecasts (SFS) with selected output times. The UFS WM needs to be tested and updated with this ESMF version.
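To find configuration files that would have run into the old single-line limit, a line-length scan may be enough. The helper below is a hypothetical sketch (`long_lines` is not part of ESMF or the UFS WM), and it assumes the limit applies to the raw character count of each line, which may not match exactly how ESMF Config counted.

```shell
# Print "<line number> <length>" for every line longer than a limit
# (default 1024 characters, the pre-8.6.1 ESMF Config limit named above).
# Usage: long_lines <file> [limit]
long_lines() {
  local file="$1" limit="${2:-1024}"
  awk -v limit="$limit" 'length($0) > limit { print NR, length($0) }' "$file"
}
```

An empty result means no line in the file exceeds the limit.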
Solution
Alternatives
Related to