ufs-community / ufs-weather-model

UFS Weather Model

Test ESMF 8.6.1 beta in UFS weather model #2230

Closed junwang-noaa closed 2 weeks ago

junwang-noaa commented 1 month ago

Description

The ESMF 8.6.1 beta includes a fix for issue #1121: the 1024-character limit has been removed. The ESMF team is asking for confirmation that this fix resolves the issue reported in the UFS weather model.

Solution

1) Install a test version of ESMF 8.6.1 beta (https://github.com/esmf-org/esmf/releases/tag/v8.6.1b03) on Hera.
2) Run an atmosphere-only test (e.g. control_p8) for a longer forecast time, with the output times specified in output_fh.

Alternatives

Related to

junwang-noaa commented 1 month ago

@jkbk2004 May I ask if the EPIC team can install the test library on Hera? Thanks.

jkbk2004 commented 1 month ago

@RatkoVasic-NOAA can you install https://github.com/esmf-org/esmf/releases/tag/v8.6.1b04 in the spack-stack 1.6.0 location? Hercules or Hera might be a good starting point for this beta test.

RatkoVasic-NOAA commented 1 month ago

@jkbk2004 the new version of esmf is not in spack yet:

    # generate chksum with 'spack checksum esmf@x.y.z'
    version("8.6.0", sha256="ed057eaddb158a3cce2afc0712b49353b7038b45b29aee86180f381457c0ebe7")
    version("8.5.0", sha256="acd0b2641587007cc3ca318427f47b9cae5bfd2da8d2a16ea778f637107c29c4")
    version("8.4.2", sha256="969304efa518c7859567fa6e65efd960df2b4f6d72dbf2c3f29e39e4ab5ae594")

It cannot be installed as part of spack-stack.

uturuncoglu commented 1 month ago

@RatkoVasic-NOAA I think you could still install it with the esmf@=8.6.1b04 syntax on the spack side, even if it is not in package.py.

uturuncoglu commented 4 weeks ago

@RatkoVasic-NOAA @junwang-noaa @climbfuji If you need any help from our side with installing the beta snapshot, just let us know.

RatkoVasic-NOAA commented 4 weeks ago

@uturuncoglu thanks! I installed esmf-8.6.1b04 on Hercules under spack-stack-1.6.0 without problems. Now we have to install MAPL with the new ESMF version.

uturuncoglu commented 4 weeks ago

@RatkoVasic-NOAA That is great. Thanks for the update.

DusanJovic-NOAA commented 4 weeks ago

I'm testing esmf 8.6.1b04 on Hercules. cpld_control_p8 fails with this error:

 57: pe=00057 FAIL at line=00510    MAPL_CapGridComp.F90                     <status=41>
121: pe=00121 FAIL at line=00510    MAPL_CapGridComp.F90                     <status=41>
  8: pe=00008 FAIL at line=00510    MAPL_CapGridComp.F90                     <status=41>
  8: pe=00008 FAIL at line=00956    MAPL_CapGridComp.F90                     <status=41>
139: pe=00139 FAIL at line=00510    MAPL_CapGridComp.F90                     <status=41>
139: pe=00139 FAIL at line=00956    MAPL_CapGridComp.F90                     <status=41>
 48: pe=00048 FAIL at line=00510    MAPL_CapGridComp.F90                     <status=41>
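
With many ranks reporting at once, it can help to collapse a log like the one above into the distinct failure sites. The following is a minimal sketch: the regular expression is inferred from the lines quoted in this comment and is an assumption, not a documented MAPL log format, and `summarize_failures` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Pattern inferred from the quoted log lines (an assumption, not a MAPL spec).
FAIL_RE = re.compile(
    r"pe=(?P<pe>\d+)\s+FAIL at line=(?P<line>\d+)\s+(?P<file>\S+)\s+<status=(?P<status>\d+)>"
)


def summarize_failures(log_text):
    """Map (file, line, status) to the sorted list of PE ranks that reported it."""
    sites = defaultdict(set)
    for match in FAIL_RE.finditer(log_text):
        key = (match.group("file"), int(match.group("line")), int(match.group("status")))
        sites[key].add(int(match.group("pe")))
    return {key: sorted(pes) for key, pes in sites.items()}


# The log lines quoted in this comment.
LOG = """\
 57: pe=00057 FAIL at line=00510    MAPL_CapGridComp.F90                     <status=41>
121: pe=00121 FAIL at line=00510    MAPL_CapGridComp.F90                     <status=41>
  8: pe=00008 FAIL at line=00510    MAPL_CapGridComp.F90                     <status=41>
  8: pe=00008 FAIL at line=00956    MAPL_CapGridComp.F90                     <status=41>
139: pe=00139 FAIL at line=00510    MAPL_CapGridComp.F90                     <status=41>
139: pe=00139 FAIL at line=00956    MAPL_CapGridComp.F90                     <status=41>
 48: pe=00048 FAIL at line=00510    MAPL_CapGridComp.F90                     <status=41>
"""

for (fname, line, status), pes in sorted(summarize_failures(LOG).items()):
    print(f"{fname}:{line} status={status} reported by PEs {pes}")
```

For this excerpt, all entries reduce to two sites in MAPL_CapGridComp.F90 (lines 510 and 956), both with status 41.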

DusanJovic-NOAA commented 4 weeks ago

These are the changes I made to the current develop branch:

$ git diff 
diff --git a/modulefiles/ufs_common.lua b/modulefiles/ufs_common.lua
index 1f395d97..f05ff8d4 100644
--- a/modulefiles/ufs_common.lua
+++ b/modulefiles/ufs_common.lua
@@ -10,17 +10,17 @@ local ufs_modules = {
   {["netcdf-c"]        = "4.9.2"},
   {["netcdf-fortran"]  = "4.6.0"},
   {["parallelio"]      = "2.5.10"},
-  {["esmf"]            = "8.5.0"},
+  {["esmf"]            = "8.6.1bs4"},
   {["fms"]             = "2023.02.01"},
   {["bacio"]           = "2.4.1"},
   {["crtm"]            = "2.4.0"},
   {["g2"]              = "3.4.5"},
   {["g2tmpl"]          = "1.10.2"},
   {["ip"]              = "4.3.0"},
-  {["sp"]              = "2.3.3"},
+  {["sp"]              = "2.5.0"},
   {["w3emc"]           = "2.10.0"},
   {["gftl-shared"]     = "1.6.1"},
-  {["mapl"]            = "2.40.3-esmf-8.5.0"},
+  {["mapl"]            = "2.40.3-esmf-8.6.1b04"},
   {["scotch"]          = "7.0.4"},
 }

diff --git a/modulefiles/ufs_hercules.intel.lua b/modulefiles/ufs_hercules.intel.lua
index 605fe579..63cfaa98 100644
--- a/modulefiles/ufs_hercules.intel.lua
+++ b/modulefiles/ufs_hercules.intel.lua
@@ -2,7 +2,7 @@ help([[
 loads UFS Model prerequisites for Hercules/Intel
 ]])

-prepend_path("MODULEPATH", "/work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.5.1/envs/unified-env/install/modulefiles/Core")
+prepend_path("MODULEPATH", "/work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.6.0/envs/unified-env/install/modulefiles/Core")

 stack_intel_ver=os.getenv("stack_intel_ver") or "2021.9.0"
 load(pathJoin("stack-intel", stack_intel_ver))

jkbk2004 commented 4 weeks ago

@DusanJovic-NOAA can you check /work2/noaa/stmp/jongkim/stmp/jongkim/FV3_RT/rt_3264281/cpld_control_p8_intel ? The Intel run is OK. The setup is at /work/noaa/epic/jongkim/UFS-RT/hercules/pr-2093/modulefiles. Luckily, mapl 2.40.3-esmf-8.5.0 somehow worked, but mapl should be built with the esmf beta snapshot.

RatkoVasic-NOAA commented 4 weeks ago

@DusanJovic-NOAA there is a typo in your ufs_common: it should be 8.6.1b04, not 8.6.1bs4.
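
A typo like this can be caught mechanically by comparing the esmf entry with the `-esmf-<version>` suffix in the mapl entry. This is a minimal sketch, assuming the `{["name"] = "version"},` layout used in ufs_common.lua; `parse_modules` and `esmf_mapl_mismatch` are hypothetical helpers, not part of Lmod or the UFS repository.

```python
import re

# Matches entries such as:  {["esmf"]            = "8.6.1b04"},
ENTRY_RE = re.compile(r'\{\["(?P<name>[^"]+)"\]\s*=\s*"(?P<version>[^"]+)"\}')


def parse_modules(lua_text):
    """Return {module_name: version} for every table entry found in the text."""
    return {m.group("name"): m.group("version") for m in ENTRY_RE.finditer(lua_text)}


def esmf_mapl_mismatch(modules):
    """Return a message if mapl's '-esmf-<ver>' suffix differs from esmf, else None."""
    esmf = modules.get("esmf")
    mapl = modules.get("mapl", "")
    _, sep, suffix = mapl.partition("-esmf-")
    if esmf is None or not sep:
        return None  # nothing to compare
    if suffix != esmf:
        return f"esmf is {esmf!r} but mapl is tagged for esmf {suffix!r}"
    return None


# The two relevant entries from the diff posted earlier in this thread.
SNIPPET = '''
  {["esmf"]            = "8.6.1bs4"},
  {["mapl"]            = "2.40.3-esmf-8.6.1b04"},
'''

print(esmf_mapl_mismatch(parse_modules(SNIPPET)))
```

The check only compares names. A matching suffix does not prove which ESMF build mapl was actually compiled against.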

RatkoVasic-NOAA commented 4 weeks ago

Here is my ufs_common:

  {["jasper"]          = "2.0.32"},
  {["zlib"]            = "1.2.13"},
  {["libpng"]          = "1.6.37"},
  {["hdf5"]            = "1.14.0"},
  {["netcdf-c"]        = "4.9.2"},
  {["netcdf-fortran"]  = "4.6.0"},
  {["parallelio"]      = "2.5.10"},
  {["esmf"]            = "8.6.1b04"},
  {["fms"]             = "2023.04"},
  {["bacio"]           = "2.4.1"},
  {["crtm"]            = "2.4.0"},
  {["g2"]              = "3.4.5"},
  {["g2tmpl"]          = "1.10.2"},
  {["ip"]              = "4.3.0"},
  {["sp"]              = "2.5.0"},
  {["w3emc"]           = "2.10.0"},
  {["gftl-shared"]     = "1.6.1"},
  {["mapl"]            = "2.40.3-esmf-8.6.1b04"},
  {["scotch"]          = "7.0.4"},

DusanJovic-NOAA commented 4 weeks ago

Thanks. I fixed the typo and I'm rerunning the test. But the correct modules were loaded despite the typo, which is weird.

RatkoVasic-NOAA commented 4 weeks ago

@DusanJovic-NOAA Note: mapl in this configuration is compiled with esmf@8.6.0 (although the name suggests otherwise). This was just a test of whether esmf@8.6.1b04 works. mapl@2.40.3 didn't compile with the new esmf; I'm looking into this with Matt. @jkbk2004 ran tests on Hercules successfully, so check with him whether he ran the same tests you are running.

DusanJovic-NOAA commented 4 weeks ago

cpld_control_p8 is still crashing.

RatkoVasic-NOAA commented 4 weeks ago

@DusanJovic-NOAA What is the difference between your run and Jong's?

DusanJovic-NOAA commented 4 weeks ago

I don't know. My working directory is here: /work/noaa/fv3-cam/djovic/ufs/e861/ufs-weather-model.

Maybe it's the fact that @jkbk2004 used mapl 2.40.3-esmf-8.5.0, not mapl 2.40.3-esmf-8.6.1b04?

RatkoVasic-NOAA commented 4 weeks ago

You can try with this ufs_common:

  {["jasper"]          = "2.0.32"},
  {["zlib"]            = "1.2.13"},
  {["libpng"]          = "1.6.37"},
  {["hdf5"]            = "1.14.0"},
  {["netcdf-c"]        = "4.9.2"},
  {["netcdf-fortran"]  = "4.6.1"},
  {["parallelio"]      = "2.5.10"},
  {["esmf"]            = "8.6.1b04"},
  {["fms"]             = "2023.04"},
  {["bacio"]           = "2.4.1"},
  {["crtm"]            = "2.4.0"},
  {["g2"]              = "3.4.5"},
  {["g2tmpl"]          = "1.10.2"},
  {["ip"]              = "4.3.0"},
  {["sp"]              = "2.5.0"},
  {["w3emc"]           = "2.10.0"},
  {["gftl-shared"]     = "1.6.1"},
  {["mapl"]            = "2.40.3-esmf-8.5.0"},
  {["scotch"]          = "7.0.4"},

Also:

hercules: /work/noaa/epic/jongkim/UFS-RT/hercules/pr-2093/tests> git remote -v
origin  https://github.com/RatkoVasic-NOAA/ufs-weather-model (fetch)
origin  https://github.com/RatkoVasic-NOAA/ufs-weather-model (push)
hercules: /work/noaa/epic/jongkim/UFS-RT/hercules/pr-2093/tests> git branch
* ss-160

DusanJovic-NOAA commented 4 weeks ago

I can, but that will load esmf-8.5.0, not 8.6.1. We should test 8.6.1; we already know 8.5.0 works fine.

DusanJovic-NOAA commented 4 weeks ago

What's the issue with compiling mapl using esmf 8.6.1?

DusanJovic-NOAA commented 4 weeks ago

Isn't ESMF 8.6.x backward compatible with the previous release, ESMF 8.5.0?

climbfuji commented 4 weeks ago

I think the ESMF target definition has changed, which requires updates to the mapl cmake config (lowercase vs uppercase or something like that).

DusanJovic-NOAA commented 4 weeks ago

If an update in ESMF requires an update in MAPL, then we should update MAPL too. We should not be compiling 2.40.3 if it doesn't work with the latest ESMF.

climbfuji commented 4 weeks ago

> If an update in ESMF requires an update in MAPL, then we should update MAPL too. We should not be compiling 2.40.3 if it doesn't work with the latest ESMF.

We are waiting for that tag (and they were hoping to wait for an official release of esmf as opposed to a beta snapshot ...)

I'll ping our NASA colleagues and ask them to create the tag. Will let you know when I hear back.

DusanJovic-NOAA commented 4 weeks ago

Hmm. The MAPL people are waiting on the official ESMF release. The ESMF people are waiting on us to test a beta snapshot before they make a release, and we are waiting on a MAPL tag, which is waiting on ESMF. Circulus vitiosus.

climbfuji commented 4 weeks ago

> Hmm. The MAPL people are waiting on the official ESMF release. The ESMF people are waiting on us to test a beta snapshot before they make a release, and we are waiting on a MAPL tag, which is waiting on ESMF. Circulus vitiosus.

GMAO says the tag will be ready next week.

mathomp4 commented 4 weeks ago

I hope I can make a new release of MAPL next week. But because of the requirement for ESMF 8.6.1 (beta or not), I'll need to build new libraries on all our clusters, etc. so that our devs don't have issues building with it (since MAPL will require ESMF 8.6.1).

Now, if you want to test, you could try out this commit: https://github.com/GEOS-ESM/MAPL/commit/5f91a5c733eda8cd8d385c108b71b1f41b966c72. It is from my current draft PR (see https://github.com/GEOS-ESM/MAPL/pull/2682), where I'm tracking the changes to MAPL.


You can also see my spack testing changes here: https://github.com/spack/spack/compare/develop...mathomp4:spack:feature/mathomp4/test-mapl-build

You'll note it says MAPL v5 only because I wanted to make sure I was "safe" when doing the testing. And you'll note the commit for v5 is different since this was a couple weeks ago.

DusanJovic-NOAA commented 3 weeks ago

I ran the control_p8 test with esmf 8.6.1b04 for 240 hours, writing output every hour. The output_fh line for that configuration is longer than 1024 characters, so this confirms that version 8.6.1 fixes #1121.
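
To confirm that a given configuration really exercises the old limit, one can scan the config text for lines longer than 1024 characters. This is a minimal sketch, assuming a `key: value` layout per line. `long_lines` is a hypothetical helper, and the sample text is synthetic, since the actual model_configure used for this run is not shown in the thread.

```python
LIMIT = 1024  # the character limit named in the issue description


def long_lines(text, limit=LIMIT):
    """Return (line_number, key, length) for every line longer than `limit`."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if len(line) > limit:
            key = line.split(":", 1)[0].strip()
            hits.append((number, key, len(line)))
    return hits


# Synthetic example: a hypothetical hourly list of forecast hours, not the
# values from the actual test run.
hours = " ".join(str(h) for h in range(400))
sample = "quilting: .true.\noutput_fh: " + hours + "\n"

for number, key, length in long_lines(sample):
    print(f"line {number} ({key}) is {length} characters long")
```

Only lines over the limit are reported, so an empty result means the configuration would not have triggered #1121 in the first place.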

junwang-noaa commented 2 weeks ago

@uturuncoglu @danrosen25 The ESMF 8.6.1b04 testing is done in UFS and the fix works. Thanks.

mathomp4 commented 2 weeks ago

Note: We are close to getting ESMF 8.6.1b04 in GEOS land. I encountered a fun bug with GCC 13 that we had to figure out today, but hopefully next week I can release MAPL 2.46 and all is well.