Closed SamuelTrahanNOAA closed 2 months ago
@SamuelTrahanNOAA When you submit ufs-weather-model PR with upp submodule updating to the latest commit from UPP repository, all UPP control files post-xconfig*.txt (under ufs-weather-model/test/parm) need to be recreated for UFS RTs.
When you submit ufs-weather-model PR with upp submodule updating to the latest commit from UPP repository, all UPP control files post-xconfig*.txt (under ufs-weather-model/test/parm) need to be recreated for UFS RTs.
I need your help with that. I'm uncertain of which txt file in tests/parm corresponds to which file in FV3/upp/parm. Also, I'm wondering if any need modification before copying to tests/parm.
These three files from tests/parm don't exist in FV3/upp/parm:
postxconfig-NT-fv3lam.txt
postxconfig-NT-gfs.txt
postxconfig-NT-gfs_FH00.txt
This does exist, but it's the wrong one:
postxconfig-NT-hafs.txt
The tests/parm version has no synthetic satellite brightness temperature fields, while the FV3/upp/parm one does. It isn't the postxconfig-NT-hafs_nosat.txt
either, since that one differs in other ways.
These three files from tests/parm don't exist in FV3/upp/parm:
postxconfig-NT-fv3lam.txt
postxconfig-NT-gfs.txt
postxconfig-NT-gfs_FH00.txt
This does exist, but it's the wrong one:
postxconfig-NT-hafs.txt
The tests/parm version has no synthetic satellite brightness temperature fields, while the FV3/upp/parm one does. It isn't the
postxconfig-NT-hafs_nosat.txt
either, since that one differs in other ways.
These are simple versions of UPP control files for inline post testing for gfs, rffs and hafs. I will find corresponding xml files and provide you updated txt files later.
@WenMeng-NOAA - As mentioned in this comment, I need help debugging the inline post.
234: get_g2_fixedsurfacetypes key: 255 not found in table 4.5
You can find the output here:
LOG: /scratch1/NCEPDEV/stmp2/Samuel.Trahan/FV3_RT/rt_3037467/gnv1_nested_intel.log RUN: /scratch1/NCEPDEV/stmp2/Samuel.Trahan/FV3_RT/rt_3037467/gnv1_nested_intel
Many ranks give the same error many times, but it's always the same error.
Code is here:
https://github.com/ufs-community/ufs-weather-model/pull/2326
It only works on Hera because only Hera has g2tmpl 1.12.0 in the same Spack Stack as ufs-weather-model prerequisites.
There are a multitude of warnings coming from a library when the latest UPP is used.
148: get_g2_fixedsurfacetypes key: 255 not found in table 4.5
@WenMeng-NOAA has tracked this down to the UPP pull request https://github.com/NOAA-EMC/UPP/pull/929. The warnings vanish if you revert the changes to the postxconfig files (or rather, to the post_avblflds.xml used to generate them). The warnings don't seem to have an effect on the output.
This warning probably originates from a bug in one of the grib2 libraries (g2 or g2tmpl) or an incompatibility between them. Library updates can take a long time.
Question to everyone: Should we wait for the warnings to go away before updating UPP in fv3atm? Or should we live with the warnings for now?
EDIT: We have a bug fix from the UPP side that eliminates these error messages. It is in this PR.
Good news, everyone!
We have a fix for the "255 not found in table 4.5" messages!
That means this PR can continue review without thousands of error messages from g2tmpl.
I'd like @DusanJovic-NOAA @WenMeng-NOAA and @EricJames-NOAA to look at this PR.
I would suggest the following way to include the 3D rhomid
l=lm do j=jsta,jend do i=ista,iend
do l=1,lm tv = t(i,j,l) (h1+d608MAX(q(I,J,L),qmin)) rhomid(i,j,l) = pmid(i,j,l) / (rd*tv)
enddo
dustcb(i,j) = MAX(dustcb(i,j), 0.0)
dustallcb(i,j) = MAX(dustallcb(i,j), 0.0)
@@ -4608,6 +4608,7 @@ module post_fv3 pp25cb(i,j) = MAX(pp25cb(i,j), 0.0) pp10cb(i,j) = MAX(pp10cb(i,j), 0.0)
l=lm
! Surface PM25 dust and seasalt
dustpm(i,j)=(dust(i,j,l,1)+0.38dust(i,j,l,2))rhomid(i,j,l) !ug/m3
dustpm10(i,j)=(dust(i,j,l,1)+dust(i,j,l,2)+dust(i,j,l,3)+ &
@zhanglikate - Your message was garbled, so I don't know what you want.
Please do one of these:
Hi Sam,
Where is the "suggested change” feature? I added comments in the place as “suggested change”. Thanks.
On Jul 23, 2024, at 5:33 PM, Samuel Trahan (NOAA contractor) @.***> wrote:
@zhanglikate https://github.com/zhanglikate - Your message was garbled, so I don't know what you want.
Please do one of these:
Use the github "suggested change" feature, or Open a PR to this branch with your changes. — Reply to this email directly, view it on GitHub https://github.com/NOAA-EMC/fv3atm/pull/845#issuecomment-2246502611, or unsubscribe https://github.com/notifications/unsubscribe-auth/APJPDRBOUWHXP7NA67PYIZDZN3R23AVCNFSM6AAAAABJG45DOSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDENBWGUYDENRRGE. You are receiving this because you were mentioned.
@SamuelTrahanNOAA Could you update the upp harsh to the latest commit @81b38a8 from the UPP repository?
Yes. It matches that hash now.
Could I get some reviews by people "in the know?" Maybe these individuals?
The upp submodule update and inline post interface "post_fv3.F90" look good to me.
Github workflow failed:
2024-08-14T18:45:15.8799819Z /home/runner/work/fv3atm/fv3atm/fv3atm/upp/sorc/ncep_post.fd/grib2_module.f:499:53:
2024-08-14T18:45:15.8800893Z
2024-08-14T18:45:15.8801377Z 499 | g2sec4_temp0,g2sec4_temp8,g2sec4_temp9,g2sec4_temp44, &
2024-08-14T18:45:15.8802380Z | 1
2024-08-14T18:45:15.8806022Z Error: Symbol ‘g2sec4_temp9’ referenced at (1) not found in module ‘grib2_all_tables_module’
2024-08-14T18:45:15.8807542Z /home/runner/work/fv3atm/fv3atm/fv3atm/upp/sorc/ncep_post.fd/grib2_module.f:499:80:
2024-08-14T18:45:15.8808321Z
2024-08-14T18:45:15.8808820Z 499 | g2sec4_temp0,g2sec4_temp8,g2sec4_temp9,g2sec4_temp44, &
2024-08-14T18:45:15.8809907Z | 1
2024-08-14T18:45:15.8811126Z Error: Symbol ‘g2sec4_temp46’ referenced at (1) not found in module ‘grib2_all_tables_module’
2024-08-14T18:45:15.8812522Z /home/runner/work/fv3atm/fv3atm/fv3atm/upp/sorc/ncep_post.fd/grib2_module.f:501:79:
2024-08-14T18:45:15.8813317Z
2024-08-14T18:45:15.8813853Z 501 | g2sec5_temp3,g2sec5_temp40,get_g2_sec5packingmethod, &
2024-08-14T18:45:15.8814838Z | 1
2024-08-14T18:45:15.8816070Z Error: Symbol ‘g2sec4_temp49’ referenced at (1) not found in module ‘grib2_all_tables_module’
2024-08-14T18:45:15.8986194Z make[2]: *** [upp/sorc/ncep_post.fd/CMakeFiles/upp.dir/build.make:868: upp/sorc/ncep_post.fd/CMakeFiles/upp.dir/grib2_module.f.o] Error 1
2024-08-14T18:45:15.8989127Z make[1]: *** [CMakeFiles/Makefile2:1278: upp/sorc/ncep_post.fd/CMakeFiles/upp.dir/all] Error 2
2024-08-14T18:45:15.8990581Z make[1]: *** Waiting for unfinished jobs....
That looks a lot like an out-of-date library. Maybe @RatkoVasic-NOAA can comment?
The github workflow gets its UPP dependencies for spack from this line in ci/spack.yaml
:
- upp@develop
The upp@develop
refers to whatever is in spack develop.
In other words, FV3 is getting its UPP dependencies from the spack repository instead of the UPP repository.
Github workflow failed:
2024-08-14T18:45:15.8799819Z /home/runner/work/fv3atm/fv3atm/fv3atm/upp/sorc/ncep_post.fd/grib2_module.f:499:53: 2024-08-14T18:45:15.8800893Z 2024-08-14T18:45:15.8801377Z 499 | g2sec4_temp0,g2sec4_temp8,g2sec4_temp9,g2sec4_temp44, & 2024-08-14T18:45:15.8802380Z | 1 2024-08-14T18:45:15.8806022Z Error: Symbol ‘g2sec4_temp9’ referenced at (1) not found in module ‘grib2_all_tables_module’ 2024-08-14T18:45:15.8807542Z /home/runner/work/fv3atm/fv3atm/fv3atm/upp/sorc/ncep_post.fd/grib2_module.f:499:80: 2024-08-14T18:45:15.8808321Z 2024-08-14T18:45:15.8808820Z 499 | g2sec4_temp0,g2sec4_temp8,g2sec4_temp9,g2sec4_temp44, & 2024-08-14T18:45:15.8809907Z | 1 2024-08-14T18:45:15.8811126Z Error: Symbol ‘g2sec4_temp46’ referenced at (1) not found in module ‘grib2_all_tables_module’ 2024-08-14T18:45:15.8812522Z /home/runner/work/fv3atm/fv3atm/fv3atm/upp/sorc/ncep_post.fd/grib2_module.f:501:79: 2024-08-14T18:45:15.8813317Z 2024-08-14T18:45:15.8813853Z 501 | g2sec5_temp3,g2sec5_temp40,get_g2_sec5packingmethod, & 2024-08-14T18:45:15.8814838Z | 1 2024-08-14T18:45:15.8816070Z Error: Symbol ‘g2sec4_temp49’ referenced at (1) not found in module ‘grib2_all_tables_module’ 2024-08-14T18:45:15.8986194Z make[2]: *** [upp/sorc/ncep_post.fd/CMakeFiles/upp.dir/build.make:868: upp/sorc/ncep_post.fd/CMakeFiles/upp.dir/grib2_module.f.o] Error 1 2024-08-14T18:45:15.8989127Z make[1]: *** [CMakeFiles/Makefile2:1278: upp/sorc/ncep_post.fd/CMakeFiles/upp.dir/all] Error 2 2024-08-14T18:45:15.8990581Z make[1]: *** Waiting for unfinished jobs....
@DusanJovic-NOAA Please check if g2/3.51 and g2tmpl/1.13.0 are used in CI.
It's using a cached version of spack, not the latest build.
There should be an update to .github/workflows/GCC.yml
to increment the cache version.
It's using a cached version of spack, not the latest build.
There should be an update to
.github/workflows/GCC.yml
to increment the cache version.
Actually I think we should ci/spack.yaml, because the cache key is defined based on hash of spack.yaml. So in order to detect potential changes in any of the libraries we should explicitly list every single library and its version , and not use 'upp@develop' which basically 'hides' the changes in upp dependencies. upp@develop can mean different things at different time. Basically all dependencies must be versioned. And instead of just saying upp, I prefer to have a explicit list of libraries with explicit versions.
Try to change spack.yaml in your branch and see if it triggers spack cache rebuild.
I've updated spack.yaml, and it is rebuilding. There were library version conflicts, so I chose the latest version of the two.
What worries me is this one:
ip@develop precision=4,d,8
Using the @develop
version means the version may change as the branch is retested. Wouldn't we be better off using a specific numbered version? If so, which one?
I've updated spack.yaml, and it is rebuilding. There were library version conflicts, so I chose the latest version of the two.
What worries me is this one:
ip@develop precision=4,d,8
Using the
@develop
version means the version may change as the branch is retested. Wouldn't we be better off using a specific numbered version? If so, which one?
Yes. All libraries should be versioned, develop is ambiguous. I would use versions from your ufs-weather-model branch ufs_common.lua from modulefiles directory, with updated versions of g2 and g2tmpl. Also, fv3atm does not need sigio, sfcio, nemsio and wrf-io. Those are required by the upp executable, which we are not building here.
Yes. All libraries should be versioned, develop is ambiguous. I would use versions from your ufs-weather-model branch ufs_common.lua from modulefiles directory, with updated versions of g2 and g2tmpl. Also, fv3atm does not need sigio, sfcio, nemsio and wrf-io. Those are required by the upp executable, which we are not building here.
@DusanJovic-NOAA - Could you check the spack.yaml? It should match your specifications now.
Yes. All libraries should be versioned, develop is ambiguous. I would use versions from your ufs-weather-model branch ufs_common.lua from modulefiles directory, with updated versions of g2 and g2tmpl. Also, fv3atm does not need sigio, sfcio, nemsio and wrf-io. Those are required by the upp executable, which we are not building here.
@DusanJovic-NOAA - Could you check the spack.yaml? It should match your specifications now.
The spack.yaml looks good. And I see workflow finished successfully. Thank you for fixing this.
I made an issue for the spack.yaml troubles:
It'll be automatically closed when this PR is merged.
@jkbk2004 testing is finished on WM PR 2326. Feel free to merge this PR.
Description
This updates the upp submodule to the head of NOAA-EMC UPP's develop branch.
Also, the spack.yaml has explicit UPP dependencies (g2, g2tmpl, etc.) and all libraries have exact version numbers. These version numbers match the top of ufs-weather-model's develop branch for hercules.
Issue(s) addressed
Testing
How were these changes tested?
Tested in the HAFS-AR variant of the HAFS workflow in a large stationary regional domain over the north-east Pacific. Values looked reasonable. Also ran UPP and ufs-weather-model regression tests.
What compilers / HPCs was it tested with?
Hera intel and Hera gnu.
Are the changes covered by regression tests? (If not, why? Do new tests need to be added?)
The ufs-weather-model regression tests already test the inline post. However, I added three new number concentration fields to the ufs-weather-model gnv1_nested regression test.
Have the ufs-weather-model regression test been run? On what platform?
The ufs-weather-model regression testing is in progress now. Baseline generation on Hera succeeded.
UPP's own regression tests passed. That tests the offline post (running UPP outside FV3).
Will the code updates change regression test baseline? If yes, why? Please show the baseline directory below.
Results of many tests should change due to updating the UPP hash.
Please commit the regression test log files in your ufs-weather-model branch
Will do.
Dependencies
None.