Closed KateFriedman-NOAA closed 7 months ago
Create release branch (will update name when version is determined): https://github.com/NOAA-EMC/global-workflow/tree/release/gfs.v16.3.TBD_WAFS
At GFS v16.3, Yali updated WAFS package to be ready to produce high resolution WAFS output to meet 2023 ICAO milestone with one switch. The switch was turned off at GFS v16.3. Now that UKMO has started to produce their corresponding high resolution WAFS output, EMC will work with NCO to turn on the switch.
Additionally, due to a 5-10 min delay in receiving UKMO high resolution data to blend with US data, EMC will increase the wait time for UKMO data by 5 min. Both sides agreed that we will stop waiting at T+4:45.
Finally, NESDIS informed EMC that they will implement new global satellite composite data in Jan 2024. This data set is used by one of WAFS packages. NESDIS provided a sample data set which Yali used to develop a table that works with current ops and to-be-implemented satellite data. This table will be implemented too.
@HuiyaChuang-NOAA @YaliMao-NOAA I have added a skeleton release notes document into the release branch and tentatively updated the GFS version #s in run.ver to v16.3.11. We'll see what version NCO assigns this upgrade and can update further as needed.
Please make a PR into the release/gfs.v16.3.TBD_WAFS branch with further updates for release note text, workflow changes, version number (if you learn it's different), WAFS tag, etc. I can be a reviewer and merge it when approved. Thanks!
@KateFriedman-NOAA Thank you for the instructions. It seems the only change for this PR is to modify the wafs tag in sorc/checkout.sh, right?
checkout.sh
Externals.cfg WAFS tag (https://github.com/NOAA-EMC/global-workflow/blob/dev/gfs.v16/Externals.cfg#L46)
Is this update related to the ICAO2023=no variable we added a year ago? If so, would these values need to change to YES now?
Yes, set ICAO2023=yes, not capital 'YES'. The related ecf files are:
- jgfs_atmos_wafs_blending_0p25.ecf
- jgfs_atmos_wafs_grib2_0p25.ecf
- jgfs_atmos_wafs_grib2.ecf
- jgfs_atmos_wafs_gcip.ecf
Meanwhile, we are going to stop jgfs_atmos_wafs_blending.ecf.
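For anyone making the ICAO2023 change by hand, here is a minimal sketch of how the switch could be flipped. It assumes each ecf file carries the setting literally as `ICAO2023=no`; the helper name `set_icao2023_yes` is hypothetical and not part of the workflow.

```shell
# Hypothetical helper: rewrite ICAO2023=no to ICAO2023=yes in each file given.
# Assumption: the setting appears literally as "ICAO2023=no" (lowercase value).
set_icao2023_yes() {
  for f in "$@"; do
    # Write to a scratch copy first, then copy the contents back so the
    # original file keeps its permissions.
    sed 's/ICAO2023=no/ICAO2023=yes/' "$f" > "$f.tmp" &&
      cat "$f.tmp" > "$f" &&
      rm -f "$f.tmp"
  done
}

# Example call (file names as listed in the comment above):
#   set_icao2023_yes jgfs_atmos_wafs_grib2.ecf jgfs_atmos_wafs_gcip.ecf
```

The value is written in lowercase on purpose, matching the "yes, not capital 'YES'" note.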
Thanks for confirming @YaliMao-NOAA !
Ok, good to know. Do you plan to remove the scripts associated with this job then? Should we remove the job from the rocoto job mesh as well? Additional changes to the rocoto setup scripts would be needed if we're completely retiring the wafs_blending job. Let me know, I can guide you on making changes to remove the job from the system. Thanks!
Yes, we are going to remove the scripts associated with wafs_blending.
May I know whether EIB or NCO is in charge of a job's trigger time? wafs_blending_0p25 is approved by NCO to start 5 minutes later than the current operational one.
NCO generally updates the prod def file for ecflow and we (EMC) ingest the change into our repo. If we know what the changes are going to be we can make them on our end now in ecf/defs/gfs_v16_3.def. See the current copy: https://github.com/NOAA-EMC/global-workflow/blob/dev/gfs.v16/ecf/defs/gfs_v16_3.def
Yes, we are going to remove the scripts associated with wafs_blending.
Okie dokie...please also include changes in your PR into the release branch to remove this job and its associated files/settings. That includes the following:
- Remove jobs/rocoto/wafsblending.sh
- Remove parm/config/config.wafsblending
- Remove gfs/atmos/post_processing/grib2_wafs/jgfs_atmos_wafs_blending.ecf
- Remove references to the wafsblending job in the rocoto setup scripts (ush/rocoto/setup_workflow.py & ush/rocoto/setup_workflow_fcstonly.py)
- Remove reference to the wafsblending job in the two parm/config/config.resources* config files.
- Remove references to the WAFS blending job in the ecf definition file (ecf/defs/gfs_v16_3.def).
(I think that list is complete.)
Removals should be done with a git rm command in your branch. Let me know if you have any questions on this.
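A rough sketch of how the removal list could be worked through from the top of a clone. The function names are hypothetical. The ecf script is left out of the `git rm` list because its path is given above relative to an unstated parent directory.

```shell
# Files slated for removal, with paths as written in the list above.
retired_files="jobs/rocoto/wafsblending.sh parm/config/config.wafsblending"

# Stage the removals; run from the top of the clone on the PR branch.
remove_retired_files() {
  for f in $retired_files; do
    git rm -- "$f"
  done
}

# List remaining references to the job that still need hand edits
# (setup scripts, config.resources files, the ecf definition file).
# Takes a directory to search; defaults to the current directory.
find_wafsblending_refs() {
  grep -rn "wafsblending" "${1:-.}"
}
```

Running `find_wafsblending_refs ush parm` after the removals would show which lines in the setup scripts and resource configs still mention the job.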
Oh, also, all of the files being removed/modified while removing the WAFS blending job should also be mentioned in the release notes. Thanks! :)
I modified the prior comment to add to the list.
@KateFriedman-NOAA Yali and I just talked and since we've told NCO this will be minor updates, Yali will not remove these scripts. Instead, she will add "exit" before execution. Removing these scripts will most likely raise alarms with NCO and delay this implementation. This implementation is part of the US' treaties with UKMO and the United Nations. Yali will remove these scripts when she separates WAFS from GFS next year.
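To illustrate the "exit before execution" idea, here is a minimal sketch. It is not the actual job script: `wafs_blending_job` is a hypothetical stand-in, and the body runs in a subshell so that `exit` ends only this stand-in.

```shell
# Hypothetical stand-in for the retired blending job script.
# The parentheses make the function body a subshell, so "exit" ends only it.
wafs_blending_job() (
  echo "wafs_blending is retired; exiting without running"
  exit 0
  # Everything below is unreachable once the early exit is in place.
  echo "blending executable would run here"
)
```

With this approach the job still submits and then returns success immediately; it does not stop the scheduler from submitting it.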
Okie dokie. If NCO is ok with the WAFS blending file still submitting but exiting immediately then I have no objections. However...more cleanly, NCO should just remove the job from the ecflow suite and ecf definition file (ecf/defs/gfs_v16_3.def) so it doesn't submit at all. Will let you and NCO decide what is best on that topic. Just let me know the final decision and what changes they make to the def file so we can fold that into our repo. Thanks!
Yali will remove these scripts when she separates WAFS from GFS next year.
Sounds good!
Yes, set ICAO2023=yes, not capital 'YES'. The related ecf files are: jgfs_atmos_wafs_blending_0p25.ecf, jgfs_atmos_wafs_grib2_0p25.ecf, jgfs_atmos_wafs_grib2.ecf, jgfs_atmos_wafs_gcip.ecf
The update to the ICAO2023 variable should still be made to the ecf scripts, mentioned in the release notes, and included in the PR to the release branch. We'll leave the other script updates out. Thanks!
NCO should just remove the job from the ecflow suite and ecf definition file (ecf/defs/gfs_v16_3.def) so it doesn't submit at all.
This is nicer and neater. Also, I just realized the trigger time is controlled by ecf/defs/gfs_v16_3.def, so the trigger time for task jgfs_atmos_wafs_blending_0p25 should be updated there, from 4:25 to 4:30.
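A sketch of how the 4:25 to 4:30 change could be scripted so that only the one task is touched. It assumes the def file expresses the trigger as a `time HH:MM` line under a `task <name>` line, which should be checked against the real file; `bump_trigger` is a hypothetical helper.

```shell
# Hypothetical helper: print a def file with one task's time line changed.
#   $1 = def file, $2 = task name, $3 = old time, $4 = new time
# Assumption: the trigger is a "time HH:MM" line inside the task's block.
bump_trigger() {
  awk -v task="$2" -v old="$3" -v new="$4" '
    # Track whether the current line belongs to the task of interest.
    $1 == "task" { in_task = ($2 == task) }
    # Only rewrite a matching time line inside that task.
    in_task && $1 == "time" && $2 == old { sub(old, new) }
    { print }
  ' "$1"
}

# Example call (writes to a new file for review instead of editing in place):
#   bump_trigger ecf/defs/gfs_v16_3.def jgfs_atmos_wafs_blending_0p25 04:25 04:30 > gfs_v16_3.def.new
```

Other tasks that happen to share the same time value are left as they are.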
@KateFriedman-NOAA thank you for suggesting to update ecf file! Great idea!
Sure thing! I updated the checklist in the main comment of the issue to reflect the to-do list. Feel free to add/remove as needed. Wanted to make sure we had all the things to do written somewhere. :)
Additionally, due to a 5-10 min delay in receiving UKMO high resolution data to blend with US data, EMC will increase the wait time for UKMO data by 5 min. Both sides agreed that we will stop waiting at T+4:45.
@HuiyaChuang-NOAA Per the latest update from Steven, the 5 min will apply to the trigger time instead of the waiting time window.
yes.
Notes from WAFS tag-up today:
@KateFriedman-NOAA Yali reverted her change to the satellite config file in the same PR. She also edited the release notes to remove the mention of updates to work with the satellite data.
Will the next step be for you to merge Yali's PR?
Sounds like you're good to go with the release branch changes in Yali's PR so I will approve and merge. I will also go ahead and cut a hand-off tag. Let us know if there need to be further changes. Thanks!
Have cut hand-off tag https://github.com/NOAA-EMC/global-workflow/releases/tag/EMC-v16.3.11.
NCO wants to merge this with the GSI update --> GFSv16.3.11
New tag for CDF has been cut: https://github.com/NOAA-EMC/global-workflow/releases/tag/EMC-v16.3.11
Recut EMC-v16.3.11 tag after updates from NCO in PR #2045.
Recut EMC-v16.3.11 tag after release notes updates from @YaliMao-NOAA in PR https://github.com/NOAA-EMC/global-workflow/pull/2050.
From NCO SPA:
With several issues of WAFS, we are now planning to delay the gfs.v16.3.11 implementation to the week of 12/5.
WAFS update has been removed from the GFSv16.3.11 update package and will occur later (~January 2024 - TBD).
Will redo WAFS changes in new release branch. See https://github.com/NOAA-EMC/global-workflow/commit/fccd9c8abff4e1eeb9ec43d4d7c9d9712f23f9e7 for what was removed from GFSv16.3.11 release branch and will need to be done again later.
The new WAFS rescheduled date is Jan 17.
Confirmed by @YaliMao-NOAA https://github.com/NOAA-EMC/global-workflow/issues/1356#issuecomment-1838931055
@KateFriedman-NOAA May I know what's the next branch for WAFS? Thank you.
@YaliMao-NOAA I hadn't yet created one but have now done so: release/gfs.v16.3.13
I cut it from the release/gfs.v16.3.12 branch so it's up-to-date with what will go into ops shortly.
Hand-off tag has been cut: EMC-v16.3.13
SCN notification for this upgrade:
SCN23-111: Change to Global Aviation Products related to the World Area Forecast System (WAFS) Product on or about January 17, 2024
A Text file of the SCN listed above was sent today on AWIPS/NOAAPORT/NWWS/EMWIN.
A pdf version is posted at:
https://www.weather.gov/media/notification/pdf_2023_24/scn23-111_wafs_products_change.pdf
Monica Parker
National Notification Coordinator
National Weather Service
Thank you! @KateFriedman-NOAA
@KateFriedman-NOAA @XianwuXue-NOAA Should the SCN be included in the release note?
@YaliMao-NOAA They are normally two separate things but if NCO would like the link to the SCN in the release notes we can certainly add it. :)
@KateFriedman-NOAA I updated the release note with the SCN link. I also updated the HPSS archive file size increase.
@HuiyaChuang-NOAA
Merged PR #2164 and cut updated EMC-v16.3.13 tag.
Merged PR https://github.com/NOAA-EMC/global-workflow/pull/2200 and cut updated EMC-v16.3.13 tag.
Implementation moved to January 24th.
From the January 19th RFC memo:
RFC 12134 - On WCOSS2, upgrade the GFS to v16.3.13. With this update,
NCEP, in coordination with UKMO (UK Met Office), is updating the GFS to
produce high resolution WAFS output and stop producing blended 1.25 deg
WAFS files in order to meet a 2023 ICAO (International Civil Aviation
Organization) milestone. To be implemented on January 24, 1430Z to 1830Z.
@KateFriedman-NOAA This release and also v16.3.12 will not build on any machine apart from WCOSS2. This is because upp_ver has changed from 8.2.0 to 8.3.0, which is not found (for example on Hera) in fv3gfs.fd/NEMS/src/conf/modules.nems.lua. What do we need to do to add this to the non-production machines?
Alternatively, could we just point to 8.2.0 on the dev machines, as the results in UPP should be identical?
@ADCollard I already don't support GFSv16.X.Y outside of WCOSS2 as of a few versions ago and the ongoing Rocky8 update on the R&Ds will remove the hpc-stack installs used by the v16 package...so support for all versions of GFSv16 will be dropped now outside of WCOSS2. If the Rocky8 update weren't happening I would definitely consider changing upp_ver as you suggest.
@KateFriedman-NOAA Just so I understand this correctly, the Rocky8 update will make running the operational workflow on development machines impossible? What is the expected timeframe for this?
the Rocky8 update will make running the operational workflow on development machines impossible?
Based on my understanding that's correct. It's mainly about the library stacks (hpc-stack) that the v16 system uses on the R&Ds. The hpc-stack installs are already no longer supported and there aren't resources to update v16 with spack-stack. It's likely that those hpc-stacks won't work after the transition. Also, the fact that crtm/v2.4.0.1 was installed under spack-stack and not hpc-stack is why I stopped supporting GFSv16 on the R&Ds when we went to the newer crtm version.
What is the expected timeframe for this?
The Rocky8 transition on Hera (and Jet I think) will be done by April and is on-going (130 nodes on Hera are Rocky8 now and 2/3rds of Hera will be converted next month).
Thanks @KateFriedman-NOAA I do not think this is widely known. I will communicate this to the GSI team.
@ADCollard FYI, I am going to try to see if it still builds/runs/etc. with the upp version change but I'm expecting things to be broken. I've already gotten a user report that gerrit checkout of GSI fix from VLab is broken on the Rocky8 nodes.
Hera test
Conduct the following test
- hfe09. This is a Rocky 8 head node.
- dev/gfs.v16 in /scratch1/NCEPDEV/da/Russ.Treadon/git/global-workflow/dev_gfsv16
All builds failed
Hera(hfe09):/scratch1/NCEPDEV/da/Russ.Treadon/git/global-workflow/dev_gfsv16/sorc$ ./build_all.sh
Creating logs folder
Creating ../exec folder
.... Building fv3 ....
Fatal error in building fv3.
The log file is in /scratch1/NCEPDEV/da/Russ.Treadon/git/global-workflow/dev_gfsv16/sorc/logs/build_fv3.log
.... Building gsi ....
Fatal error in building gsi.
The log file is in /scratch1/NCEPDEV/da/Russ.Treadon/git/global-workflow/dev_gfsv16/sorc/logs/build_gsi.log
.... Building ncep_post ....
Fatal error in building ncep_post.
The log file is in /scratch1/NCEPDEV/da/Russ.Treadon/git/global-workflow/dev_gfsv16/sorc/logs/build_ncep_post.log
.... Building ufs_utils ....
Fatal error in building ufs_utils.
The log file is in /scratch1/NCEPDEV/da/Russ.Treadon/git/global-workflow/dev_gfsv16/sorc/logs/build_ufs_utils.log
.... Building gldas ....
Fatal error in building gldas.
The log file is in /scratch1/NCEPDEV/da/Russ.Treadon/git/global-workflow/dev_gfsv16/sorc/logs/build_gldas.log
.... Building gaussian_sfcanl ....
Fatal error in building gaussian_sfcanl.
The log file is in /scratch1/NCEPDEV/da/Russ.Treadon/git/global-workflow/dev_gfsv16/sorc/logs/build_gaussian_sfcanl.log
.... Building enkf_chgres_recenter ....
Fatal error in building enkf_chgres_recenter.
The log file is in /scratch1/NCEPDEV/da/Russ.Treadon/git/global-workflow/dev_gfsv16/sorc/logs/build_enkf_chgres_recenter.log
.... Building enkf_chgres_recenter_nc ....
Fatal error in building enkf_chgres_recenter_nc.
The log file is in /scratch1/NCEPDEV/da/Russ.Treadon/git/global-workflow/dev_gfsv16/sorc/logs/build_enkf_chgres_recenter_nc.log
.... Building tropcy_NEMS ....
Fatal error in building tropcy_NEMS.
The log file is in /scratch1/NCEPDEV/da/Russ.Treadon/git/global-workflow/dev_gfsv16/sorc/logs/build_tropcy_NEMS.log
.... Building gfs_fbwndgfs ....
Fatal error in building gfs_fbwndgfs.
The log file is in /scratch1/NCEPDEV/da/Russ.Treadon/git/global-workflow/dev_gfsv16/sorc/logs/build_gfs_fbwndgfs.log
.... Building gfs_bufrsnd ....
Fatal error in building gfs_bufrsnd.
The log file is in /scratch1/NCEPDEV/da/Russ.Treadon/git/global-workflow/dev_gfsv16/sorc/logs/build_gfs_bufrsnd.log
.... Building fv3nc2nemsio ....
Fatal error in building fv3nc2nemsio.
The log file is in /scratch1/NCEPDEV/da/Russ.Treadon/git/global-workflow/dev_gfsv16/sorc/logs/build_fv3nc2nemsio.log
.... Building regrid_nemsio ....
Fatal error in building regrid_nemsio.
The log file is in /scratch1/NCEPDEV/da/Russ.Treadon/git/global-workflow/dev_gfsv16/sorc/logs/build_regrid_nemsio.log
FATAL BUILD ERROR: Please check the log file for detail, ABORT!
I looked in sorc/logs/build_gsi.log
Lmod has detected the following error: The following module(s) are unknown: "intel/18.0.5.274"
Please check the spelling or version number. Also try "module spider ..."
It is also possible your cache file is out-of-date; it may help to try:
$ module --ignore_cache load "intel/18.0.5.274"
Also make sure that all modulefiles written in TCL start with the string #%Module
Executing this command requires loading "intel/18.0.5.274" which failed while processing the following module(s):
Module fullname Module Filename
--------------- ---------------
hpc-intel/18.0.5.274 /scratch2/NCEPDEV/nwprod/hpc-stack/libs/hpc-stack-gfsv16/modulefiles/core/hpc-intel/18.0.5.274.lua
gsi_hera.intel /scratch1/NCEPDEV/da/Russ.Treadon/git/global-workflow/dev_gfsv16/sorc/gsi.fd/modulefiles/gsi_hera.intel.lua
The operational gsi build uses intel/18.0.5.274. module -r spider '.*intel.*' returns
-----------------------------------------------------------------------------------------------------------------------------------------------
intel:
-----------------------------------------------------------------------------------------------------------------------------------------------
Versions:
intel/default
intel/2022.1.2
intel/2023.2.0
Looks like we may not be able to access intel/18.0.5.274 on Hera Rocky 8 nodes. Then again, maybe we can access the intel/18 compiler if we use the correct module use _path_to_intel18_. Is this possible? I don't know.
@RussTreadon-NOAA
There will be no Intel 18 on the Rocky 8 nodes on Hera.
Moving to newer Intel compilers for dev/gfs.v16 is most likely impacted.
@RussTreadon-NOAA Thanks for trying the build of dev/gfs.v16 on the Hera Rocky8 login nodes. I'm remembering now that someone (I think it was @DavidHuber-NOAA) mentioned in a meeting this morning that we wouldn't have Intel 2018 available to us. I don't know if it'll be possible to point to the other installs or if those installs will survive the transition. Jet and Orion will be having similar transitions. Hercules is already Rocky9 so we'll never support v16 there.
Got it. Seems any work for GFS v16 implementations will need to be done on WCOSS2 in the not so distant future.
Description
Updated WAFS package in operations. Details from @HuiyaChuang-NOAA (as seen in comment below):
Target version
v16.3.13
Expected workflow changes
ICAO2023=yes
Tasks