NOAA-EMC / global-workflow

Global Superstructure/Workflow supporting the Global Forecast System (GFS)
https://global-workflow.readthedocs.io/en/latest
GNU Lesser General Public License v3.0

GFSv16.3.16 - Add WDQMS processing job #2389

Closed aerorahul closed 3 months ago

aerorahul commented 7 months ago

Description

WDQMS (WIGOS Data Quality Monitoring System) is a tool that generates data quality reports for conventional and marine data. It was developed by the DA team and is being run in real time by Obsproc. Once a day, the code uses the four gdas.YYYYMMDD/HH/atmos/gdas.tHHz.cnvstat files from the prior day to generate 12 reports:

NCEP_TEMP_YYYYMMDD_HH.csv
NCEP_SYNOP_YYYYMMDD_HH.csv
NCEP_MARINE_YYYYMMDD_HH.csv

where HH=00,06,12,18.
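As a rough illustration of the bookkeeping described above, the following Python sketch enumerates the four cnvstat inputs and the 12 report names for a given day. The path and file name patterns are taken from the description; the function names and constants are hypothetical and are not part of the wdqms code.

```python
from datetime import date

# Cycles and report types as listed in the issue description.
CYCLES = ("00", "06", "12", "18")
REPORT_TYPES = ("TEMP", "SYNOP", "MARINE")


def cnvstat_inputs(day: date) -> list[str]:
    """Relative paths of the four cnvstat files for one day (hypothetical helper)."""
    ymd = day.strftime("%Y%m%d")
    return [f"gdas.{ymd}/{hh}/atmos/gdas.t{hh}z.cnvstat" for hh in CYCLES]


def report_names(day: date) -> list[str]:
    """Names of the 12 reports (3 types x 4 cycles) for one day (hypothetical helper)."""
    ymd = day.strftime("%Y%m%d")
    return [f"NCEP_{kind}_{ymd}_{hh}.csv" for hh in CYCLES for kind in REPORT_TYPES]
```

Calling `report_names(date(2024, 7, 1))` returns 12 names, three per cycle, which matches the count quoted in the description.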

The reports are then staged on emcrzdm ftp for downstream users: https://ftp.emc.ncep.noaa.gov/wdqms/ncep

Code: https://github.com/kevindougherty-noaa/wdqms

Currently, the cron schedule for WDQMS is:

07 14 * * * bash /lfs/h2/emc/obsproc/noscrub/ashley.stanfield/wdqms_new/scripts/WDQMS_new.sh

and a run lasts about 15 minutes on a dev machine.

The operationalization of this tool will prevent reporting gaps and delays during prod switches, wcoss tests, etc.

Requirements

Generate 12 monitoring reports per day and stage them on the emcrzdm ftp, in a prescribed directory.

Acceptance Criteria (Definition of Done)

The 12 reports are generated and staged by 12 UTC of the next day.
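One way to check this criterion is sketched below in Python, assuming the reports keep the NCEP_<TYPE>_YYYYMMDD_HH.csv names from the description and sit in a single staging directory. The function names and the directory layout are assumptions for illustration only.

```python
from datetime import date, datetime, time, timedelta, timezone
from pathlib import Path

CYCLES = ("00", "06", "12", "18")
REPORT_TYPES = ("TEMP", "SYNOP", "MARINE")


def expected_reports(day: date) -> list[str]:
    """The 12 report names expected for one day (naming taken from the description)."""
    ymd = day.strftime("%Y%m%d")
    return [f"NCEP_{kind}_{ymd}_{hh}.csv" for hh in CYCLES for kind in REPORT_TYPES]


def missing_reports(staging_dir: Path, day: date) -> list[str]:
    """Expected reports that are not present in staging_dir (hypothetical layout)."""
    return [name for name in expected_reports(day) if not (staging_dir / name).is_file()]


def staging_deadline(day: date) -> datetime:
    """12 UTC on the day after the data day, per the acceptance criterion."""
    return datetime.combine(day + timedelta(days=1), time(12, 0), tzinfo=timezone.utc)
```

The criterion is met for a given day if `missing_reports(...)` returns an empty list when evaluated no later than `staging_deadline(day)`.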

Suggest A Solution

We propose to include WDQMS as a step in the $pslot.xml file.

Subsumes #1459

Target version

v16.3.16

Tasks

KateFriedman-NOAA commented 7 months ago

Added checklist to main issue comment and created release branch (release/gfs.v16.3_wdqms) for this work. Can rename branch once ops version is known. @aerorahul when ready, please submit a PR with changes into the release/gfs.v16.3_wdqms branch, thanks!

KateFriedman-NOAA commented 7 months ago

Have created skeleton release notes, which will need developer inputs to fill out and complete. Have also updated versions/run.ver to use tentative new ops version v16.3.14. Will confirm and update if needed.

KateFriedman-NOAA commented 7 months ago

Merged PR #2396 into release branch. Awaiting follow-up PR from @emilyhcliu with release notes updates. Also awaiting feedback from NCO on adding shared place settings into ecf scripts for shared jobs. Will update release notes with mention of that change if accepted.

KateFriedman-NOAA commented 7 months ago

After hand-off, NCO will add DBN alerts to WDQMS job scripts. Will pull those changes back into global-workflow when made.

KateFriedman-NOAA commented 6 months ago

Have merged PR #2428 into release branch. Awaiting feedback from NCO on version. Need to incorporate removal of ldebug flag into release branch (being removed in ops tomorrow 3/26).

Then will cut hand-off tag. @emilyhcliu were you planning to submit the CDF or shall I? Let me know, thanks!

KateFriedman-NOAA commented 6 months ago

This will tentatively be v16.3.15 now that the ldebug=true update went into ops as v16.3.14 (issue #2431).

KateFriedman-NOAA commented 6 months ago

Have renamed release branch and will update version in release branch before hand-off.

KateFriedman-NOAA commented 6 months ago

GFSv16.3.14 went into operations yesterday as a tiny upgrade to remove a flag from the wave post jobs. I have folded that update into the release/gfs.v16.3.15 (renamed) branch and updated the version to v16.3.15. I have cut a hand-off tag (EMC-v16.3.15).

KateFriedman-NOAA commented 6 months ago

@emilyhcliu The GFSv16.3.15 package is ready for hand-off to NCO (minus needing to fold back in the DBN alert changes from NCO later). I will plan to submit the CDF today. I am going on leave later this afternoon so please let me know by noon if you'd prefer to submit the CDF, thanks! I'll submit it early afternoon if I don't hear any objections.

emilyhcliu commented 6 months ago

@KateFriedman-NOAA Please ignore the e-mail I just sent you. I did not see the update on this issue in time.
Yes, please go ahead and submit the CDF. Thanks for your help!

ilianagenkova commented 6 months ago

@KateFriedman-NOAA , I assigned myself to this issue, just so I could "follow". Thanks for pointing me to it!

KateFriedman-NOAA commented 6 months ago

@emilyhcliu All good! Steven Earle finally got back to me on whether we should include the "shared" setting for jobs that don't currently set "shared" or "exclusive". He asked that we wait to include that during a future upgrade...so I will need to remove those edits, recut the hand-off tag, and then I will submit the CDF. I will update you when I've submitted the CDF. Thanks!

KateFriedman-NOAA commented 6 months ago

@emilyhcliu @aerorahul I have submitted the CDF for GFSv16.3.15 (add new WDQMS job). Thanks for your work on this! I will update this issue as it moves through the implementation process.

emilyhcliu commented 6 months ago

@KateFriedman-NOAA Thank you.

KateFriedman-NOAA commented 6 months ago

It was decided to get obsproc.v1.2 into ops before this update and without bundling them. The obsproc update (issue #2291) will be GFSv16.3.15 and this update will (most likely) now be GFSv16.3.16. Will update versions within accordingly.

emilyhcliu commented 6 months ago

Thanks @KateFriedman-NOAA. In the meantime, I will work with the WDQMS developer to fix the bug and ensure the code is bulletproof.

KateFriedman-NOAA commented 6 months ago

Thanks @emilyhcliu ! I am going to merge PR #2457 and then we'll have a follow-up PR to adjust the new version number and update the SUB_TYPE when we hear back from NCO.

emilyhcliu commented 6 months ago

Simon from NCO suggested that we change the WDQMS output filenames to follow NCO guidelines:

Here is the naming conventions standard for WCOSS2 model - "

B. File Name Conventions
Standard file naming conventions must also be used. File names must not contain special characters,
uppercase characters or the date (the directory in which the file resides will contain the date). File
names must indicate the name of the model run, the cycle, the type of data the file contains, the
resolution of the data (if applicable), other data related elements, the three-digit forecast hour the data
represents (if applicable), and the file type...."

Please refer to pages 5-6 of ImplementationStandards.v11.0.0.pdf

I will make changes to the output filenames accordingly.
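To make the quoted rules concrete, here is a minimal Python sketch that flags the three prohibitions in the excerpt (special characters, uppercase characters, and a date in the name). It rests on assumptions: "special characters" is taken to mean anything other than letters, digits, dot, underscore, and hyphen, and a "date" is approximated as any run of eight digits. It does not check the positive requirements (model run, cycle, data type, and so on), and it is not an NCO tool.

```python
import re

# Assumption: only letters, digits, '.', '_' and '-' count as non-special.
NON_SPECIAL = re.compile(r"[A-Za-z0-9._-]+")
# Assumption: a date is approximated as any run of eight digits (YYYYMMDD).
DATE_LIKE = re.compile(r"\d{8}")


def naming_violations(filename: str) -> list[str]:
    """Return the quoted prohibitions that filename appears to break."""
    problems = []
    if any(ch.isupper() for ch in filename):
        problems.append("uppercase characters")
    if not NON_SPECIAL.fullmatch(filename):
        problems.append("special characters")
    if DATE_LIKE.search(filename):
        problems.append("date in file name")
    return problems
```

Under these assumptions, a report named like NCEP_TEMP_20240702_12.csv is flagged for uppercase characters and for containing the date, which is consistent with the request to rename the outputs.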

aerorahul commented 6 months ago

@emilyhcliu NCO is correct. We retained the output filenames coming from the program, but we should absolutely follow the conventions now that we expect them to be part of ops. We will address this prior to code handoff once the issues in the processing are sorted out.

SimonHsiao-NCO commented 6 months ago

Please add the "Script Documentation Block" with script name, description, Author, Abstract, etc. to the new scripts/exgdas_atmos_analysis_wdqms.sh and ush/wdqms.py scripts.

ilianagenkova commented 6 months ago

@emilyhcliu and @aerorahul , thanks for working out the code kinks and addressing the file names requirements.

At the moment we push the WDQMS reports to the emcrzdm ftp for ECMWF to pick up. Once they are generated in the GFS workflow, how will they be shared with ECMWF? Via NOMADS? (If so, the files need to be DBnet flagged from within GFS.)

aerorahul commented 6 months ago

@ilianagenkova There are 2 options:

  1. alert ECMWF that the files will be at a different location. Specifically somewhere under here
  2. If a DBNAlert is created then they will get notified of it based on how DBN works (I do not know)

The first one is easy, and we are waiting for more information on the second.

KateFriedman-NOAA commented 6 months ago

Updated tag for hand-off: https://github.com/NOAA-EMC/global-workflow/releases/tag/EMC-v16.3.16

KateFriedman-NOAA commented 5 months ago

Updated hand-off tag after PR #2519 : https://github.com/NOAA-EMC/global-workflow/releases/tag/EMC-v16.3.16

KateFriedman-NOAA commented 5 months ago

@emilyhcliu I merged your PR and recut the EMC-v16.3.16 tag. @SimonHsiao-NCO can now run the git pull origin EMC-v16.3.16 again to pull in your error handling updates.

KateFriedman-NOAA commented 4 months ago

Tag for NCO has been recut: https://github.com/NOAA-EMC/global-workflow/releases/tag/EMC-v16.3.16

Will notify SPA Simon.

KateFriedman-NOAA commented 4 months ago

The RFC/implementation is currently planned for June 26th.

KateFriedman-NOAA commented 4 months ago

SCN link: https://www.weather.gov/media/notification/pdf_2023_24/scn24-65_gfs_v16.3.16.pdf

KateFriedman-NOAA commented 4 months ago

From SPA Simon: "The RFC for this GFS v16.3.16 WDQMS update will be 6/26 Wed 1400-1900z."

KateFriedman-NOAA commented 3 months ago

From the Friday June 21st RFC memo:

RFC 12812 - On WCOSS2, implement GFS v16.3.16. The GFS v16.3.16 update adds the GDAS WDQMS (WIGOS Data Quality Monitoring System) data which is 6-hourly summaries of observation reports. The objective of this new job is to:

● support the WMO Integrated Global Observing System (WIGOS) WDQMS project
● ensure that WMO observational data and products are reliable and correspond to agreed-upon needs
● 6-hourly summaries include surface (synop), upper-air, and surface marine (ship and buoy) observations

To be implemented on June 26, 1400Z to 1900Z. See the SCN at: https://www.weather.gov/media/notification/pdf_2023_24/scn24-65_gfs_v16.3.16.pdf

KateFriedman-NOAA commented 3 months ago

From NCO:

Due to recovery from the CPRK data center outage, this GFS v16.3.16 WDQMS update RFC is postponed to next week, 7/2 Tuesday 1400-2000z, starting from the 12z GFS jobs. Dataflow will submit the updated SCN today.

KateFriedman-NOAA commented 3 months ago

RFC from last Friday:

RFC 12812 - On WCOSS2, implement GFS v16.3.16. The GFS v16.3.16 update adds the GDAS WDQMS (WIGOS Data Quality Monitoring System) data which is 6-hourly summaries of observation reports. The objective of this new job is to:

● support the WMO Integrated Global Observing System (WIGOS) WDQMS project
● ensure that WMO observational data and products are reliable and correspond to agreed-upon needs
● 6-hourly summaries include surface (synop), upper-air, and surface marine (ship and buoy) observations.

Was to be implemented on June 26, 1400Z to 1900Z; postponed due to College Park hardware issues; the new date is July 2, 1400Z to 2000Z. See the SCN at: https://www.weather.gov/media/notification/pdf_2023_24/scn24-65_gfs_v16.3.16.pdf

ilianagenkova commented 3 months ago

Will the GFS workflow push the WDQMS reports to emcrzdm as is done now by /lfs/h2/emc/obsproc/noscrub/ashley.stanfield/wdqms_new/scripts/WDQMS_new.sh, or will another mechanism be implemented?

We (@AshleyStanfield-NOAA and I) need to inform ECMWF if the location of the reports (on ftp) has changed.

aerorahul commented 3 months ago

@ilianagenkova No. The reports will be available on the NOMADS site.

emilyhcliu commented 3 months ago

Iliana,

ECMWF will be picking up the EMC WDQMS reports from NOMADS. In parallel, we will keep the current ObsProc processing of WDQMS (Ashley's processing) running for a while until Cristina gives us a green light to turn it off.

Emily

KateFriedman-NOAA commented 3 months ago

SPA Simon confirmed implementation into ops:

Hi All,
The GFS v16.3.16 update adding gdas wdqms data RFC is implemented starting from 12z GFS and please help to check the production 12z GFS/GDAS data to ensure the RFC implementation.
Please let me know if you see any issue or any questions.

Thanks,

/Simon
SPA Office

KateFriedman-NOAA commented 3 months ago

Additional update from NCO:

The 12z gdas_atmos_analysis_wdqms job ran complete and the gdas_atmos_analysis_wdqms products are created and posted on the following:

https://nomads.ncep.noaa.gov/pub/data/nccf/com/gfs/v16.3/gdas.20240702/12/atmos/wdqms/
https://ftp.ncep.noaa.gov/data/nccf/com/gfs/v16.3/gdas.20240702/12/atmos/wdqms/
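For downstream users who need to locate the products for other cycles, the directory pattern in the two URLs above can be generalized as in the Python sketch below. The base URLs are copied from the NCO message; the function name is hypothetical, and whether a given date is still available on either server is not guaranteed.

```python
from datetime import datetime

# Base URLs copied from the NCO message above.
NOMADS_BASE = "https://nomads.ncep.noaa.gov/pub/data/nccf/com/gfs/v16.3"
FTP_BASE = "https://ftp.ncep.noaa.gov/data/nccf/com/gfs/v16.3"
CYCLES = ("00", "06", "12", "18")


def wdqms_dir_url(base: str, ymd: str, cycle: str) -> str:
    """Build the WDQMS product directory URL for one GDAS cycle (hypothetical helper)."""
    datetime.strptime(ymd, "%Y%m%d")  # raises ValueError if ymd does not parse as a date
    if cycle not in CYCLES:
        raise ValueError(f"cycle must be one of {CYCLES}, got {cycle!r}")
    return f"{base.rstrip('/')}/gdas.{ymd}/{cycle}/atmos/wdqms/"
```

For example, `wdqms_dir_url(NOMADS_BASE, "20240702", "12")` reproduces the first URL posted above.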

FYI @ilianagenkova

KateFriedman-NOAA commented 3 months ago

Tag cut, released, and announced to users: https://github.com/NOAA-EMC/global-workflow/releases/tag/gfs.v16.3.16

ilianagenkova commented 3 months ago

Reports for 20240702 12z generated with Ashley's scripts and with GFS are identical.

aerorahul commented 3 months ago

Can this issue be marked as solved?

ilianagenkova commented 3 months ago

Yes from me.