NOAA-EMC / global-workflow

Global Superstructure/Workflow supporting the Global Forecast System (GFS)
https://global-workflow.readthedocs.io/en/latest
GNU Lesser General Public License v3.0

Archiving cleanup #2621

Closed DavidHuber-NOAA closed 4 months ago

DavidHuber-NOAA commented 4 months ago

Description

1) Adds a lot of comments to the jinja templates for archiving
2) Rearranges the gdas and enkf templates to a more logical order
3) Fixes a couple of bugs in the enkf archiving of increments and analyses
4) Disables archiving for the half cycle
5) Removes the FITSARC key from config.base and arcdir.yaml.j2, instead relying on DO_FIT2OBS
6) Updates wxflow to add the option to not allow undefined variables when parsing jinja templates and invokes this feature when running archives

Resolves #2612
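
Item 6 is about failing fast when a template references a variable that was never defined, instead of rendering something silently wrong. The sketch below illustrates that idea with Python's standard-library `string.Template` as a stand-in. It is not wxflow's API and not Jinja itself; the template text, the `render_strict` helper, and the values for `ROTDIR` are assumptions made up for illustration.

```python
from string import Template

# Hypothetical archive entry; "head" is intentionally left undefined below.
entry = Template("$ROTDIR/analysis/ocean/${head}ocn.incr.nc")
context = {"ROTDIR": "/path/to/rotdir"}  # assumed value, "head" is missing

# Lenient rendering: the undefined placeholder survives unnoticed.
lenient = entry.safe_substitute(context)


def render_strict(template: Template, variables: dict) -> str:
    """Render the template, raising if any variable is undefined."""
    try:
        return template.substitute(variables)
    except KeyError as err:
        raise ValueError(f"undefined template variable: {err.args[0]}") from err


# Strict rendering: the same entry is rejected before it reaches the archive step.
try:
    render_strict(entry, context)
    strict_error = None
except ValueError as err:
    strict_error = str(err)

print(lenient)
print(strict_error)
```

With lenient rendering the bad path would only surface later, as a missing-file error during archiving; strict rendering points at the template variable itself.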

Type of change

Change characteristics

How has this been tested?

Cycled test on Hera

Checklist

AndrewEichmann-NOAA commented 4 months ago

A test run with hash 7d2c539f45194cd4e5b21bfd4b83a9480189cd0f (May 21) had arch fail on the first half-cycle with:

  File "/scratch1/NCEPDEV/da/Andrew.Eichmann/fv3gfs/develop/global-workflow/ush/python/pygfs/task/archive.py", line 199, in _create_fileset
    raise FileNotFoundError(f"FATAL ERROR: Required file, directory, or glob {item} not found!")
FileNotFoundError: FATAL ERROR: Required file, directory, or glob gdas.20210630/00/model_data/ocean/history/gdas.ocean.t00z.inst.f000.nc not found!

In the two subsequent cycles arch fails with the following:

  File "/scratch1/NCEPDEV/da/Andrew.Eichmann/fv3gfs/develop/global-workflow/ush/python/pygfs/task/archive.py", line 199, in _create_fileset
    raise FileNotFoundError(f"FATAL ERROR: Required file, directory, or glob {item} not found!")
FileNotFoundError: FATAL ERROR: Required file, directory, or glob gdas.20210630/06/analysis/ocean/gdas.t06z.ocn.adt_rads_all.stats.csv not found!

@guillaumevernieres tells me that the csv file in question is produced only on 00 cycles.

guillaumevernieres commented 4 months ago

@DavidHuber-NOAA, there's a bug in the archiving of the csv files. Applying this patch should fix the issue:

diff --git a/parm/archive/gdasocean_analysis.yaml.j2 b/parm/archive/gdasocean_analysis.yaml.j2
index 0c43cd40..042f434e 100644
--- a/parm/archive/gdasocean_analysis.yaml.j2
+++ b/parm/archive/gdasocean_analysis.yaml.j2
@@ -9,7 +9,7 @@ gdasocean_analysis:
         {% for domain in ["ocn", "ice"] %}
         - '{{ COM_OCEAN_ANALYSIS | relpath(ROTDIR) }}/{{ head }}{{domain}}.bkgerr_stddev.nc'
         - '{{ COM_OCEAN_ANALYSIS | relpath(ROTDIR) }}/{{ head }}{{domain}}.incr.nc'
-        - '{{ COM_OCEAN_ANALYSIS | relpath(ROTDIR) }}/{{ head }}{{domain}}ana.nc'
+<        - '{{ COM_OCEAN_ANALYSIS | relpath(ROTDIR) }}/{{ head }}{{domain}}ana.nc'
         {% if NMEM_ENS > 2 %}
         - '{{ COM_OCEAN_ANALYSIS | relpath(ROTDIR) }}/{{ head }}{{domain}}.recentering_error.nc'
         {% endif %}
@@ -20,8 +20,6 @@ gdasocean_analysis:
         - '{{ COM_OCEAN_ANALYSIS | relpath(ROTDIR) }}/{{ head }}ocn.ssh_total_stddev.nc'
         - '{{ COM_OCEAN_ANALYSIS | relpath(ROTDIR) }}/{{ head }}ocn.steric_explained_variance.nc'
         {% endif %}
-        - '{{ COM_OCEAN_ANALYSIS | relpath(ROTDIR) }}/{{ head }}ocn.adt_rads_all.stats.csv'
-        - '{{ COM_OCEAN_ANALYSIS | relpath(ROTDIR) }}/{{ head }}ocn.icec_amsr2_north.stats.csv'
-        - '{{ COM_OCEAN_ANALYSIS | relpath(ROTDIR) }}/{{ head }}ocn.icec_amsr2_south.stats.csv'
+        - '{{ COM_OCEAN_ANALYSIS | relpath(ROTDIR) }}/*.stats.csv'
         - '{{ COM_OCEAN_ANALYSIS | relpath(ROTDIR) }}/diags/*.nc4'
         - '{{ COM_OCEAN_ANALYSIS | relpath(ROTDIR) }}/yaml/*.yaml'
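
The effect of replacing the three explicit csv names with a `*.stats.csv` pattern can be sketched as follows. This is a minimal illustration, not the code in `archive.py`: the `resolve_entries` helper is hypothetical, it only mimics the error message seen in the traceback above, and the file names are placeholders for a cycle that produced only some of the statistics files.

```python
import glob
import os
import tempfile


def resolve_entries(root: str, entries: list[str]) -> list[str]:
    """Expand each entry relative to root; raise if an entry matches nothing."""
    found = []
    for item in entries:
        matches = sorted(glob.glob(os.path.join(root, item)))
        if not matches:
            raise FileNotFoundError(
                f"Required file, directory, or glob {item} not found!"
            )
        found.extend(os.path.relpath(match, root) for match in matches)
    return found


with tempfile.TemporaryDirectory() as root:
    # Hypothetical cycle that produced only one of the statistics files.
    open(os.path.join(root, "gdas.t06z.ocn.icec_amsr2_north.stats.csv"), "w").close()

    # Explicit names: the first missing file aborts the whole fileset.
    explicit = [
        "gdas.t06z.ocn.adt_rads_all.stats.csv",
        "gdas.t06z.ocn.icec_amsr2_north.stats.csv",
    ]
    try:
        resolve_entries(root, explicit)
        explicit_ok = True
    except FileNotFoundError:
        explicit_ok = False

    # Glob pattern: picks up whichever statistics files exist for this cycle.
    globbed = resolve_entries(root, ["*.stats.csv"])

print(explicit_ok, globbed)
```

Note that in this sketch a pattern matching nothing still raises, in line with the "file, directory, or glob" wording of the error; whether an empty match is tolerated in the real workflow is not shown in this thread.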

DavidHuber-NOAA commented 4 months ago

Thanks for the heads up @AndrewEichmann-NOAA @guillaumevernieres. This PR will disable archiving for the first half cycle, so that should resolve the missing gdas.ocean.t00z.inst.f000.nc error.

For the csv files, I would like to test out the suggested change. Could you share your experiment setup so I can run a test case?
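
As a rough picture of what skipping the archive on the first half cycle could look like, here is a minimal sketch. The `should_archive` function and its arguments are hypothetical and only compare the current cycle with the experiment's first cycle; the mechanism this PR actually uses is not shown in this thread.

```python
from datetime import datetime


def should_archive(current_cycle: datetime, first_cycle: datetime) -> bool:
    """Return False for the first (half) cycle of an experiment, True afterwards."""
    return current_cycle > first_cycle


# Dates taken from the tracebacks above: 2021-06-30 00z is the half cycle.
first = datetime(2021, 6, 30, 0)
half_cycle_archived = should_archive(datetime(2021, 6, 30, 0), first)
next_cycle_archived = should_archive(datetime(2021, 6, 30, 6), first)

print(half_cycle_archived, next_cycle_archived)
```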

guillaumevernieres commented 4 months ago

No problem, and thanks for doing this work. This was discovered while running at c384/0.25, but it could be tested at lower resolution with the existing test. You'll just need to extend the date, since it only runs 1.5 cycles:

https://github.com/NOAA-EMC/global-workflow/blob/7d2c539f45194cd4e5b21bfd4b83a9480189cd0f/ci/cases/pr/C48mx500_3DVarAOWCDA.yaml#L14

guillaumevernieres commented 4 months ago

Oops ... one more thing @DavidHuber-NOAA: it turns out we don't have enough marine obs in glopara to run enough cycles to trigger the archiving. You'll have to update DMPDIR: https://github.com/NOAA-EMC/global-workflow/blob/7d2c539f45194cd4e5b21bfd4b83a9480189cd0f/parm/config/gfs/yaml/defaults.yaml#L55

to

DMPDIR: /scratch1/NCEPDEV/da/common/

DavidHuber-NOAA commented 4 months ago

OK, thanks for the heads up @guillaumevernieres.

DavidHuber-NOAA commented 4 months ago

@guillaumevernieres Alright, I have a fix in for the csv files.

emcbot commented 4 months ago

CI Update on Wcoss2 at 05/31/24 01:32:12 PM
============================================
Cloning and Building global-workflow PR: 2621
with PID: 46357 on host: clogin01

emcbot commented 4 months ago

Automated global-workflow Testing Results:


Machine: Wcoss2
Start: Fri May 31 13:40:46 UTC 2024 on clogin01
---------------------------------------------------
Build: Completed at 05/31/24 01:52:25 PM
Case setup: Completed for experiment C48_ATM_98051ebd
Case setup: Skipped for experiment C48mx500_3DVarAOWCDA_98051ebd
Case setup: Skipped for experiment C48_S2SWA_gefs_98051ebd
Case setup: Completed for experiment C48_S2SW_98051ebd
Case setup: Completed for experiment C96_atm3DVar_extended_98051ebd
Case setup: Skipped for experiment C96_atm3DVar_98051ebd
Case setup: Skipped for experiment C96_atmaerosnowDA_98051ebd
Case setup: Completed for experiment C96C48_hybatmDA_98051ebd
Case setup: Skipped for experiment C96C48_ufs_hybatmDA_98051ebd

emcbot commented 4 months ago

Experiment C48_ATM_98051ebd SUCCESS on Wcoss2 at 05/31/24 03:03:37 PM

emcbot commented 4 months ago

Experiment C48_S2SW_98051ebd SUCCESS on Wcoss2 at 05/31/24 03:51:13 PM

emcbot commented 4 months ago

Experiment C96C48_hybatmDA_98051ebd SUCCESS on Wcoss2 at 05/31/24 04:36:28 PM

DavidHuber-NOAA commented 4 months ago

@TerrenceMcGuinness-NOAA It looks like the builds failed on Hera. Could you reboot CI on that platform?

DavidHuber-NOAA commented 4 months ago

@TerrenceMcGuinness-NOAA Disregard, I see now that CI was restarted on Hera already.

emcbot commented 4 months ago

CI Passed Hera at
Built and ran in directory /scratch1/NCEPDEV/global/CI/2621

emcbot commented 4 months ago

CI Passed Hercules at
Built and ran in directory /work2/noaa/stmp/CI/HERCULES/2621

emcbot commented 4 months ago

Experiment C96_atm3DVar_extended_98051ebd SUCCESS on Wcoss2 at 05/31/24 10:36:34 PM

emcbot commented 4 months ago

All CI Test Cases Passed on Wcoss2:


Experiment C48_ATM_98051ebd *** SUCCESS *** at 05/31/24 03:03:37 PM
Experiment C48_S2SW_98051ebd *** SUCCESS *** at 05/31/24 03:51:13 PM
Experiment C96C48_hybatmDA_98051ebd *** SUCCESS *** at 05/31/24 04:36:28 PM
Experiment C96_atm3DVar_extended_98051ebd *** SUCCESS *** at 05/31/24 10:36:34 PM

emcbot commented 4 months ago

CI Passed Orion at
Built and ran in directory /work2/noaa/stmp/CI/ORION/2621