NOAA-EMC / global-workflow

Global Superstructure/Workflow supporting the Global Forecast System (GFS)
https://global-workflow.readthedocs.io/en/latest
GNU Lesser General Public License v3.0

gdascleanup and enkfgdascleanup failures #2880

Closed: XuanliLi-NOAA closed this issue 1 week ago

XuanliLi-NOAA commented 2 weeks ago

What is wrong?

The gdascleanup and enkfgdascleanup jobs failed with a recent build of the global workflow. In every cycle, the gdas and enkfgdas directories are not being scrubbed correctly.

Here are the error messages:

gdascleanup.log: exglobal_cleanup.sh[49]: find_exclude_string=' -name prepbufr -or -name prepbufr -or -name cnvstat -or -name prepbufr -or -name prepbufr -or -name cnvstat -or -name *atmanl.nc '

enkfgdascleanup.log:

What should have happened?

The gdas and enkfgdas directories should be cleaned.

What machines are impacted?

Hera

Steps to reproduce

Run the global workflow (hash ea22a737ee9a815f1f294141abf85e0d1515868f) on Hera at resolution C384+C192.

Additional information

N/A

Do you have a proposed solution?

No response

WalterKolczynski-NOAA commented 1 week ago

Is 20220824/18 a date that was actually run, or is it just trying to delete data from before the start of the experiment period?

XuanliLi-NOAA commented 1 week ago

Cycle 20220824/18 is one that actually ran; I couldn't go any further because of the disk limit. I manually deleted the directories under 00, 06, and 12 to free up space, but I kept mem001 in those directories in case you need to see which files were stored.

DavidHuber-NOAA commented 1 week ago

I (unintentionally) replicated this issue on WCOSS2 just now. The issue is that exglobal_cleanup.sh deletes its own working directory (${DATAROOT}/cleanup.${jobid}) when it deletes ${DATAROOT}: https://github.com/NOAA-EMC/global-workflow/blob/2e4f4b7671cfec83331ec26de39218543d7b9a6d/scripts/exglobal_cleanup.sh#L19
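To illustrate the failure mode described above, here is a minimal sketch. It is not the actual contents of exglobal_cleanup.sh: `demo_root` and `workdir` are hypothetical stand-ins for `${DATAROOT}` and `${DATAROOT}/cleanup.${jobid}`, created under a throwaway temporary directory. The second half shows one possible guard (stepping out of the tree before removing it), which is an assumption for illustration and not necessarily the fix the project adopted.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins for ${DATAROOT} and ${DATAROOT}/cleanup.${jobid}.
# Everything lives under a mktemp directory, so no real data is touched.
demo_root="$(mktemp -d)"
workdir="${demo_root}/cleanup.demo"
mkdir -p "${workdir}"

# Failure mode: remove the parent tree while the shell is still inside it.
cd "${workdir}" || exit 1
rm -rf "${demo_root}"
if touch marker 2>/dev/null; then
  in_place_result="ok"
else
  in_place_result="failed"   # the current directory no longer exists
fi
cd /                         # leave the deleted directory before continuing

# Possible guard (an assumption, not the project's confirmed fix):
# move to a directory outside the tree before deleting it.
demo_root="$(mktemp -d)"
workdir="${demo_root}/cleanup.demo"
safe_dir="$(mktemp -d)"
mkdir -p "${workdir}"
cd "${workdir}" || exit 1
cd "${safe_dir}" || exit 1
rm -rf "${demo_root}"
if touch marker 2>/dev/null; then
  guarded_result="ok"
else
  guarded_result="failed"
fi

# Clean up the scratch directory used by the guarded variant.
cd / && rm -rf "${safe_dir}"
echo "in place: ${in_place_result}, guarded: ${guarded_result}"
```

On a typical Linux system, the first `touch` should fail because its working directory has been removed out from under it, while the second should succeed. Any later step in a job that relies on the current directory would be affected in the same way as the first case.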