Closed DavidNew-NOAA closed 1 week ago
@DavidHuber-NOAA Thanks for the suggestions. I tried tabbing everything for readability, but it generated errors. Let me retry tabbing things, and if I get the same errors, maybe I can run them by you and get some feedback on debugging.
@DavidHuber-NOAA So here's the error I get when I tab out the for-loops in parm/gdas/staging/atm_var_bkg.yaml.j2. I'm wondering if Jinja2 is expecting a certain number of spaces or something of that nature.
"expected <block end>, but found %r" % token.id, token.start_mark)
yaml.parser.ParserError: while parsing a block collection
in "<unicode string>", line 12, column 4:
- ['/work/noaa/da/dnew/global-wo ...
^
expected <block end>, but found '<block sequence start>'
in "<unicode string>", line 16, column 10:
- ['/work/noaa/da/dnew/global-wo ...
^
@DavidNew-NOAA Tabbing should only be applied to lines of Jinja code. The yaml-specific lines have to have their tabbing maintained to be in the expected format. Here is an example: https://github.com/NOAA-EMC/global-workflow/blob/e7909af8d9e1f34140388a3f8556d8e582c58fe5/parm/archive/arcdir.yaml.j2#L24-L28
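To illustrate the rule above, here is a minimal sketch of a helper that indents only the Jinja statement lines of a template and passes every YAML line through untouched. The function name indent_jinja_only, the keyword sets, and the sample template are hypothetical and are not part of global-workflow or wxflow. The sketch assumes one Jinja statement per line. Whether the spaces in front of an indented tag reach the rendered output depends on the Jinja2 environment settings (for example lstrip_blocks), which I have not checked for wxflow.

```python
import re

# Jinja statement keywords that open, close, or continue a block.
OPENERS = {"for", "if"}
CLOSERS = {"endfor", "endif"}
CONTINUERS = {"else", "elif"}

# Matches a line whose first non-blank text is a Jinja statement tag, e.g. "{% for".
STATEMENT = re.compile(r"^\s*\{%-?\s*(\w+)")


def indent_jinja_only(template: str, width: int = 4) -> str:
    """Indent Jinja statement lines by nesting depth; leave YAML lines as-is."""
    depth = 0
    out = []
    for line in template.splitlines():
        match = STATEMENT.match(line)
        if match is None:
            # YAML content (including lines with {{ ... }}): its indentation is
            # structural, so it is passed through unchanged.
            out.append(line)
            continue
        keyword = match.group(1)
        if keyword in CLOSERS:
            depth = max(depth - 1, 0)
        # else/elif sit at the same level as the statement that opened the block.
        level = depth - 1 if keyword in CONTINUERS else depth
        out.append(" " * (width * max(level, 0)) + line.strip())
        if keyword in OPENERS:
            depth += 1
    return "\n".join(out) + "\n"


sample = (
    "copy:\n"
    "{% for mem in members %}\n"
    "{% if mem > 0 %}\n"
    "- [src_{{ mem }}, dst_{{ mem }}]\n"
    "{% endif %}\n"
    "{% endfor %}\n"
)
print(indent_jinja_only(sample))
```

Only the nested if/endif lines move; the "- [...]" list item keeps its original column, which is what the YAML parser needs.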
@DavidHuber-NOAA Ah, thank you. I struggled a lot last week trying to figure out tabbing.
@aerorahul Done
Hera test
Install feature/stage_from_yaml at 7aa041e0 on Hera in /scratch1/NCEPDEV/da/role.jedipara/git/global-workflow/stage_from_yaml. Run C96C48_ufs_hybatmDA CI. 20240224 00Z gdasatmanlinit, enkfgdasatmensanlinit, and gfsatmanlinit abort with the following messages:
gdasatmanlinit
Traceback (most recent call last):
File "/scratch1/NCEPDEV/da/role.jedipara/git/global-workflow/stage_from_yaml/scripts/exglobal_atm_analysis_initialize.py", line 23, in <module>
AtmAnl = AtmAnalysis(config)
File "/scratch1/NCEPDEV/da/python/gdasapp/wxflow/20240307/src/wxflow/logger.py", line 266, in wrapper
retval = func(*args, **kwargs)
File "/scratch1/NCEPDEV/da/role.jedipara/git/global-workflow/stage_from_yaml/ush/python/pygfs/task/atm_analysis.py", line 29, in __init__
super().__init__(config)
File "/scratch1/NCEPDEV/da/role.jedipara/git/global-workflow/stage_from_yaml/ush/python/pygfs/task/analysis.py", line 30, in __init__
self.gdasapp_j2tmpl_dir = os.path.join(self.task_config.PARMgfs, 'gdas')
AttributeError: 'AtmAnalysis' object has no attribute 'task_config'
+ JGLOBAL_ATM_ANALYSIS_INITIALIZE[1]: postamble JGLOBAL_ATM_ANALYSIS_INITIALIZE 1718909230 1
enkfgdasatmensanlinit
Traceback (most recent call last):
File "/scratch1/NCEPDEV/da/role.jedipara/git/global-workflow/stage_from_yaml/scripts/exglobal_atmens_analysis_initialize.py", line 23, in <module>
AtmEnsAnl = AtmEnsAnalysis(config)
File "/scratch1/NCEPDEV/da/python/gdasapp/wxflow/20240307/src/wxflow/logger.py", line 266, in wrapper
retval = func(*args, **kwargs)
File "/scratch1/NCEPDEV/da/role.jedipara/git/global-workflow/stage_from_yaml/ush/python/pygfs/task/atmens_analysis.py", line 30, in __init__
super().__init__(config)
File "/scratch1/NCEPDEV/da/role.jedipara/git/global-workflow/stage_from_yaml/ush/python/pygfs/task/analysis.py", line 30, in __init__
self.gdasapp_j2tmpl_dir = os.path.join(self.task_config.PARMgfs, 'gdas')
AttributeError: 'AtmEnsAnalysis' object has no attribute 'task_config'
+ JGLOBAL_ATMENS_ANALYSIS_INITIALIZE[1]: postamble JGLOBAL_ATMENS_ANALYSIS_INITIALIZE 1718909230 1
gfsatmanlinit
Traceback (most recent call last):
File "/scratch1/NCEPDEV/da/role.jedipara/git/global-workflow/stage_from_yaml/scripts/exglobal_atm_analysis_initialize.py", line 23, in <module>
AtmAnl = AtmAnalysis(config)
File "/scratch1/NCEPDEV/da/python/gdasapp/wxflow/20240307/src/wxflow/logger.py", line 266, in wrapper
retval = func(*args, **kwargs)
File "/scratch1/NCEPDEV/da/role.jedipara/git/global-workflow/stage_from_yaml/ush/python/pygfs/task/atm_analysis.py", line 29, in __init__
super().__init__(config)
File "/scratch1/NCEPDEV/da/role.jedipara/git/global-workflow/stage_from_yaml/ush/python/pygfs/task/analysis.py", line 30, in __init__
self.gdasapp_j2tmpl_dir = os.path.join(self.task_config.PARMgfs, 'gdas')
AttributeError: 'AtmAnalysis' object has no attribute 'task_config'
+ JGLOBAL_ATM_ANALYSIS_INITIALIZE[1]: postamble JGLOBAL_ATM_ANALYSIS_INITIALIZE 1718909230 1
@RussTreadon-NOAA This PR also updates the wxflow hash. I missed the latest commit and just re-updated. task_config is now created in the wxflow Task class rather than in the Analysis subclasses in G-W.
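A minimal sketch of why the base-class version matters, using stand-in classes rather than the real wxflow code. OldTask, NewTask, and make_analysis are hypothetical names; only the task_config.PARMgfs access is taken from the traceback above. If the base class that actually gets imported does not build task_config, the subclass fails exactly as in the log.

```python
import os
from types import SimpleNamespace


class OldTask:
    """Stand-in for a base class that stores only the raw config."""

    def __init__(self, config):
        self._config = dict(config)


class NewTask(OldTask):
    """Stand-in for a base class that also builds task_config."""

    def __init__(self, config):
        super().__init__(config)
        self.task_config = SimpleNamespace(**self._config)


def make_analysis(base):
    """Build an Analysis-like subclass on top of the given base class."""

    class Analysis(base):
        def __init__(self, config):
            super().__init__(config)
            # Mirrors the failing line in the traceback: it relies on the
            # base class having created task_config already.
            self.gdasapp_j2tmpl_dir = os.path.join(self.task_config.PARMgfs, "gdas")

    return Analysis


config = {"PARMgfs": "/path/to/parm"}
print(make_analysis(NewTask)(config).gdasapp_j2tmpl_dir)
try:
    make_analysis(OldTask)(config)
except AttributeError as err:
    print("older base class:", err)
```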
Thank you @DavidNew-NOAA. Updated to e175b252. Rewound and rebooted the failed gdasatmanlinit jobs. Same error in the log file:
File "/scratch1/NCEPDEV/da/role.jedipara/git/global-workflow/stage_from_yaml/ush/python/pygfs/task/atm_analysis.py", line 29, in __init__
super().__init__(config)
File "/scratch1/NCEPDEV/da/role.jedipara/git/global-workflow/stage_from_yaml/ush/python/pygfs/task/analysis.py", line 30, in __init__
self.gdasapp_j2tmpl_dir = os.path.join(self.task_config.PARMgfs, 'gdas')
AttributeError: 'AtmAnalysis' object has no attribute 'task_config'
I confirmed that sorc/wxflow is at the specified hash:
Hera(hfe08):/scratch1/NCEPDEV/da/role.jedipara/git/global-workflow/stage_from_yaml/sorc/wxflow$ git branch
* (HEAD detached at 5dad7dd)
develop
@RussTreadon-NOAA Ah yes, I forgot. hera.intel.lua loads wxflow as a hack here. GDASApp on Hera is not actually using the wxflow in the G-W.
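For anyone following along, here is a small sketch of the search-order effect behind this: when two directories on the Python search path both provide a package with the same name, the earlier directory wins. The package name fake_wxflow and the directory names hack and submodule are hypothetical stand-ins created in a temporary directory; nothing here touches the real wxflow installs.

```python
import importlib.machinery
import pathlib
import tempfile


def first_match(module_name, search_path):
    """Return the file that would be imported for module_name on search_path."""
    entries = [str(entry) for entry in search_path]
    spec = importlib.machinery.PathFinder.find_spec(module_name, entries)
    return None if spec is None else spec.origin


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        root = pathlib.Path(tmp)
        # Two copies of the same package, one per search-path directory.
        for name in ("hack", "submodule"):
            package = root / name / "fake_wxflow"
            package.mkdir(parents=True)
            (package / "__init__.py").write_text(f"WHERE = {name!r}\n")
        hack, submodule = root / "hack", root / "submodule"
        # A prepended directory is searched first, so it shadows the other copy.
        prepended = first_match("fake_wxflow", [hack, submodule])
        without_hack = first_match("fake_wxflow", [submodule])
        return (
            pathlib.Path(prepended).parts[-3],
            pathlib.Path(without_hack).parts[-3],
        )


print(demo())
```

The first element of the returned pair names the directory that wins while the prepended entry is present; the second shows what is found once it is removed.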
@DavidNew-NOAA I commented out the Hera wxflow hack, rewound, and rebooted. This worked! The CI test is once again running. I'll check for completion later tonight.
Hera C96C48_ufs_hybatmDA CI
All jobs successfully ran to completion with the wxflow hack commented out in sorc/gdas.cd/modulefiles/GDAS/hera.intel.lua.
Hera(hfe07):/scratch1/NCEPDEV/stmp2/role.jedipara/EXPDIR/prstage$ rocotostat -d prstage.db -w prstage.xml -c all -s
CYCLE STATE ACTIVATED DEACTIVATED
202402231800 Done Jun 20 2024 17:18:52 Jun 20 2024 18:30:18
202402240000 Done Jun 20 2024 17:18:52 Jun 20 2024 23:50:17
Similar wxflow hacks are also in gaea.intel.lua and noaacloud.intel.lua. However, the wxflow hack lines in these modulefiles are commented out. A GDASApp issue and PR should be opened to remove wxflow hacks, both active and commented out.
CI was run under role.jedipara. This account cannot use the fv3-cpu accounting code. It can only use da-cpu. Added ACCOUNT_SERVICE to ci/cases/yamls/ufs_hybatmDA_defaults.ci.yaml to set the service queue accounting code:
@@ -5,6 +5,7 @@ base:
DO_JEDIATMVAR: "YES"
DO_JEDIATMENS: "YES"
ACCOUNT: {{ 'HPC_ACCOUNT' | getenv }}
+ ACCOUNT_SERVICE: {{ 'HPC_ACCOUNT_SERVICE' | getenv }}
atmanl:
LAYOUT_X_ATMANL: 4
LAYOUT_Y_ATMANL: 4
Merge DavidNew-NOAA:feature/stage_from_yaml into RussTreadon-NOAA:feature/rename_atm. No conflicts. Will install local merge in role.jedipara and run GDASApp ctests plus C96C48_ufs_hybatmDA CI. Will push local merge to RussTreadon-NOAA:feature/rename_atm pending successful tests.
@DavidNew-NOAA and @CoryMartin-NOAA: ran test_gdasapp from install of DavidNew-NOAA:feature/stage_from_yaml inside g-w. 47 of 48 tests pass. The one failure is test_gdasapp_aero_gen_3dvar_yaml:
1869: Test command: /scratch1/NCEPDEV/da/role.jedipara/git/global-workflow/stage_from_yaml/sorc/gdas.cd/bundle/gdas/test/aero/genyaml_3dvar.sh "/scratch1/NCEPDEV/da/role.jedipara/git/global-workflow/stage_from_yaml/sorc/gdas.cd/build/gdas" "/scratch1/NCEPDEV/da/role.jedipara/git/global-workflow/stage_from_yaml/sorc/gdas.cd/bundle/gdas" "WORKING" "DIRECTORY" "/scratch1/NCEPDEV/da/role.jedipara/git/global-workflow/stage_from_yaml/sorc/gdas.cd/build/gdas/test/testrun/"
1869: Test timeout computed to be: 1500
1869: Traceback (most recent call last):
1869: File "<stdin>", line 1, in <module>
1869: ModuleNotFoundError: No module named 'wxflow'
1/1 Test #1869: test_gdasapp_aero_gen_3dvar_yaml ...***Failed 0.24 sec
Script sorc/gdas.cd/test/aero/genyaml_3dvar.sh contains
# run some python code to generate the YAML
python3 - <<EOF
from wxflow import parse_j2yaml
import datetime
valid_time_obj = datetime.datetime.strptime('$CDATE','%Y%m%d%H')
Without the wxflow hack in modulefiles/GDAS/hera.intel.lua, wxflow is not on PYTHONPATH, so this import fails.
sorc/gdas.cd/test/atm/global-workflow/jjob_var_run.sh contains
# Set python path for workflow utilities and tasks
wxflowPATH="${HOMEgfs}/ush/python:${HOMEgfs}/ush/python/wxflow"
PYTHONPATH="${PYTHONPATH:+${PYTHONPATH}:}${wxflowPATH}"
export PYTHONPATH
A similar approach could be considered for genyaml_3dvar.sh. This change goes in GDASApp, not g-w.
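For readers less familiar with the ${PYTHONPATH:+${PYTHONPATH}:} idiom in the snippet above, the following is a rough Python equivalent. It appends the two wxflow-related directories to an existing PYTHONPATH and avoids a stray leading separator when PYTHONPATH is unset or empty. The function name extend_pythonpath and the /path/to/g-w value are hypothetical; the directory layout is copied from the shell lines quoted above.

```python
import os


def extend_pythonpath(env, home_gfs):
    """Return a copy of env with the workflow Python directories appended."""
    wxflow_path = os.pathsep.join(
        [
            os.path.join(home_gfs, "ush", "python"),
            os.path.join(home_gfs, "ush", "python", "wxflow"),
        ]
    )
    current = env.get("PYTHONPATH", "")
    # Same effect as ${PYTHONPATH:+${PYTHONPATH}:}: add the old value and a
    # separator only when the old value is set and non-empty.
    prefix = current + os.pathsep if current else ""
    updated = dict(env)
    updated["PYTHONPATH"] = prefix + wxflow_path
    return updated


print(extend_pythonpath({}, "/path/to/g-w")["PYTHONPATH"])
print(extend_pythonpath({"PYTHONPATH": "/opt/lib"}, "/path/to/g-w")["PYTHONPATH"])
```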
Hera tests
Make the following local change to test/aero/genyaml_3dvar.sh in merged copy of feature/stage_from_yaml and feature/rename_atm
@@ -24,6 +24,15 @@ export YAMLout=$DATA/3dvar_gfs_aero.yaml
rm -rf $DATA
mkdir -p $DATA
+# Set g-w HOMEgfs
+topdir=$(cd "$(dirname "$(readlink -f -n "${bindir}" )" )/../../.." && pwd -P)
+export HOMEgfs=$topdir
+
+# Set python path for workflow utilities and tasks
+wxflowPATH="${HOMEgfs}/ush/python:${HOMEgfs}/ush/python/wxflow"
+PYTHONPATH="${PYTHONPATH:+${PYTHONPATH}:}${wxflowPATH}"
+export PYTHONPATH
+
# run some python code to generate the YAML
python3 - <<EOF
from wxflow import parse_j2yaml
This addition was made in response to removing the wxflow hack
+++ b/modulefiles/GDAS/hera.intel.lua
@@ -74,9 +74,6 @@ load("py-xarray/2023.7.0")
load("py-f90nml/1.4.3")
load("py-pip/23.1.2")
--- hack for wxflow
-prepend_path("PYTHONPATH", "/scratch1/NCEPDEV/da/python/gdasapp/wxflow/20240307/src")
-
setenv("CC","mpiicc")
setenv("FC","mpiifort")
setenv("CXX","mpiicpc")
from modulefiles/GDAS/hera.intel.lua
With these changes in place, rerun test_gdasapp ctests. 48 out of 48 tests pass.
Test project /scratch1/NCEPDEV/da/role.jedipara/git/global-workflow/merge/sorc/gdas.cd/build
Start 1488: test_gdasapp_util_coding_norms
1/48 Test #1488: test_gdasapp_util_coding_norms ........................ Passed 3.20 sec
...
Start 1869: test_gdasapp_aero_gen_3dvar_yaml
48/48 Test #1869: test_gdasapp_aero_gen_3dvar_yaml ...................... Passed 1.17 sec
100% tests passed, 0 tests failed out of 48
Label Time Summary:
gdas-utils = 11.66 sec*proc (11 tests)
script = 11.66 sec*proc (11 tests)
Total Test time (real) = 2082.20 sec
g-w C96C48_ufs_hybatmDA CI successfully ran all jobs.
Hera(hfe06):/scratch1/NCEPDEV/stmp2/role.jedipara/EXPDIR/prmerge$ rocotostat -d prmerge.db -w prmerge.xml -c all -s
CYCLE STATE ACTIVATED DEACTIVATED
202402231800 Done Jun 21 2024 02:24:10 Jun 21 2024 02:45:12
202402240000 Done Jun 21 2024 02:24:10 Jun 21 2024 09:25:11
Given this, push the merge of feature/stage_from_yaml into feature/rename_atm to GitHub. Done at b00d31e5.
This PR, #2654, may be closed since it has been folded into PR #2700 as per EIB's request to merge these two PRs into one.
NOTE: The above changes to two files in gdas.cd are only in the local working copy. These changes need to be committed to GDASApp develop and the gdas.cd hash updated in feature/rename_atm.
I suggest we make a note/issue to fix this in GDASApp (which it looks like @RussTreadon-NOAA already did), and not let the aero gen YAML test hold up this PR to g-w.
CI Update on Wcoss2 at 06/25/24 02:51:16 PM
============================================
Cloning and Building global-workflow PR: 2654
with PID: 161823 on host: dlogin08
Automated global-workflow Testing Results:
Machine: Wcoss2
Start: Tue Jun 25 14:55:52 UTC 2024 on dlogin08
---------------------------------------------------
Build: Completed at 06/25/24 03:31:18 PM
Case setup: Completed for experiment C48_ATM_e175b252
Case setup: Skipped for experiment C48mx500_3DVarAOWCDA_e175b252
Case setup: Skipped for experiment C48_S2SWA_gefs_e175b252
Case setup: Completed for experiment C48_S2SW_e175b252
Case setup: Completed for experiment C96_atm3DVar_extended_e175b252
Case setup: Skipped for experiment C96_atm3DVar_e175b252
Case setup: Skipped for experiment C96_atmaerosnowDA_e175b252
Case setup: Completed for experiment C96C48_hybatmDA_e175b252
Case setup: Completed for experiment C96C48_ufs_hybatmDA_e175b252
Experiment C48_ATM_e175b252 SUCCESS on Wcoss2 at 06/25/24 04:44:07 PM
Experiment C48_S2SW_e175b252 SUCCESS on Wcoss2 at 06/25/24 04:48:09 PM
I see CI testing happening on this branch, but wasn't this PR combined with #2700?
@DavidNew-NOAA you are right. This PR should be closed.
Closing as this PR is combined w/ #2700
Experiment C96C48_hybatmDA_e175b252 SUCCESS on Wcoss2 at 06/25/24 05:36:19 PM
Experiment C96C48_ufs_hybatmDA_e175b252 SUCCESS on Wcoss2 at 06/25/24 05:48:13 PM
Experiment C96_atm3DVar_extended_e175b252 SUCCESS on Wcoss2 at 06/26/24 02:04:39 AM
All CI Test Cases Passed on Wcoss2:
Experiment C48_ATM_e175b252 *** SUCCESS *** at 06/25/24 04:44:07 PM
Experiment C48_S2SW_e175b252 *** SUCCESS *** at 06/25/24 04:48:09 PM
Experiment C96C48_hybatmDA_e175b252 *** SUCCESS *** at 06/25/24 05:36:19 PM
Experiment C96C48_ufs_hybatmDA_e175b252 *** SUCCESS *** at 06/25/24 05:48:13 PM
Experiment C96_atm3DVar_extended_e175b252 *** SUCCESS *** at 06/26/24 02:04:39 AM
Description
This PR will move much of the staging code that takes place in the Python initialization subroutines of the variational and ensemble DA jobs into Jinja2-templated YAML files to be passed into the wxflow file handler. Much of the staging is already done this way; this PR simply expands that strategy.
The old Python routines that were doing this staging are now removed. This is part of a broader refactoring of the pygfs tasking.
wxflow PR #30 is a companion to this PR.
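As a rough picture of what staging from YAML means in practice, here is a stdlib-only sketch. It assumes the rendered YAML reduces to a dictionary with mkdir and copy keys, where each copy entry is a [source, destination] pair; the pair form matches the list items visible in the parser error earlier in this thread, while the key names are my assumption about the file handler's input. The function stage is a hypothetical stand-in and is not the wxflow file handler.

```python
import pathlib
import shutil
import tempfile


def stage(spec):
    """Create the listed directories, then copy each [source, destination] pair."""
    for directory in spec.get("mkdir", []):
        pathlib.Path(directory).mkdir(parents=True, exist_ok=True)
    staged = []
    for source, destination in spec.get("copy", []):
        shutil.copy(source, destination)
        staged.append(str(destination))
    return staged


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        root = pathlib.Path(tmp)
        # Placeholder input file standing in for a background file.
        background = root / "bkg.nc"
        background.write_text("placeholder")
        spec = {
            "mkdir": [str(root / "run" / "bkg")],
            "copy": [[str(background), str(root / "run" / "bkg" / "bkg.nc")]],
        }
        staged = stage(spec)
        # Check inside the with-block, before the temporary directory is removed.
        return [pathlib.Path(path).exists() for path in staged]


print(demo())
```

The point of the design is that the list of directories and file pairs lives in data (the templated YAML) while the code that acts on it stays generic.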