NOAA-EMC / GDASApp

Global Data Assimilation System Application
GNU Lesser General Public License v2.1
14 stars 28 forks source link

Move Atmospheric YAMLs to JCB repo #1033

Closed danholdaway closed 2 months ago

danholdaway commented 3 months ago

Draft until https://github.com/NOAA-EMC/global-workflow/pull/2420 is merged since that seems to be causing a failure of test_gdasapp_atm_jjob_var_final. Once merged I'll finish testing and make sure the cycling works.

But in the mean time folks can review the structural changes of brining in the JCB repo.

GW PR: https://github.com/NOAA-EMC/global-workflow/pull/2477

CoryMartin-NOAA commented 3 months ago

@danholdaway I merged in this: https://github.com/NOAA-EMC/GDASApp/pull/997 these changes probably need incorporated in JCB-gdas too?

CoryMartin-NOAA commented 3 months ago

No real comments, everything looks good to me. Is there a reason why snake_case is used for all the keys in the YAML? I think including spaces is better from a readability standpoint for these high level config variables, but if they are used in variable names, paths, etc. then it makes sense to use snake_case

danholdaway commented 3 months ago

Are you referring to e.g. jcb-base.yaml? I was worried about the use of spaces when it comes to Jinja2, though I air on the side of caution with J2 since I don't know it all that well. For example here I presume npx: {{npx ges}} would not work. Rather than have a mixture of underscores and spaces I went will all underscores.

danholdaway commented 3 months ago

@danholdaway I merged in this: #997 these changes probably need incorporated in JCB-gdas too?

Thanks. I'll add those YAMLs to jcb-gdas and merge/delete them from gdas.

CoryMartin-NOAA commented 3 months ago

@danholdaway yeah like this:

# Assimilation window
# -------------------
window_begin: "{{ ATM_WINDOW_BEGIN | to_isotime }}"
window_length: "{{ ATM_WINDOW_LENGTH }}"
bound_to_include: begin

I didn't know if there was a reason to do window_begin vs window begin but it sounds like it is in order to be consistent throughout, which I'm okay with.

danholdaway commented 3 months ago

@danholdaway yeah like this:

# Assimilation window
# -------------------
window_begin: "{{ ATM_WINDOW_BEGIN | to_isotime }}"
window_length: "{{ ATM_WINDOW_LENGTH }}"
bound_to_include: begin

I didn't know if there was a reason to do window_begin vs window begin but it sounds like it is in order to be consistent throughout, which I'm okay with.

My thinking is that using snake means consistency across code and YAMLs. For example ATM_WINDOW_BEGIN has to have underscores because of Jinja2. Similarly if you had a variable in the code it would have underscores. JEDI YAMLs do use spaces sometimes but I figured we could keep everything in jcb consistent in code and YAMLs by using underscores.

danholdaway commented 3 months ago

@RussTreadon-NOAA I'm not sure the latter run will work yet. I was waiting on testing them until I could get all the GDAS tests passing, which is waiting on a PR in global-workflow to bring the two repos back in sync.

RussTreadon-NOAA commented 3 months ago

@danholdaway , I hope g-w PR #2420 can enter g-w develop soon. It's working in my tests and so I approved it.

RussTreadon-NOAA commented 2 months ago

GDASApp feature/use_jcb_atm was tested inside g-w danholdaway:feature/use_jcb_atm using g-w JEDI ATM CI. Details are found here. JEDI ATM CI testing required modifications to three files in GDASApp feature/use_jcb_atm

The above changes were committed to feature/use_jcb_atm at dc6bd08.

RussTreadon-NOAA commented 2 months ago

JEDI ATM CI testing on Orion caught the fact that due to ioda dump filename inconsistencies, we need to remove conv_ps and gnssro from parm/atm/jcb-prototype_3dvar.yaml.j2 and parm/atm/jcb-prototype_lgetkf.yaml.j2.

--- a/parm/atm/jcb-prototype_3dvar.yaml.j2
+++ b/parm/atm/jcb-prototype_3dvar.yaml.j2
@@ -7,8 +7,6 @@ algorithm: 3dvar
 observations:
   - satwnd.abi_goes-16
   - ascatw.ascat_metop-b
-  - conv_ps
-  - gnssro
   - ompsnp_npp
   - ompstc_npp
   - omi_aura
diff --git a/parm/atm/jcb-prototype_lgetkf.yaml.j2 b/parm/atm/jcb-prototype_lgetkf.yaml.j2
index c6a3829..73101d4 100644
--- a/parm/atm/jcb-prototype_lgetkf.yaml.j2
+++ b/parm/atm/jcb-prototype_lgetkf.yaml.j2
@@ -12,5 +12,3 @@ algorithm: local_ensemble_da
 observations:
   - satwnd.abi_goes-16
   - ascatw.ascat_metop-b
-  - conv_ps
-  - gnssro
--- a/parm/atm/jcb-prototype_3dvar.yaml.j2
+++ b/parm/atm/jcb-prototype_3dvar.yaml.j2
@@ -7,8 +7,6 @@ algorithm: 3dvar
 observations:
   - satwnd.abi_goes-16
   - ascatw.ascat_metop-b
-  - conv_ps
-  - gnssro
   - ompsnp_npp
   - ompstc_npp
   - omi_aura
diff --git a/parm/atm/jcb-prototype_lgetkf.yaml.j2 b/parm/atm/jcb-prototype_lgetkf.yaml.j2
index c6a3829..73101d4 100644
--- a/parm/atm/jcb-prototype_lgetkf.yaml.j2
+++ b/parm/atm/jcb-prototype_lgetkf.yaml.j2
@@ -12,5 +12,3 @@ algorithm: local_ensemble_da
 observations:
   - satwnd.abi_goes-16
   - ascatw.ascat_metop-b
-  - conv_ps
-  - gnssro

At present the prepatmioda job creates ioda dump files ${prefix}.conventional_ps.prepbufr.nc and ${prefix}.gnssro.tm00.bufr_d.nc. Neither of these names follow the pattern assumed by jcb.

RussTreadon-NOAA commented 2 months ago

Merge GDASApp develop at 8fdaf3d into feature/use_jcb_atm. Build feature/use_jcb_atm inside working copy of g-w danholdaway:feature/use_jcb_atm. Run test_gdasapp. 38 out of 45 tests pass. The following 7 tests fail.

84% tests passed, 7 tests failed out of 45

Label Time Summary:
gdas-utils    =  19.28 sec*proc (9 tests)
script        =  19.28 sec*proc (9 tests)

Total Test time (real) = 805.78 sec

The following tests FAILED:
        1774 - test_gdasapp_atm_jjob_var_init (Failed)
        1775 - test_gdasapp_atm_jjob_var_run (Failed)
        1776 - test_gdasapp_atm_jjob_var_inc (Failed)
        1777 - test_gdasapp_atm_jjob_var_final (Failed)
        1778 - test_gdasapp_atm_jjob_ens_init (Failed)
        1779 - test_gdasapp_atm_jjob_ens_run (Failed)
        1780 - test_gdasapp_atm_jjob_ens_final (Failed)

The init jobs failed due to inconsistencies between the expected and actual name of the observation dump files. The names of the dump files in the test data need to be updated to be consistent with the convention jcb assumes.

danholdaway commented 2 months ago

Thanks @RussTreadon-NOAA, those tests were passing for me but I'm not surprised they are failing now. Do you see which observation it is? I thought we were only testing with sondes and amsua-n19 and that these were already converted to nc4.

RussTreadon-NOAA commented 2 months ago

test_gdasapp_atm_jjob_var_init is looking for an ioda dump file named gdas.t18z.amsua_n19.tm00.nc. The test data directory has file gdas.t18z.amsua_n19.2021032318.nc. The ctest aborts with

  File "/work2/noaa/da/rtreadon/git/global-workflow/use_jcb_atm/ush/python/wxflow/fsutils.py", line 85, in cp
    raise OSError(f"unable to copy {source} to {target}")
OSError: unable to copy /work2/noaa/da/rtreadon/git/global-workflow/use_jcb_atm/sorc/gdas.cd/build/gdas/test/atm/global-workflow/testrun/ROTDIRS/gdas_test/gdas.20210323/18/obs/gdas.t18z.amsua_n19.tm00.nc to /work2/noaa/da/rtreadon/git/global-workflow/use_jcb_atm/sorc/gdas.cd/build/gdas/test/atm/global-workflow/testrun/RUNDIRS/gdas_test/gdasatmanl_18/obs//gdas.t18z.amsua_n19.tm00.nc
+ slurm_script[1]: postamble slurm_script 1714502574 1

Your branch was using the YYYYMMDDHH string. The prepatmioda job writes files with tm00. This is one of the changes I committed yesterday via dc6bd08.

We need to settle on a naming convention.

RussTreadon-NOAA commented 2 months ago

@danholdaway , anything I can do to help move this PR along?

RussTreadon-NOAA commented 2 months ago

Based on testing reported in g-w PR #2477, one possible set of changes should be considered for this PR.

Due to inconsistencies between ioda dump file names generated by prepatmiodaobs and JCB templates, we should remove conv_ps and gnssro entries from parm/atm/jcb-prototype_3dvar.yaml.j2 and parm/atm/jcb-prototype_lgetkf.yaml.j2. This change is required for JEDI ATM CI to be successfully run via the g-w code in PR #2477.

Once this PR, #1033, is approved and merged into GDASApp develop, we need to update the gdas.cd hash in g-w PR #2477. We can then merge g-w PR #2477 into g-w develop.

danholdaway commented 2 months ago

@RussTreadon-NOAA I made the requested changes to the parm/atm/*yaml.j2 files. Hopefully that is now everything needed on the GDAS side to have all tests passing for you.

RussTreadon-NOAA commented 2 months ago

Thank you @danholdaway for commenting out the conv_ps and gnssro entries in parm files. With this change feature/use_jcb_atm at a915c40 can successfully run the atmanlinit and atmensanlinit jobs via g-w PR #2477.

danholdaway commented 2 months ago

@RussTreadon-NOAA I think this is ready to test again now. I've made the adjustment the ensemble path. I'll rerun the GDAS tests.

Note that I created a new branch in G-W that is the combination of the jcb and single executable work since the two repos are out of sync now. https://github.com/danholdaway/global-workflow/tree/feature/use_jcb_atm_and_build_jedi_exe

danholdaway commented 2 months ago

Thanks @RussTreadon-NOAA. We should be about ready then. Perhaps the order of things should be:

  1. Merge https://github.com/NOAA-EMC/global-workflow/pull/2565
  2. Merge this PR in GDASapp
  3. Update hashes on GW side
  4. Run GW CI, fix, merge.
RussTreadon-NOAA commented 2 months ago

Orion test

Update a working copy of feature/use_jcb_atm with develop at 05bb641. Build and run ctest -R test_gdasapp. All 45 tests pass.

Test project /work2/noaa/da/rtreadon/git/global-workflow/use_jcb_atm/sorc/gdas.cd/build
      Start 1400: test_gdasapp_util_coding_norms
 1/45 Test #1400: test_gdasapp_util_coding_norms ........................   Passed    1.54 sec

...

      Start 1781: test_gdasapp_aero_gen_3dvar_yaml
45/45 Test #1781: test_gdasapp_aero_gen_3dvar_yaml ......................   Passed    0.31 sec

100% tests passed, 0 tests failed out of 45

Label Time Summary:
gdas-utils    =   3.69 sec*proc (9 tests)
script        =   3.69 sec*proc (9 tests)

Total Test time (real) = 1561.92 sec
danholdaway commented 2 months ago

Thanks @RussTreadon-NOAA. @CoryMartin-NOAA @guillaumevernieres should we merge then? I can make the updates to G-W and request that the CI gets underway.