Thank you for this effort, Russ.
@AndrewEichmann-NOAA , I updated feature/resume_nightly with GDASApp develop. This brought in changes from #1352. Now the g-w gdas_marineanlfinal job fails with:
0: ========= Processing /work/noaa/da/rtreadon/git/global-workflow/pr2992/sorc/gdas.cd/build/gdas/test/gw-ci/../../test/gw-ci/WCDA-3DVAR-C48mx500/COMROOT/WCDA-3DVAR-C48mx500/gdas.20210324/18//analysis/ocean/diags/insitu_surface_trkob.2021032418.nc4 date: 2021032418
0: insitu_surface_trkob.2021032418.nc4: read database from /work/noaa/da/rtreadon/git/global-workflow/pr2992/sorc/gdas.cd/build/gdas/test/gw-ci/../../test/gw-ci/WCDA-3DVAR-C48mx500/COMROOT/WCDA-3DVAR-C48mx500/gdas.20210324/18//analysis/ocean/diags/insitu_surface_trkob.2021032418.nc4 (io pool size: 1)
0: insitu_surface_trkob.2021032418.nc4 processed vars: 2 Variables: seaSurfaceSalinity, seaSurfaceTemperature
0: insitu_surface_trkob.2021032418.nc4 assimilated vars: 1 Variables: seaSurfaceSalinity
0: nlocs =863
0: Exception: Reason: An exception occurred inside ioda while opening a variable.
0: name: ombg/seaSurfaceSalinity
0: source_column: 0
0: source_filename: /work/noaa/da/rtreadon/git/global-workflow/pr2992/sorc/gdas.cd/bundle/ioda/src/engines/ioda/src/ioda/Has_Variables.cpp
Is this failure possibly related to #1352?
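One quick way to check what the diag file actually contains is to list its groups and variables with netCDF4. This is only a minimal sketch, assuming the netCDF4 Python module is available; the path is copied from the log above.
# Sketch: list which groups in the trkob diag file contain which variables,
# e.g. to confirm whether ombg/seaSurfaceSalinity exists.
import netCDF4
diagfile = "/work/noaa/da/rtreadon/git/global-workflow/pr2992/sorc/gdas.cd/build/gdas/test/gw-ci/../../test/gw-ci/WCDA-3DVAR-C48mx500/COMROOT/WCDA-3DVAR-C48mx500/gdas.20210324/18//analysis/ocean/diags/insitu_surface_trkob.2021032418.nc4"
with netCDF4.Dataset(diagfile, "r") as nc:
    for gname, group in nc.groups.items():
        print(f"{gname}: {sorted(group.variables)}")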
GDASApp PR #1374 modifies test/marine/CMakeLists.txt so that the correct python version is set for test_gdasapp_bufr2ioda_insitu*. With this change in place, all test_gdasapp_bufr2ioda_insitu* tests pass:
Test project /work/noaa/da/rtreadon/git/global-workflow/pr2992/sorc/gdas.cd/build
Start 1993: test_gdasapp_bufr2ioda_insitu_profile_argo
1/8 Test #1993: test_gdasapp_bufr2ioda_insitu_profile_argo ....... Passed 52.78 sec
Start 1994: test_gdasapp_bufr2ioda_insitu_profile_bathy
2/8 Test #1994: test_gdasapp_bufr2ioda_insitu_profile_bathy ...... Passed 3.72 sec
Start 1995: test_gdasapp_bufr2ioda_insitu_profile_glider
3/8 Test #1995: test_gdasapp_bufr2ioda_insitu_profile_glider ..... Passed 3.62 sec
Start 1996: test_gdasapp_bufr2ioda_insitu_profile_tesac
4/8 Test #1996: test_gdasapp_bufr2ioda_insitu_profile_tesac ...... Passed 5.71 sec
Start 1997: test_gdasapp_bufr2ioda_insitu_profile_tropical
5/8 Test #1997: test_gdasapp_bufr2ioda_insitu_profile_tropical ... Passed 3.33 sec
Start 1998: test_gdasapp_bufr2ioda_insitu_profile_xbtctd
6/8 Test #1998: test_gdasapp_bufr2ioda_insitu_profile_xbtctd ..... Passed 2.62 sec
Start 1999: test_gdasapp_bufr2ioda_insitu_surface_drifter
7/8 Test #1999: test_gdasapp_bufr2ioda_insitu_surface_drifter .... Passed 2.41 sec
Start 2000: test_gdasapp_bufr2ioda_insitu_surface_trkob
8/8 Test #2000: test_gdasapp_bufr2ioda_insitu_surface_trkob ...... Passed 2.86 sec
100% tests passed, 0 tests failed out of 8
Total Test time (real) = 78.83 sec
@AndrewEichmann-NOAA , I rolled back the change to parm/soca/obs/obs_list.yaml
from #1352 and reran the test_gdasapp_WCDA-3DVAR-C48mx500
suite of tests. All passed:
Test project /work/noaa/da/rtreadon/git/global-workflow/pr2992/sorc/gdas.cd/build
Start 1960: test_gdasapp_WCDA-3DVAR-C48mx500
1/9 Test #1960: test_gdasapp_WCDA-3DVAR-C48mx500 .................................... Passed 32.22 sec
Start 1961: test_gdasapp_WCDA-3DVAR-C48mx500_gdas_stage_ic_202103241200
2/9 Test #1961: test_gdasapp_WCDA-3DVAR-C48mx500_gdas_stage_ic_202103241200 ......... Passed 58.00 sec
Start 1962: test_gdasapp_WCDA-3DVAR-C48mx500_gdas_fcst_seg0_202103241200
3/9 Test #1962: test_gdasapp_WCDA-3DVAR-C48mx500_gdas_fcst_seg0_202103241200 ........ Passed 408.83 sec
Start 1963: test_gdasapp_WCDA-3DVAR-C48mx500_gdas_prepoceanobs_202103241800
4/9 Test #1963: test_gdasapp_WCDA-3DVAR-C48mx500_gdas_prepoceanobs_202103241800 ..... Passed 266.16 sec
Start 1964: test_gdasapp_WCDA-3DVAR-C48mx500_gdas_marinebmat_202103241800
5/9 Test #1964: test_gdasapp_WCDA-3DVAR-C48mx500_gdas_marinebmat_202103241800 ....... Passed 168.43 sec
Start 1965: test_gdasapp_WCDA-3DVAR-C48mx500_gdas_marineanlinit_202103241800
6/9 Test #1965: test_gdasapp_WCDA-3DVAR-C48mx500_gdas_marineanlinit_202103241800 .... Passed 111.26 sec
Start 1966: test_gdasapp_WCDA-3DVAR-C48mx500_gdas_marineanlvar_202103241800
7/9 Test #1966: test_gdasapp_WCDA-3DVAR-C48mx500_gdas_marineanlvar_202103241800 ..... Passed 168.09 sec
Start 1967: test_gdasapp_WCDA-3DVAR-C48mx500_gdas_marineanlchkpt_202103241800
8/9 Test #1967: test_gdasapp_WCDA-3DVAR-C48mx500_gdas_marineanlchkpt_202103241800 ... Passed 180.57 sec
Start 1968: test_gdasapp_WCDA-3DVAR-C48mx500_gdas_marineanlfinal_202103241800
9/9 Test #1968: test_gdasapp_WCDA-3DVAR-C48mx500_gdas_marineanlfinal_202103241800 ... Passed 68.63 sec
100% tests passed, 0 tests failed out of 9
Label Time Summary:
manual = 1462.19 sec*proc (9 tests)
Total Test time (real) = 1463.94 sec
Does the failure of test_gdasapp_WCDA-3DVAR-C48mx500_gdas_marineanlfinal_202103241800 with the #1352 version of parm/soca/obs/obs_list.yaml make sense?
@guillaumevernieres and @AndrewEichmann-NOAA: test_gdasapp_WCDA-hyb-C48mx500_gdas_marineanlletkf_202103241800 fails with the following error:
2024-11-14 03:11:43,830 - DEBUG - marine_da_utils: Executing srun -l --export=ALL --hint=nomultithread -n 16 /work/noaa/da/rtreadon/git/global-workflow/pr2992/exec/gdas_soca_gridgen.x /work/noaa/da/rtreadon/git/global-workflow/pr2992/parm/gdas/soca/gridgen/gridgen.yaml
2: Exception: Cannot open /work/noaa/da/rtreadon/git/global-workflow/pr2992/parm/gdas/soca/gridgen/gridgen.yaml (No such file or directory)
There is no g-w directory parm/gdas/soca/gridgen. I checked g-w PR #3041. I do not see any change to sorc/link_workflow.sh to add this directory to parm/gdas/soca.
Should test_gdasapp_WCDA-hyb-C48mx500_gdas_marineanlletkf_202103241800 successfully run in GDASApp develop with g-w develop? Does this test work when built and run inside g-w PR #3041?
11/14 status
g-w DA CI testing is complete on Hercules. 63 out of 64 test_gdasapp tests pass.
Two issues remain to be resolved:
1. test_gdasapp_WCDA-3DVAR-C48mx500_gdas_marineanlfinal_202103241800 fails when using parm/soca/obs/obs_list.yaml from GDASApp develop at 6bc2760. Reverting to the previous version of obs_list.yaml allows test_gdasapp_WCDA-3DVAR-C48mx500_gdas_marineanlfinal_202103241800 to pass.
2. test_gdasapp_WCDA-hyb-C48mx500_gdas_marineanlletkf_202103241800 fails for at least two reasons: the marineanlletkf section is missing from g-w env/HERCULES.env, and gdas_soca_gridgen.x fails because input yaml $HOMEgfs/parm/gdas/soca/gridgen/gridgen.yaml does not exist.
We can not resume nightly testing until all ctests pass. Given this, we need to answer two questions:
1. Do we fix develop so that test_gdasapp_WCDA-3DVAR-C48mx500_gdas_marineanlfinal_202103241800 passes when using obs_list.yaml from develop, do we revert obs_list.yaml, or do we remove test_gdasapp_WCDA-3DVAR-C48mx500_gdas_marineanlfinal_202103241800?
2. Do we fix test_gdasapp_WCDA-hyb-C48mx500_gdas_marineanlletkf_202103241800 by adding the missing parm/gdas/soca/gridgen to g-w PR #2992, or do we remove test_gdasapp_WCDA-hyb-C48mx500_gdas_marineanlletkf_202103241800?
Tagging @guillaumevernieres, @AndrewEichmann-NOAA, @CoryMartin-NOAA, @danholdaway, @DavidNew-NOAA
@RussTreadon-NOAA With regard to gridgen, I deleted that in GDASApp when I refactored the marine bmat, not realizing another code would use it. It exists now as parm/jcb-gdas/algorithm/marine/gridgen.yaml, so you can just point to that file until I refactor the rest of the marine code using JCB.
@RussTreadon-NOAA Just to answer your question, I'd say that I add the following to #2992:
1. point marineanlletkf to gridgen in jcb-gdas, per my above comment
2. add a marineanlletkf section to env/HERCULES.env
And then we either revert the obs_list.yaml or fix it.
gdas_marineanlletkf failure - RESOLVED
test_gdasapp_WCDA-hyb-C48mx500_gdas_marineanlletkf_202103241800 passes after making the following changes:
1. update the state variables in parm/soca/letkf/letkf.yaml.j2 to be consistent with soca PR #1082
2. add obs localizations blocks to insitu_profile_bathy.yaml, insitu_profile_tesac.yaml, and insitu_surface_trkob.yaml in parm/soca/obs/config/
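As a quick check of the second change, the snippet below confirms that each of the three files carries an obs localizations block. This is only a sketch: it assumes the files live under parm/soca/obs/config/ relative to the GDASApp checkout, and it uses a plain text search rather than parsing the (possibly Jinja2-templated) yaml.
# Sketch: confirm each obs config file now contains an "obs localizations" block.
from pathlib import Path
config_dir = Path("parm/soca/obs/config")  # assumed relative to the GDASApp checkout
for name in ("insitu_profile_bathy.yaml",
             "insitu_profile_tesac.yaml",
             "insitu_surface_trkob.yaml"):
    text = (config_dir / name).read_text()
    status = "found" if "obs localizations" in text else "MISSING"
    print(f"{name}: obs localizations {status}")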
gdas_marineanlfinal failure - UPDATE
gdas_marineanlfinal fails when gdassoca_obsstats.x attempts to extract seaSurfaceSalinity from the insitu_surface_trkob diagnostic file:
0: insitu_surface_trkob.2021032418.nc4: read database from /work/noaa/da/rtreadon/git/global-workflow/pr2992/sorc/gdas.cd/build/gdas/test/gw-ci/../../test/gw-ci/WCDA-3DVAR-C48mx500/COMROOT/WCDA-3DVAR-C48mx500/gdas.20210324/18//analysis/ocean/diags/insitu_surface_trkob.2021032418.nc4 (io pool size: 1)
0: insitu_surface_trkob.2021032418.nc4 processed vars: 2 Variables: seaSurfaceSalinity, seaSurfaceTemperature
0: insitu_surface_trkob.2021032418.nc4 assimilated vars: 1 Variables: seaSurfaceSalinity
0: nlocs =863
0: Exception: Reason: An exception occurred inside ioda while opening a variable.
0: name: ombg/seaSurfaceSalinity
diag_stats.yaml specifies variable seaSurfaceSalinity to be extracted from analysis/ocean/diags/insitu_surface_trkob.2021032418.nc4:
engine:
  type: H5File
  obsfile: /work/noaa/da/rtreadon/git/global-workflow/pr2992/sorc/gdas.cd/build/gdas/test/gw-ci/../../test/gw-ci/WCDA-3DVAR-C48mx500/COMROOT/WCDA-3DVAR-C48mx500/gdas.20210324/18//analysis/ocean/diags/insitu_surface_trkob.2021032418.nc4
simulated variables:
- seaSurfaceSalinity
variable: seaSurfaceSalinity
This is problematic. Variable seaSurfaceSalinity is not in the ombg group of the diagnostic file. var.yaml only specifies seaSurfaceTemperature to be written to the diagnostic file:
obsdataout:
  engine:
    type: H5File
    obsfile: /work/noaa/stmp/rtreadon/HERCULES/RUNDIRS/WCDA-3DVAR-C48mx500/gdas.2021032418/gdasmarineanalysis.2021032418/marinevariational/diags/insitu_surface_trkob.2021032418.nc4
simulated variables:
- seaSurfaceTemperature
io pool:
  max pool size: 1
Method obs_space_stats in ush/python/pygfs/task/marine_analysis.py creates diag_stats.yaml. The yaml is populated by querying the diagnostic files in the run directory. Specifically, ObsValue is checked to determine the variable to add to diag_stats.yaml. The ObsValue group contains two variables:
group: ObsValue {
variables:
float seaSurfaceSalinity(Location) ;
seaSurfaceSalinity:_FillValue = -3.368795e+38f ;
string seaSurfaceSalinity:long_name = "seaSurfaceSalinity" ;
string seaSurfaceSalinity:units = "psu" ;
seaSurfaceSalinity:valid_range = 0.f, 45.f ;
seaSurfaceSalinity:_Storage = "chunked" ;
seaSurfaceSalinity:_ChunkSizes = 863 ;
seaSurfaceSalinity:_Endianness = "little" ;
float seaSurfaceTemperature(Location) ;
seaSurfaceTemperature:_FillValue = -3.368795e+38f ;
string seaSurfaceTemperature:long_name = "seaSurfaceTemperature" ;
string seaSurfaceTemperature:units = "degC" ;
seaSurfaceTemperature:valid_range = -10.f, 50.f ;
seaSurfaceTemperature:_Storage = "chunked" ;
seaSurfaceTemperature:_ChunkSizes = 863 ;
seaSurfaceTemperature:_Endianness = "little" ;
// group attributes:
} // group ObsValue
However, the ombg group only contains seaSurfaceTemperature:
group: ombg {
variables:
float seaSurfaceTemperature(Location) ;
seaSurfaceTemperature:_FillValue = -3.368795e+38f ;
seaSurfaceTemperature:_Storage = "chunked" ;
seaSurfaceTemperature:_ChunkSizes = 863 ;
seaSurfaceTemperature:_Endianness = "little" ;
// group attributes:
} // group ombg
Do we need to change the logic in method obs_space_stats in marine_analysis.py to check ombg instead of ObsValue when populating diag_stats.yaml?
What do you think, @guillaumevernieres? Who on the Marine DA team should I discuss this issue with?
FYI, making the change suggested above to marine_analysis.py
# get the variable name, assume 1 variable per file
nc = netCDF4.Dataset(obsfile, 'r')
##variable = next(iter(nc.groups["ObsValue"].variables))
variable = next(iter(nc.groups["ombg"].variables))
print(f"variable {variable}")
nc.close()
works. With this change gdas_marineanlfinal
passes.
(gdasapp) hercules-login-2:/work/noaa/da/rtreadon/git/global-workflow/pr2992/sorc/gdas.cd/build$ ctest -R test_gdasapp_WCDA-3DVAR-C48mx500_gdas_marineanlfinal_202103241800
Test project /work/noaa/da/rtreadon/git/global-workflow/pr2992/sorc/gdas.cd/build
Start 1968: test_gdasapp_WCDA-3DVAR-C48mx500_gdas_marineanlfinal_202103241800
1/1 Test #1968: test_gdasapp_WCDA-3DVAR-C48mx500_gdas_marineanlfinal_202103241800 ... Passed 80.23 sec
100% tests passed, 0 tests failed out of 1
Label Time Summary:
manual = 80.23 sec*proc (1 test)
Total Test time (real) = 84.94 sec
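If checking ombg is the route we keep, a slightly more defensive variant of the snippet above could fall back to ObsValue when a diag file has no ombg group. This is only a sketch, not tested here; obsfile and the one-variable-per-file assumption are as in the original obs_space_stats code.
# Sketch: prefer ombg when picking the variable name, but fall back to
# ObsValue so a diag file without an ombg group does not raise a KeyError.
import netCDF4
nc = netCDF4.Dataset(obsfile, 'r')
group_name = "ombg" if "ombg" in nc.groups else "ObsValue"
variable = next(iter(nc.groups[group_name].variables))
print(f"variable {variable} (from group {group_name})")
nc.close()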
Another thought: Is the better solution to add seaSurfaceSalinity to simulated variables for insitu_surface_trkob? That is, should var.yaml read
obsdataout:
  engine:
    type: H5File
    obsfile: /work/noaa/stmp/rtreadon/HERCULES/RUNDIRS/WCDA-3DVAR-C48mx500/gdas.2021032418/gdasmarineanalysis.2021032418/marinevariational/diags/insitu_surface_trkob.2021032418.nc4
simulated variables:
- seaSurfaceTemperature
- seaSurfaceSalinity
io pool:
  max pool size: 1
@RussTreadon-NOAA The letkf problems would be resolved with https://github.com/NOAA-EMC/GDASApp/pull/1372, which adds back the original gridgen yaml under parm, and adds the localization blocks to the obs space config files.
@RussTreadon-NOAA , reverting the obs_list.yaml is what we should do.
@AndrewEichmann-NOAA , thank you for pointing me at GDASApp PR #1372.
PR #1372 places gridgen.yaml in parm/soca/gridgen/gridgen.yaml. @DavidNew-NOAA mentioned above that gridgen.yaml is now in parm/jcb-gdas/algorithm/marine/gridgen.yaml. We don't need gridgen.yaml in two places. Which location do we go with?
It's good to see that #1372 addresses the missing obs localizations
blocks mentioned above.
@RussTreadon-NOAA @AndrewEichmann-NOAA Let's just leave gridgen.yaml in jcb-gdas and point there for now
@RussTreadon-NOAA @DavidNew-NOAA While it does belong under jcb and the letkf task should be converted to using it, that will require a PR to global-workflow, and letkf will be broken until that PR gets merged.
@AndrewEichmann-NOAA I put the jcb-gdas gridgen.yaml reference in config.marineanlletkf here in this PR in my last commit this morning.
@AndrewEichmann-NOAA Sorry, I meant in GW PR #2992
@DavidNew-NOAA Ah, ok
@RussTreadon-NOAA , reverting the obs_list.yaml is what we should do.
Thanks @guillaumevernieres for the guidance. parm/soca/obs/obs_list.yaml was reverted at 716dcdb.
FYI: I am manually running g-w DA CI on Hera & Hercules using g-w branch feature/jcb-obsbias at 42904ba, with sorc/gdas.cd populated with GDASApp branch feature/resume_nightly at 716dcdb.
test_gdasapp had 64 out of 64 tests pass on both machines. test_gdasapp is currently running on Orion. Pending a 64/64 result, I'll launch g-w DA CI on Orion.
Several JEDI repositories have been updated with changes from the Model Variable Renaming Sprint. Updating JEDI hashes in sorc/ requires changes in GDASApp and jcb-gdas yamls and templates. This issue is opened to document these changes.