NOAA-EMC / GDASApp

Global Data Assimilation System Application
GNU Lesser General Public License v2.1

JEDI ATM gfs and gdas prepatmiodaobs breaks with GDASapp hash 33b4cb9 #1006

Closed by RussTreadon-NOAA 7 months ago

RussTreadon-NOAA commented 7 months ago

The gfs and gdas prepatmiodaobs jobs fail for 20240224 00Z in g-w C96C48_ufs_hybatmDA testing.

Examination of the log files showed that the job was trying to access a non-existent file:

2024-03-28 19:31:40,655 - INFO     - run_bufr2ioda.py: Convert insitu_profiles_argo...
2024-03-28 19:31:40,656 - INFO     - gen_bufr2ioda_json.py: Using /work2/noaa/da/rtreadon/git/global-workflow/test/parm/gdas/ioda/bufr2ioda/bufr2ioda_insitu_profiles_argo.json as input {'RUN': 'gdas', 'current_cycle': datetime.datetime(2024, 2, 24, 0, 0), 'DMPDIR': PosixPath('/work/noaa/rstprod/dump'), 'COM_OBS': PosixPath('/work2/noaa/stmp/rtreadon/COMROOT/prtest/gdas.20240224/00/obs'), 'PDY': '20240224', 'cyc': '00'}
Traceback (most recent call last):
  File "/work2/noaa/da/rtreadon/git/global-workflow/test/ush/run_bufr2ioda.py", line 119, in <module>
    bufr2ioda(args.current_cycle, args.RUN, args.DMPDIR, args.config_template_dir, args.COM_OBS)
  File "/work2/noaa/da/rtreadon/git/global-workflow/test/ush/python/wxflow/logger.py", line 266, in wrapper
    retval = func(*args, **kwargs)
  File "/work2/noaa/da/rtreadon/git/global-workflow/test/ush/run_bufr2ioda.py", line 70, in bufr2ioda
    gen_bufr_json(config, template, json_output_file)
  File "/work2/noaa/da/rtreadon/git/global-workflow/test/sorc/gdas.cd/ush/ioda/bufr2ioda/gen_bufr2ioda_json.py", line 19, in gen_bufr_json
    bufr_config = parse_j2yaml(template, config)
  File "/work2/noaa/da/rtreadon/git/global-workflow/test/ush/python/wxflow/yaml_file.py", line 180, in parse_j2yaml
    return YAMLFile(data=Jinja(path, data, searchpath=searchpath).render)
  File "/work2/noaa/da/rtreadon/git/global-workflow/test/ush/python/wxflow/yaml_file.py", line 37, in __init__
    self.update(config)
  File "/work2/noaa/da/rtreadon/git/global-workflow/test/ush/python/wxflow/attrdict.py", line 120, in update
    other.update(args[0])
ValueError: dictionary update sequence element #0 has length 1; 2 is required

The file /work2/noaa/da/rtreadon/git/global-workflow/test/parm/gdas/ioda/bufr2ioda/bufr2ioda_insitu_profiles_argo.json does not exist. The actual file name is bufr2ioda_insitu_profile_argo.json. There is no s in profile.
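The ValueError in the traceback is the message Python raises when dict.update receives a plain string instead of a mapping. The sketch below only demonstrates that Python behavior. It assumes, without confirmation from this issue, that the missing template path ends up being handled as literal text and reaches update as a string. The helper name update_from and the relative path are illustrative only.

```python
def update_from(value):
    """Try to update an empty dict with `value`.

    Returns the error text if a ValueError is raised, otherwise None.
    """
    target = {}
    try:
        target.update(value)
    except ValueError as err:
        return str(err)
    return None


# A mapping updates cleanly, so no error text is returned.
print(update_from({"RUN": "gdas"}))

# A string is iterated character by character. Each element has length 1,
# so it cannot be unpacked into a (key, value) pair.
# The path below is illustrative, not read from disk.
print(update_from("parm/ioda/bufr2ioda/bufr2ioda_insitu_profiles_argo.json"))
```

If that assumption holds, the misleading error could be avoided by checking that the template file exists before it is parsed.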

Script ush/ioda/bufr2ioda/run_bufr2ioda.py constructs observation-specific filenames from the names of the bufr2ioda python scripts in ush/ioda/bufr2ioda. Given this, we need to ensure consistency between the names of the converter scripts in ush/ioda/bufr2ioda and the json files in parm/ioda/bufr2ioda.
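One way to catch this kind of mismatch early is a small consistency check over the two directories. The following is a rough sketch, not part of GDASApp. It assumes each converter script bufr2ioda_NAME.py is expected to have a template called bufr2ioda_NAME.json, which matches the names in the log above. The function name find_missing_json is hypothetical.

```python
from pathlib import Path


def find_missing_json(script_dir, json_dir, prefix="bufr2ioda_"):
    """Return names of converter scripts that lack a same-named JSON template.

    Assumes (hypothetically) that <prefix>NAME.py pairs with <prefix>NAME.json.
    """
    missing = []
    for script in sorted(Path(script_dir).glob(f"{prefix}*.py")):
        template = Path(json_dir) / f"{script.stem}.json"
        if not template.is_file():
            missing.append(script.name)
    return missing
```

Run from the top of a GDASApp checkout, find_missing_json("ush/ioda/bufr2ioda", "parm/ioda/bufr2ioda") would list any converter whose template name has drifted, such as profiles versus profile.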

RussTreadon-NOAA commented 7 months ago

As a test, make the following changes in a working copy of GDASApp develop:

  1. Ensure all bufr2ioda_insitu_*profile* python scripts in ush/ioda/bufr2ioda are renamed bufr2ioda_insitu_profile_$OBTYPE.py, where OBTYPE is argo, bathy, etc.
  2. Check that bufrfile exists before attempting to process the bufr dump file. This is done via the following change to the bufr2ioda_insitu_profile*py scripts:
     bufrfile = f"{cycle_datetime}-{cycle_type}.t{hh}z.{data_format}.tm00.bufr_d"
     DATA_PATH = os.path.join(dump_dir, bufrfile)
    +    if not os.path.isfile(DATA_PATH):
    +        logger.info(f"DATA_PATH {DATA_PATH} does not exist")
    +        return
     logger.debug(f"{bufrfile}, {DATA_PATH}")

    A similar check is found in other bufr2ioda converters.
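The existence check in the diff above can also be read as a standalone helper. This is a minimal sketch under assumptions: the function name locate_dump, its argument list, and the logger name are hypothetical. Only the filename pattern and the log message come from the diff.

```python
import logging
import os

logger = logging.getLogger("bufr2ioda_sketch")  # hypothetical logger name


def locate_dump(dump_dir, cycle_datetime, cycle_type, hh, data_format):
    """Return the path of the bufr dump file, or None if it is absent.

    The filename pattern follows the diff above; the arguments are assumed
    to be plain strings, e.g. "2024022400", "gdas", "00", "tesac".
    """
    bufrfile = f"{cycle_datetime}-{cycle_type}.t{hh}z.{data_format}.tm00.bufr_d"
    data_path = os.path.join(dump_dir, bufrfile)
    if not os.path.isfile(data_path):
        # Log and skip instead of failing later inside the converter.
        logger.info(f"DATA_PATH {data_path} does not exist")
        return None
    return data_path
```

A converter could call this first and return early when the result is None, which is the behavior the patched scripts show in the rerun.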

Reran gfs and gdas prepatmiodaobs. Both jobs ran to completion for 20240224 00Z. The bufrfile check caught several non-existent dump files:

2024-03-28 20:46:22,269 - INFO     - bufr2ioda_insitu_marinemammals_profiles.py: DATA_PATH /work/noaa/rstprod/dump/2024022400-gdas.t00z.tesac.tm00.bufr_d does not exist
2024-03-28 20:46:22,270 - INFO     - bufr2ioda_insitu_profiles_argo.py: DATA_PATH /work/noaa/rstprod/dump/2024022400-gdas.t00z.subpfl.tm00.bufr_d does not exist
2024-03-28 20:46:22,270 - INFO     - bufr2ioda_insitu_profiles_tesac.py: DATA_PATH /work/noaa/rstprod/dump/2024022400-gdas.t00z.tesac.tm00.bufr_d does not exist
2024-03-28 20:46:22,273 - INFO     - bufr2ioda_insitu_profile_bathy.py: DATA_PATH /work/noaa/rstprod/dump/2024022400-gdas.t00z.bathy.tm00.bufr_d does not exist
2024-03-28 20:46:22,274 - INFO     - bufr2ioda_insitu_surface_altkob.py: DATA_PATH /work/noaa/rstprod/dump/2024022400-gdas.t00z.altkob.tm00.bufr_d does not exist
2024-03-28 20:46:22,276 - INFO     - bufr2ioda_insitu_profiles_glider.py: DATA_PATH /work/noaa/rstprod/dump/2024022400-gdas.t00z.subpfl.tm00.bufr_d does not exist
2024-03-28 20:46:22,280 - INFO     - bufr2ioda_insitu_surface_trkob.py: DATA_PATH /work/noaa/rstprod/dump/2024022400-gdas.t00z.trkob.tm00.bufr_d does not exist

It is not surprising that these dump files do not exist. They are not routinely stored in the GDA.

guillaumevernieres commented 7 months ago

My fault, @RussTreadon-NOAA. I'm going to move the marine converters somewhere else.