NOAA-EMC / GDASApp

Global Data Assimilation System Application
GNU Lesser General Public License v2.1

bufr2ioda_tesac_mammals_profiles.py aborts with zero-size array #944

Closed RussTreadon-NOAA closed 8 months ago

RussTreadon-NOAA commented 9 months ago

g-w CI testing found that bufr2ioda_tesac_mammals_profiles.py aborts while processing /scratch1/NCEPDEV/global/glopara/dump/gdas.20210324/00/atmos/gdas.t00z.tesac.tm00.bufr_d.

Below is the traceback message:

  File "/scratch1/NCEPDEV/da/Russ.Treadon/git/global-workflow/work/sorc/gdas.cd/ush/ioda/bufr2ioda/bufr2ioda_tesac_mammals_profiles.py", line 179, in bufr_to_ioda
    logger.debug(f" temp          min, max, length, dtype = {temp.min()}, {temp.max()}, {len(temp)}, {temp.dtype}")
  File "/scratch1/NCEPDEV/da/python/opt/core/miniconda3/4.6.14/envs/gdasapp/lib/python3.7/site-packages/numpy/ma/core.py", line 5701, in min
    axis=axis, out=out, **kwargs).view(type(self))
  File "/scratch1/NCEPDEV/da/python/opt/core/miniconda3/4.6.14/envs/gdasapp/lib/python3.7/site-packages/numpy/core/_methods.py", line 44, in _amin
    return umr_minimum(a, axis, None, out, keepdims, initial, where)
ValueError: zero-size array to reduction operation minimum which has no identity

Line 179 is the temp line below:

    logger.debug(f" ... Executing QuerySet: Check BUFR variable generic \
                dimension and type ...")

    # ==================================================
    # Check values of BUFR variables, dimension and type
    # ==================================================
    logger.debug(f" temp          min, max, length, dtype = {temp.min()}, {temp.max()}, {len(temp)}, {temp.dtype}")

Added print statements confirm that temp has length 0 at this point. Initially temp has size 1433; this changes after indices_true is applied:

    # =======================================
    # Separate marine mammals from TESAC tank
    # =======================================
    logger.debug(f"Creating the mask for marine mammals from TESAC floats based on station ID ...")

    alpha_mask = [item.isalpha() for item in stationID]
    indices_true = [index for index, value in enumerate(alpha_mask) if value]

    # Apply index
    stationID = stationID[indices_true]
    lat = lat[indices_true]
    lon = lon[indices_true]
    depth = depth[indices_true]
    temp = temp[indices_true]

Prints show that alpha_mask contains False for all elements. As a result, indices_true is [] and temp winds up being [].
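The failure can be reproduced outside the converter with a few lines. This is only a minimal sketch: the station IDs and temperatures are made-up values, and plain NumPy arrays stand in for the masked arrays the script receives from the BUFR query. It shows that an all-False alpha_mask yields an empty index list, and that calling min() on the resulting zero-size array raises the same ValueError as in the traceback.

```python
import numpy as np

# Hypothetical inputs: every station ID is numeric, i.e. no ID is purely alphabetic.
stationID = np.array(["4901234", "5905678", "6902345"])
temp = np.array([12.5, 8.1, 3.4], dtype=np.float32)

# Same masking logic as in the script.
alpha_mask = [item.isalpha() for item in stationID]
indices_true = [index for index, value in enumerate(alpha_mask) if value]

# Indexing with an empty list returns a zero-size array.
temp_subset = temp[indices_true]

# A min/max reduction over a zero-size array has no identity, so NumPy raises.
error_message = None
try:
    temp_subset.min()
except ValueError as err:
    error_message = str(err)

print(indices_true, temp_subset.size, error_message)
```

Note that len(temp_subset) and temp_subset.dtype are still valid on the empty array; only the min() and max() calls in the debug message fail.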

I interpret the above as a case in which the given dump file contains no marine mammal observations. Is this true? What is the proper way to handle this situation in bufr2ioda_tesac_mammals_profiles.py?

RussTreadon-NOAA commented 9 months ago

Tagging @ShastriPaturi for awareness. What do you recommend?

RussTreadon-NOAA commented 9 months ago

As a test, the following check was added to bufr2ioda_tesac_mammals_profiles.py:

    # =======================================
    # Separate marine mammals from TESAC tank
    # =======================================
    logger.debug(f"Creating the mask for marine mammals from TESAC floats based on station ID ...")

    alpha_mask = [item.isalpha() for item in stationID]
    indices_true = [index for index, value in enumerate(alpha_mask) if value]
    if len(indices_true) == 0:
        logger.info(f"No marine mammals in {DATA_PATH}")
        return

    # Apply index

With this check present, bufr2ioda_tesac_mammals_profiles.py prints the above info message and exits:

(gdasapp) Hera(hfe07):/scratch1/NCEPDEV/stmp2/Russ.Treadon/RUNDIRS/prci/prepatmobs.165454$ /scratch1/NCEPDEV/da/Russ.Treadon/git/global-workflow/work/sorc/gdas.cd/ush/ioda/bufr2ioda/bufr2ioda_tesac_mammals_profiles.py -c tesac_mammals_profiles_2021032400.json
2024-02-29 14:01:48,015 - INFO     - bufr2ioda_tesac_mammals_profiles.py: No marine mammals in /scratch1/NCEPDEV/global/glopara/dump/gdas.20210324/00/atmos/gdas.t00z.tesac.tm00.bufr_d
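One way to package the same guard is sketched below. The helper name select_alpha_stations is hypothetical and not part of GDASApp, and the sample IDs are invented. It builds a boolean mask from isalpha(), returns None when nothing matches so the caller can log and exit early, and otherwise applies the mask to every array in one place.

```python
import numpy as np


def select_alpha_stations(station_id, *fields):
    """Hypothetical helper: keep only entries whose station ID is purely alphabetic.

    Returns a tuple (station_id, *fields) restricted to the matching entries,
    or None when no station ID matches (including when the input is empty).
    """
    mask = np.array([str(item).isalpha() for item in station_id], dtype=bool)
    if not mask.any():
        return None
    return tuple(np.asarray(arr)[mask] for arr in (station_id, *fields))


# Made-up example data: one alphabetic ID among numeric ones.
ids = np.array(["4901234", "ABCDE", "5905678"])
temps = np.array([12.5, 8.1, 3.4])

selected = select_alpha_stations(ids, temps)
empty_case = select_alpha_stations(np.array(["123", "456"]), np.array([1.0, 2.0]))
```

In the converter, a None result would correspond to the "No marine mammals" info message followed by an early return, so the min/max debug logging is never reached with zero-size arrays. Whether exiting without writing an output file is acceptable to the downstream workflow is a separate question that this sketch does not answer.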

Not sure if this is an acceptable solution. What do you think @ShastriPaturi ?

RussTreadon-NOAA commented 8 months ago

Partially addressed by PR #937