NOAA-GSL / VxIngest

Data gaps in HRRR_OPS #303

Closed randytpierce closed 7 months ago

randytpierce commented 7 months ago

Looking at a long time series for HRRR_OPS it is possible to detect data gaps.

Curve0: HRRR_OPS in unused: HRRR domain, Ceiling (ft) CSI (Critical Success Index) at 500 (ceiling <500 ft), fcst_len: 6h, valid-time: unused, avg: None. Model filtered by: None range: 0 to 60000. Obs filtered by: None range: 0 to 60000.

The CTCs for Ceiling HRRR_OPS are missing, apparently for all domains, over four periods: Sep 1, 2023 18:00 through Nov 13, 2023 18:00; Nov 15, 2023 19:00 through Nov 26, 2023 16:00; Nov 27, 2023 23:00 through Nov 29, 2023 14:00; and Dec 13, 2023 16:00 through Dec 14, 2023 06:00.

This query shows that the model data is missing:

SELECT raw fcstValidISO
FROM vxdata._default.METAR
WHERE type = 'DD'
AND docType = 'model'
AND subset = 'METAR'
AND version = 'V01'
AND model = 'HRRR_OPS'
--AND region = 'ALL_HRRR'
AND fcstLen = 6
AND fcstValidISO >= "2023-08-24T17:00:00"
AND fcstValidISO <= "2023-11-30T17:00:00"
ORDER BY fcstValidISO;

This query shows that the obs are there:

SELECT raw fcstValidISO
FROM vxdata._default.METAR
WHERE type = 'DD'
AND docType = 'obs'
AND subset = 'METAR'
AND version = 'V01'
--AND model = 'HRRR_OPS'
--AND region = 'ALL_HRRR'
--AND fcstLen = 6
AND fcstValidISO >= "2023-08-24T17:00:00"
AND fcstValidISO <= "2023-11-30T17:00:00"
ORDER BY fcstValidISO;
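
Once the fcstValidISO values from either query are exported, the gaps can be listed mechanically rather than read off a plot. The following is a minimal sketch, assuming an hourly cadence for valid times; `find_gaps` and the sample timestamps are hypothetical and are not part of VxIngest or real query output.

```python
from datetime import datetime, timedelta


def find_gaps(valid_times, expected_step=timedelta(hours=1)):
    """Return (last_seen, next_seen) pairs whose spacing exceeds expected_step."""
    # De-duplicate and sort so the scan does not depend on the input order.
    times = sorted({datetime.fromisoformat(t) for t in valid_times})
    gaps = []
    for earlier, later in zip(times, times[1:]):
        if later - earlier > expected_step:
            gaps.append((earlier.isoformat(), later.isoformat()))
    return gaps


# Illustrative timestamps only, not real query output.
sample = [
    "2023-09-01T16:00:00",
    "2023-09-01T17:00:00",
    "2023-09-01T20:00:00",
    "2023-09-01T21:00:00",
]
print(find_gaps(sample))
```

Each pair brackets a gap: the first element is the last valid time present before the gap and the second is the first valid time present after it.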

randytpierce commented 7 months ago

The way to proceed is to look at the archive logs on adb-cb1 ... /data-ingest/data/xfer/archives for a time when the data is missing, and look for an error message in the log.

randytpierce commented 7 months ago

Looking into a recent archive for GRIB rap_ops processing, i.e. opening /data-ingest/data/xfer/archive/success-job_v01_metar_ctc_visibility_model_ops_23a78b1af5ec_1706191807.tar.gz in vim and cycling through to the rap_ops entry 20240125141007/job_v01_metar_grib2_model_rapops130-2024-01-25T14:10:07.log, you can find:

Traceback (most recent call last):
  File "/app/vxingest/grib2_to_cb/grib_builder_parent.py", line 727, in build_document
    "Vegetation Type": ds_surface_vegetation_type.variables[
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/xarray/core/utils.py", line 455, in __getitem__
    return self.mapping[key]
           ~~~~~~~~~~~~^^^^^
KeyError: None

which shows a KeyError: the key passed to ds_surface_vegetation_type.variables[...] was None. I saved /opt/data/grib2_to_cb/rap_ops_130/input_files/2402215000001 into the /opt/data test data set so the problem can be reproduced and debugged locally in VS Code. Stepping through to line 727 of grib_builder_parent.py should reveal the cause.
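
To illustrate the failure mode, here is a minimal sketch of how a mapping lookup ends up raising `KeyError: None`. It assumes the key comes from a name search that returns None when nothing matches; how grib_builder_parent.py actually derives the key is not shown in the traceback. A plain dict stands in for the dataset's variables mapping, and `find_variable_name` and `get_variable` are hypothetical helpers.

```python
def find_variable_name(variables, long_name):
    """Hypothetical helper: name of the first variable with this long_name, else None."""
    for name, attrs in variables.items():
        if attrs.get("long_name") == long_name:
            return name
    return None


# Stand-in for a dataset's variables mapping; this file has no vegetation field.
variables = {"t2m": {"long_name": "2 metre temperature"}}

key = find_variable_name(variables, "Vegetation Type")
try:
    variables[key]  # key is None here, so the lookup fails
except KeyError as err:
    print(repr(err))  # KeyError(None)


def get_variable(variables, long_name):
    """Guarded lookup: fail with a descriptive message instead of KeyError: None."""
    name = find_variable_name(variables, long_name)
    if name is None:
        raise ValueError(f"no variable with long_name {long_name!r} in this file")
    return variables[name]
```

A guard of this kind turns the opaque `KeyError: None` into a message that names the missing field, which makes the archive logs easier to search.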

I am also seeing ERROR entries in the partial SUMS builder output, although these do not stop the documents with the correct data from being generated and imported.

2024-01-25T15:16:58+0000 [INFO] <VxIngestManager-35> (vxingest.partial_sums_to_cb.partial_sums_builder): PartialSumsSurfaceModelObsBuilderV01.build_document queue_element:MD:V01:METAR:RAP_OPS_130:E_US:SUMS:SURFACE:ingest model:RAP_OPS_130 region:E_US variable:surface subset:METAR
2024-01-25T15:16:59+0000 [INFO] <VxIngestManager-36> (vxingest.partial_sums_to_cb.partial_sums_builder): Looking up model document: DD:V01:METAR:HRRR_OPS:1703008800:1
2024-01-25T15:16:59+0000 [INFO] <VxIngestManager-33> (vxingest.partial_sums_to_cb.partial_sums_builder): Looking up model document: DD:V01:METAR:HRRR_OPS:1703008800:1
2024-01-25T15:16:59+0000 [INFO] <VxIngestManager-36> (vxingest.partial_sums_to_cb.partial_sums_builder): Looking up observation document: DD:V01:METAR:obs:1703008800
2024-01-25T15:16:59+0000 [INFO] <VxIngestManager-31> (vxingest.partial_sums_to_cb.partial_sums_builder): Looking up model document: DD:V01:METAR:HRRR_OPS:1703008800:1
2024-01-25T15:16:59+0000 [INFO] <VxIngestManager-33> (vxingest.partial_sums_to_cb.partial_sums_builder): Looking up observation document: DD:V01:METAR:obs:1703008800
2024-01-25T15:16:59+0000 [INFO] <VxIngestManager-31> (vxingest.partial_sums_to_cb.partial_sums_builder): Looking up observation document: DD:V01:METAR:obs:1703008800
2024-01-25T15:16:59+0000 [INFO] <VxIngestManager-34> (vxingest.partial_sums_to_cb.partial_sums_builder): Looking up model document: DD:V01:METAR:HRRR_OPS:1703008800:1
2024-01-25T15:16:59+0000 [INFO] <VxIngestManager-32> (vxingest.partial_sums_to_cb.partial_sums_builder): Looking up model document: DD:V01:METAR:HRRR_OPS:1703008800:1
2024-01-25T15:16:59+0000 [ERROR] <VxIngestManager-36> (vxingest.partial_sums_to_cb.partial_sums_builder): PartialSumsSurfaceModelObsBuilderV01 handle_sum: Exception :  error: 'UW'
2024-01-25T15:16:59+0000 [ERROR] <VxIngestManager-36> (vxingest.partial_sums_to_cb.partial_sums_builder): PartialSumsSurfaceModelObsBuilderV01 handle_sum: Exception :  error: 'VW'
2024-01-25T15:16:59+0000 [INFO] <VxIngestManager-36> (vxingest.partial_sums_to_cb.partial_sums_builder): PARTIALSUMSBuilder.handle_document - adding document DD:V01:METAR:HRRR_OPS:E_US:SUMS:SURFACE:1703008800:1
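
When matching these log lines against the missing periods, it helps to turn the epoch in a document ID into a readable valid time. The sketch below assumes the model document ID is laid out as type:version:subset:model:epoch:forecast length; that reading is inferred from the IDs in the log and the fields used in the queries, and `parse_model_doc_id` is a hypothetical helper.

```python
from datetime import datetime, timezone


def parse_model_doc_id(doc_id):
    """Split an ID like DD:V01:METAR:HRRR_OPS:1703008800:1 into labeled parts."""
    # Field names are assumptions inferred from the log lines, not a documented schema.
    doc_type, version, subset, model, epoch, fcst_len = doc_id.split(":")
    valid = datetime.fromtimestamp(int(epoch), tz=timezone.utc)
    return {
        "type": doc_type,
        "version": version,
        "subset": subset,
        "model": model,
        "fcstValidISO": valid.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "fcstLen": int(fcst_len),
    }


parsed = parse_model_doc_id("DD:V01:METAR:HRRR_OPS:1703008800:1")
print(parsed["fcstValidISO"])  # 2023-12-19T18:00:00Z
```

The model documents looked up in the log above therefore have a valid time of 2023-12-19 18:00 UTC.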

This should be easy to reproduce in a test case using the ingest metadata document:

MD:V01:METAR:RAP_OPS_130:E_US:SUMS:SURFACE:ingest model:RAP_OPS_130 region:E_US variable:surface subset:METAR

randytpierce commented 7 months ago

This has been fixed and merged.