dtcenter / METplus

Python scripting infrastructure for MET tools.
https://metplus.readthedocs.io
Apache License 2.0

Update Truth: For dtcenter/MET#2942 #2656

Closed: JohnHalleyGotway closed this issue 2 weeks ago

JohnHalleyGotway commented 3 weeks ago

Describe Expected Changes

  1. In MPR line type:
    • Renames columns CLIMO_MEAN, CLIMO_STDEV, and CLIMO_CDF as OBS_CLIMO_MEAN, OBS_CLIMO_STDEV, and OBS_CLIMO_CDF, respectively.
    • Adds new columns FCST_CLIMO_MEAN and FCST_CLIMO_STDEV.
  2. In ORANK line type:
    • Renames columns CLIMO_MEAN and CLIMO_STDEV as OBS_CLIMO_MEAN and OBS_CLIMO_STDEV, respectively.
    • Adds new columns FCST_CLIMO_MEAN and FCST_CLIMO_STDEV.
  3. Makes similar name changes in gridded NetCDF output files from Grid-Stat.
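
As a rough illustration of the renaming described above, the following Python sketch maps an old-style header to the new column names. The function name update_mpr_header is hypothetical, and appending the two FCST_CLIMO columns at the end of the line is an assumption made for illustration; it is not taken from the MET source code.

```python
# Old-to-new names for the climatology columns listed in this issue.
RENAMED = {
    "CLIMO_MEAN": "OBS_CLIMO_MEAN",
    "CLIMO_STDEV": "OBS_CLIMO_STDEV",
    "CLIMO_CDF": "OBS_CLIMO_CDF",
}

# Columns added by the change. Placing them last is an assumption.
NEW_COLUMNS = ["FCST_CLIMO_MEAN", "FCST_CLIMO_STDEV"]


def update_mpr_header(columns):
    """Return a copy of `columns` with old names renamed and new names appended."""
    updated = [RENAMED.get(name, name) for name in columns]
    for name in NEW_COLUMNS:
        if name not in updated:
            updated.append(name)
    return updated


if __name__ == "__main__":
    old_header = ["FCST", "OBS", "OBS_QC", "CLIMO_MEAN", "CLIMO_STDEV", "CLIMO_CDF"]
    print(update_mpr_header(old_header))
```

Because ORANK has no CLIMO_CDF column, the same mapping can be applied to an ORANK header; the unused entry is simply ignored.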

JohnHalleyGotway commented 3 weeks ago

I inspected the many differences flagged in this GitHub Actions workflow run (and see Attempt #2 as well). Differences are flagged in output from 8 of the use case groups:

  1. Use Case Tests (met_tool_wrapper:0-29,59-64):

    • diff-use_cases_met_tool_wrapper_0-29_59-64 has diffs in 2 Ensemble-Stat output files (_orank.txt and .stat):
    • PASS: The ORANK header column names are updated and the number of columns is increased by 2, as expected.
  2. Use Case Tests (data_assimilation:0-1):

    • diff-use_cases_data_assimilation_0-1 has diffs in 2 ASCII .out files with MPR lines:
    • FAIL: In data_assimilation/StatAnalysis_fcstGFS_HofX_obsIODAv2_PyEmbed/StatAnalysis_IODAv2/dump_output.out, the MPR header line has 39 columns while the data lines only have 36. Note that the truth data has a similar problem with 37 header columns and 36 data columns.
    • The same problem exists in StatAnalysis_fcstHAFS_obsPrepBufr_JEDI_IODA_interface/model_applications/data_assimilation/StatAnalysis_HofX/dump_output.out
    • Need to update the read_iodav2_mpr.py python embedding script to add 3 columns of na to the end of each MPR line. These changes have been made on the feature_2656_update_truth branch.
  3. Use Case Tests (marine_and_cryosphere:3-5)

    • diff-use_cases_marine_and_cryosphere_3-5 has diffs in 3 Grid-Stat NetCDF matched pairs output files.
    • PASS: CLIMO_MEAN_ssh_SURFACE_FULL variables are replaced by FCST_CLIMO_MEAN_ssh_SURFACE_FULL and OBS_CLIMO_MEAN_ssh_SURFACE_FULL. I used ncview to confirm that the data is the same. This change is expected.
  4. Use Case Tests (pbl:0)

    • diff-use_cases_pbl_0 has diffs in 1 Point-Stat .stat file.
    • PASS: The only diff is adding 2 new columns to 5 MPR output lines.
  5. Use Case Tests (s2s:4)

    • diff-use_cases_s2s_4 has diffs in 29 Grid-Stat NetCDF matched pairs files which were used as input to Series-Analysis.
    • PASS: I spot-checked these files to confirm that the only changes are adding FCST_CLIMO_MEAN and FCST_CLIMO_STDEV variables and updating the names for the OBS_CLIMO_MEAN, OBS_CLIMO_STDEV, and OBS_CLIMO_CDF variables. I used ncview to confirm that the data looks the same.
  6. Use Case Tests (short_range:0)

    • diff-use_cases_short_range_0 has diffs in 6 Ensemble-Stat _orank.txt and .stat output files.
    • PASS: I spot-checked these files to confirm that the ORANK header column names are updated and the number of columns is increased by 2, as expected.
  7. Use Case Tests (tc_and_extra_tc:0-2)

    • diff-use_cases_tc_and_extra_tc_0-2 has diffs in 2 Point-Stat ASCII output files (_mpr.txt and .stat).
    • PASS: I confirmed that the only diffs are modified MPR column names and 2 new columns in the header and data rows.
  8. Use Case Tests (unstructured_grids:0)

    • diff-use_cases_unstructured_grids_0 has diffs in 1 Stat-Analysis output file.
    • FAIL: This is the same problem described in item 2, where the number of MPR header columns (39) and data columns (36) do not match. There's a python embedding script passing MPR lines to Stat-Analysis that needs to be updated based on the changes to the MPR line type.
    • Need to update the ugrid_lfric_mpr.py python embedding script to add 3 columns of na to the end of each MPR line. These changes have been made on the feature_2656_update_truth branch.
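
The fix described in items 2 and 8 amounts to padding each MPR data line until it matches the header width. Here is a minimal sketch, assuming the embedding script holds each MPR line as a list of strings and that "NA" is the fill value. The name pad_mpr_rows and its arguments are hypothetical; this is not the code from read_iodav2_mpr.py or ugrid_lfric_mpr.py.

```python
def pad_mpr_rows(rows, n_header_cols, fill="NA"):
    """Return new rows, each padded with `fill` up to n_header_cols entries.

    Raises ValueError if a row is already wider than the header, since
    padding cannot repair that kind of mismatch.
    """
    padded = []
    for row in rows:
        missing = n_header_cols - len(row)
        if missing < 0:
            raise ValueError(
                f"row has {len(row)} columns but header has {n_header_cols}"
            )
        padded.append(list(row) + [fill] * missing)
    return padded


if __name__ == "__main__":
    # Illustrative data only: one 36-column row padded to a 39-column header.
    rows = [["x"] * 36]
    fixed = pad_mpr_rows(rows, 39)
    print(len(fixed[0]), fixed[0][-3:])
```

Padding to the header width rather than hard-coding 3 extra columns means the same helper works whether a script previously wrote 36 or 37 columns.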

In Attempt #1, 4 other use case groups failed but DID NOT produce a diff artifact. However, those failures disappeared in Attempt #2, so I assume they were sporadic, transient problems similar to the connection problems shown below:

requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
ls: cannot access '/home/runner/work/METplus/diff': No such file or directory