dtcenter / METplus

Python scripting infrastructure for MET tools.
https://metplus.readthedocs.io
Apache License 2.0

Update Truth: Based on dtcenter/MET#2838 #2515

Closed JohnHalleyGotway closed 6 months ago

JohnHalleyGotway commented 6 months ago

Describe Expected Changes

Issue https://github.com/dtcenter/MET/issues/2583 and pull request https://github.com/dtcenter/MET/pull/2838 add new columns to the end of the ECNT line type.

The changes should be limited to only the ECNT line type.

Need to validate the differences flagged in this GHA testing workflow run.

Define the Metadata

Title

Assignee

Assign this issue to the author of the pull request that warranted this issue. Optionally assign anyone else who should review the differences in the output.

Projects and Milestone

Update Truth Checklist

JohnHalleyGotway commented 6 months ago

Checking this GHA testing workflow run, I see that differences are flagged in 3 use case groups across a total of 11 different output files, all output from the Ensemble-Stat tool.

I ran the following commands to confirm that the only source of difference is the 2 new ECNT columns.

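# Compare each truth file against its matching output file.
for truth in $(find ./ -name "*truth*"); do
  output=$(echo "$truth" | sed 's/truth/output/g')
  # Split each file into ECNT (plus header) lines and all other lines,
  # squeezing runs of spaces so columns can be cut on a single space.
  grep -E "ECNT|VERSION" "$truth" | sed -r 's/ +/ /g' > truth_ecnt.txt
  grep -Ev "ECNT|VERSION" "$truth" | sed -r 's/ +/ /g' > truth_not_ecnt.txt
  # Keep only the first 49 columns of the output ECNT lines, dropping the 2 new ones.
  grep -E "ECNT|VERSION" "$output" | sed -r 's/ +/ /g' | cut -d' ' -f1-49 > output_ecnt.txt
  grep -Ev "ECNT|VERSION" "$output" | sed -r 's/ +/ /g' > output_not_ecnt.txt
  echo "+++ $truth +++"
  diff truth_ecnt.txt output_ecnt.txt
  diff truth_not_ecnt.txt output_not_ecnt.txt
done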

This approach directly diffs the non-ECNT lines and diffs only the 49 ECNT columns common to both files, and it found no differences. That means the changes are limited to the newly added columns 50 and 51 of the ECNT line type.
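As a complementary sanity check, the column counts themselves can be inspected. The sketch below is not part of the original verification. The helper name `ecnt_field_counts` is hypothetical, and it assumes that any line containing the string ECNT is an ECNT data line with whitespace-delimited columns. If the analysis above holds, it should presumably report 49 for a truth file and 51 for the matching output file.

```shell
# Hypothetical helper: print the distinct whitespace-delimited field counts
# found on lines containing "ECNT" in the given file.
# Assumption: any line containing "ECNT" is an ECNT data line.
ecnt_field_counts() {
  awk '/ECNT/ { print NF }' "$1" | sort -u
}

# Usage (file names are placeholders, not taken from the workflow run):
#   ecnt_field_counts path/to/some_truth_file
#   ecnt_field_counts path/to/some_output_file
```

A single number per file indicates that every ECNT line in that file has the same width. More than one number would point to inconsistent ECNT lines that deserve a closer look.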

@georgemccabe, the develop-ref truth dataset can safely be updated.