dtcenter / MET

Model Evaluation Tools
https://dtcenter.org/community-code/model-evaluation-tools-met
Apache License 2.0

Bugfix: Fix Stat-Analysis errors for jobs using the `-dump_row` option and the `-line_type` option with VCNT, RPS, DMAP, or SSIDX #2888

Closed by JohnHalleyGotway 4 months ago

JohnHalleyGotway commented 4 months ago

Describe the Problem

This bug arose via dtcenter/METplus#2583. Recommend fixing it in the develop branch for inclusion in the MET-12.0.0 release.

The STATAnalysisJob::dump_stat_line(...) function in stat_job.cc writes output to the file specified by the -dump_row command line option. When -line_type specifies exactly one line type, the header columns for that line type are written to the first line of the -dump_row file. However, the logic for writing header columns is missing for the VCNT, WDIR, RPS, DMAP, and SSIDX line types. Note that WDIR and SSIDX are only written by Stat-Analysis itself, so we need to check whether Stat-Analysis can actually read those lines back in as input.

Here's an example of the resulting error message:

ERROR  : dump_stat_line() -> unexpected line type value VCNT

Expected Behavior

Update Stat-Analysis to support -dump_row dump.stat -line_type ABC, where ABC is any existing line type.

JohnHalleyGotway commented 4 months ago

Note that WDIR is not a true STAT line type. It is written to the output of Stat-Analysis (similar to -job summary) but it is not formatted as a STAT line type with header columns that can be loaded into a METplus database. As such, there is no need or requirement that WDIR output be read back in as input to Stat-Analysis.

I performed the following testing on seneca:

cd /d1/personal/johnhg/MET/MET_development/MET-develop
./test_sa.sh > test_sa.log 2>&1

Contents of test_sa.sh:

SA_NB=/d1/projects/MET/MET_regression/develop/NB20240515/MET-develop/bin/stat_analysis
SA_FIX=/d1/personal/johnhg/MET/MET_development/MET-develop/bin/stat_analysis

TEST_OUTPUT=/d1/projects/MET/MET_regression/develop/NB20240515/MET-develop/test_output

for TYPE in VCNT RPS DMAP SSIDX; do
  $SA_NB -job filter -line_type ${TYPE} -dump_row dump_${TYPE}.txt \
    -lookin $TEST_OUTPUT/met_test_scripts/grid_stat/grid_stat_120000L_20050807_120000V_dmap.txt \
    -lookin $TEST_OUTPUT/grid_stat/grid_stat_GRIB2_NAM_RTMA_NP2_120000L_20120409_120000V_vcnt.txt \
    -lookin $TEST_OUTPUT/met_test_scripts/ensemble_stat/ensemble_stat_20100101_120000V_rps.txt \
    -lookin $TEST_OUTPUT/ref_config/stat_analysis/sfc_ss_index_by_option.stat
  $SA_FIX -job filter -line_type ${TYPE} -dump_row dump_${TYPE}.txt \
    -lookin $TEST_OUTPUT/met_test_scripts/grid_stat/grid_stat_120000L_20050807_120000V_dmap.txt \
    -lookin $TEST_OUTPUT/grid_stat/grid_stat_GRIB2_NAM_RTMA_NP2_120000L_20120409_120000V_vcnt.txt \
    -lookin $TEST_OUTPUT/met_test_scripts/ensemble_stat/ensemble_stat_20100101_120000V_rps.txt \
    -lookin $TEST_OUTPUT/ref_config/stat_analysis/sfc_ss_index_by_option.stat
done

Using the NB20240515 executable ($SA_NB), writing a dump row file with -line_type set to VCNT, RPS, DMAP, or SSIDX results in an error in each case:

ERROR  : dump_stat_line() -> unexpected line type value VCNT
ERROR  : dump_stat_line() -> unexpected line type value RPS
ERROR  : dump_stat_line() -> unexpected line type value DMAP
ERROR  : dump_stat_line() -> unexpected line type value SSIDX
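
A quick way to confirm which line types failed is to pull them out of the log. The sketch below assumes the error text has exactly the form shown above and that the log is the test_sa.log written by the test command. The function name failed_line_types is made up for illustration and is not part of MET.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of MET): list the line types that
# dump_stat_line() rejected, one per line, sorted and de-duplicated.
# Assumes error lines look like:
#   ERROR  : dump_stat_line() -> unexpected line type value VCNT
failed_line_types() {
  local log_file="$1"
  # Keep only the matching error lines, then print the last field,
  # which is the offending line type.
  grep -F 'dump_stat_line() -> unexpected line type value' "$log_file" \
    | awk '{ print $NF }' \
    | sort -u
}
```

Running failed_line_types test_sa.log would then print each rejected line type once. An empty result means no such errors were logged.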

With the changes on my bugfix branch ($SA_FIX), however, they all produce non-empty dump files:

ls -lah dump_*.txt
-rw-rw-r-- 1 johnhg rap  17K May 15 22:54 dump_DMAP.txt
-rw-rw-r-- 1 johnhg rap 6.5K May 15 22:54 dump_RPS.txt
-rw-rw-r-- 1 johnhg rap 1.1K May 15 22:54 dump_SSIDX.txt
-rw-rw-r-- 1 johnhg rap  19K May 15 22:54 dump_VCNT.txt
-rw-rw-r-- 1 johnhg rap    0 May 15 22:44 dump_WDIR.txt
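
File size alone does not show that the header row was written. Below is a minimal sketch of a stricter check. It assumes that each dump file begins with the STAT header row, that the first column of that row is named VERSION, and that data rows carry the line type as a whitespace-delimited field. None of those details are stated in this issue, and the function name check_dump_file is hypothetical.

```shell
# Hypothetical check for one -dump_row file (not part of MET).
# Assumptions: line 1 is the header row and starts with VERSION;
# later rows contain the requested line type as a separate field.
check_dump_file() {
  local dump_file="$1" line_type="$2"
  # -s is true only when the file exists and is not empty.
  [ -s "$dump_file" ] || { echo "FAIL: $dump_file missing or empty"; return 1; }
  # The first whitespace-delimited field of line 1 should be VERSION.
  awk 'NR == 1 { exit ($1 == "VERSION" ? 0 : 1) }' "$dump_file" \
    || { echo "FAIL: $dump_file has no header row"; return 1; }
  # At least one data row should contain the requested line type.
  awk -v t="$line_type" '
    NR > 1 { for (i = 1; i <= NF; i++) if ($i == t) found = 1 }
    END    { exit (found ? 0 : 1) }' "$dump_file" \
    || { echo "FAIL: $dump_file has no $line_type rows"; return 1; }
  echo "OK: $dump_file"
}
```

For the files listed above, this could be called once per type, for example check_dump_file dump_VCNT.txt VCNT.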