Closed dwfncar closed 7 years ago
Should consider if this functionality should live in METViewer only or also in STAT-Analysis. However, it would be a shame to develop duplicate functionality. by johnhg
This has become funded - via NGGPS funds - and will be a priority for release this fall. We should explore if GSD node could help with this. by jensen
This will require a significant level of effort and more detailed specification.
During John and Tara's visit to NCEP in November 2014, NCEP requested an easy way to generate a "scorecard" summarizing the performance of one model against another. This is very similar to the significance tables that the DTC mesoscale modelling group produces by post-processing the points files METViewer creates when plotting. It would require a new METViewer plot template type in which the user selects exactly two models to compare, followed by a list of the table elements (variable/level/statistic) to include in the table. These tables are typically organized by lead time, so the user would specify the lead times of interest. However, the table could be organized by something else, such as vx_mask for 14 subregions or vertical level for upper-air plots. In effect, this is the independent variable.
Define a top-level fixed values section that applies to all variable/level/statistic choices, but also allow for additional fixed values for each element. For example, you might include GSS for 24-hour precip thresholded >25.4, but also include GSS thresholded >=50.8. Those thresholds would be set in the fixed data section for each element.
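To make the fixed-values idea concrete, here is a minimal sketch of how a scorecard specification might combine a top-level fixed section with per-element overrides. The dictionary layout, the key names (models, indep_var, fixed, elements) and the helper effective_fixed are all hypothetical; none of this is an existing METViewer template format.

```python
# Hypothetical scorecard specification. Every key name here is an
# illustrative assumption, not an existing METViewer template format.
scorecard_spec = {
    "models": ["MODEL_A", "MODEL_B"],        # exactly two models to compare
    "indep_var": "fcst_lead",                # independent variable (assumed name)
    "indep_vals": [24, 48, 72],              # lead times of interest
    "fixed": {"vx_mask": "CONUS"},           # applies to every element
    "elements": [
        {"fcst_var": "APCP_24", "stat": "GSS", "fixed": {"fcst_thresh": ">25.4"}},
        {"fcst_var": "APCP_24", "stat": "GSS", "fixed": {"fcst_thresh": ">=50.8"}},
    ],
}


def effective_fixed(spec, element):
    """Merge top-level fixed values with an element's own fixed values.

    Element-level values win when both define the same key.
    """
    merged = dict(spec["fixed"])             # copy so the spec is not mutated
    merged.update(element.get("fixed", {}))
    return merged


resolved = [effective_fixed(scorecard_spec, e) for e in scorecard_spec["elements"]]
```

Each entry of resolved holds the complete set of fixed values that would constrain the database query for that table element.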
For each element, query the METViewer database for the two specified models, apply event equalization, and compute a pairwise difference. Threshold the p-value from the significance test for the pairwise difference to indicate which model is better. NCEP uses a large green upward triangle for superior performance at the 99.9% level, a small green upward triangle for superior performance at the 99% level, a shaded green box for superior performance at the 95% level, and a gray box for no significant difference in performance. Worse performance is plotted the same way, using downward-pointing triangles and red shading.
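As a rough illustration of the event equalization and pairwise difference steps, the sketch below keeps only the cases present for both models and differences the statistic case by case. The case key (initialization date, lead time), the sample values and the function name equalize_and_diff are assumptions for illustration; the database query and the significance test itself are left out.

```python
from statistics import mean


def equalize_and_diff(stats_a, stats_b):
    """Keep only the cases both models have, then difference them.

    stats_a / stats_b map a case key to a statistic value. The key used
    here, (init date, lead time), is an assumption for illustration.
    """
    common = sorted(set(stats_a) & set(stats_b))      # event equalization
    diffs = [stats_a[k] - stats_b[k] for k in common]  # pairwise A minus B
    return common, diffs


# Made-up values standing in for the result of a database query.
model_a = {("2016-01-01", 24): 0.5, ("2016-01-02", 24): 0.625, ("2016-01-03", 24): 0.75}
model_b = {("2016-01-01", 24): 0.25, ("2016-01-03", 24): 0.5, ("2016-01-04", 24): 0.125}

cases, diffs = equalize_and_diff(model_a, model_b)
mean_diff = mean(diffs)  # the quantity the significance test would be applied to
```

Cases available for only one model (here 2016-01-02 and 2016-01-04) are dropped before differencing, so both models are scored on the same set of events.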
Ideally, the user could define how to threshold the p-values to indicate better/worse performance and specify what symbols/colors should be used.
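One way to picture user-defined thresholds is a small lookup that maps a pairwise difference and its p-value to a table cell. In this sketch the defaults mirror the NCEP convention described above (99.9%, 99%, 95%), and the levels argument lets a user supply different cutoffs or symbols. The function name classify, the tuple layout, the strict "p < cutoff" comparison and the higher_is_better flag are assumptions, not an existing METViewer interface.

```python
# Default significance levels as (maximum p-value, symbol) pairs,
# mirroring the NCEP convention: 99.9%, 99% and 95%.
DEFAULT_LEVELS = [
    (0.001, "large triangle"),
    (0.01, "small triangle"),
    (0.05, "shaded box"),
]


def classify(diff, p_value, higher_is_better=True, levels=DEFAULT_LEVELS):
    """Return (symbol, color, orientation) for one scorecard cell.

    diff is the statistic for model A minus model B. The strict
    'p_value < cutoff' comparison is an assumption of this sketch.
    """
    if diff != 0:
        for max_p, symbol in sorted(levels):   # most stringent level first
            if p_value < max_p:
                better = (diff > 0) == higher_is_better
                color = "green" if better else "red"
                orientation = None
                if "triangle" in symbol:
                    orientation = "up" if better else "down"
                return (symbol, color, orientation)
    # No level reached (or no difference at all): not significant.
    return ("box", "gray", None)


cell_better = classify(0.1, 0.0005)
cell_worse = classify(-0.1, 0.005)
cell_custom = classify(0.1, 0.08, levels=[(0.1, "shaded box")])
```

For statistics where smaller values are better (an error measure, for example), passing higher_is_better=False flips which sign of the difference counts as superior.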
An example may be found at: http://www.emc.ncep.noaa.gov/gmb/wx24fy/vsdb/gfs2016/
[MET-462] created by johnhg