dtcenter / MET

Model Evaluation Tools
https://dtcenter.org/community-code/model-evaluation-tools-met
Apache License 2.0

Consider enhancing MET tools to allow for the definition of thresholds as percentiles rather than as explicit values. #76

Closed dwfncar closed 5 years ago

dwfncar commented 13 years ago

This task is to support two types of percentile thresholds.


The first type is when you select an absolute threshold for either the forecast or observation field. For the other field, you set a bias correction flag telling MET to select a threshold of the same type, but to pick the value that results in a bias as close to one as possible. This flag can be set in at most one of the fields. The FCST_THRESH and OBS_THRESH output columns should contain the actual threshold values used.


The second type is when you select a percentile threshold for the forecast and/or observation fields. The percentile thresholds are re-computed for each verification task. When percentile thresholds are used the FCST_THRESH and OBS_THRESH columns should contain something like ">P90".


Here are two additional issues:




These issues were discussed with Marion Mittermaier on Aug 1, 2014.


[MET-76] created by johnhg

dwfncar commented 13 years ago

On 10/31/2011, the Mesoscale Modelling task, led by Jamie Wolff, met and discussed the idea of using percentile thresholds in the context of fractions skill score. In the following paper, percentile thresholds are used to compute FSS with a frequency bias of 1:
  http://journals.ametsoc.org/doi/full/10.1175/2007MWR2123.1

In this meeting, two variants of how a percentile threshold could be defined were discussed:
(1) Treat the forecast and observation fields separately. In each field, choose the threshold that corresponds to the percentile of interest. Applying the same percentile to each field separately ensures that FBIAS = 1.
(2) Alternatively, choose an explicit threshold for one field. Convert that threshold into a percentile value, and then apply that percentile threshold to the other field. For example, choose a 1" APCP threshold for the obs field, and then let the forecast threshold vary so that FBIAS = 1.
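
The two variants above can be sketched in a few lines of Python. This is only an illustration, not MET code: the helper names (sample_percentile, percentile_rank, frequency_bias), the linear-interpolation percentile definition, the ">" event definition, and the matched-pair values are all assumptions made for the example.

```python
import math


def sample_percentile(values, pct):
    """Percentile by linear interpolation between order statistics (assumed method)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("empty sample")
    rank = (pct / 100.0) * (len(ordered) - 1)
    lo, hi = math.floor(rank), math.ceil(rank)
    return ordered[lo] + (rank - lo) * (ordered[hi] - ordered[lo])


def percentile_rank(values, thresh):
    """Percentage of the sample that is less than or equal to thresh."""
    return 100.0 * sum(1 for v in values if v <= thresh) / len(values)


def frequency_bias(fcst, obs, fcst_thresh, obs_thresh):
    """FBIAS = forecast event count / observed event count, with events as value > threshold."""
    n_fcst = sum(1 for f in fcst if f > fcst_thresh)
    n_obs = sum(1 for o in obs if o > obs_thresh)
    return n_fcst / n_obs


# Hypothetical matched pairs in which the forecast is wetter than the obs.
obs = [0.0, 0.2, 0.5, 1.0, 1.5, 2.0, 3.0, 5.0, 8.0, 12.0]
fcst = [0.1, 0.4, 0.9, 1.6, 2.2, 3.1, 4.5, 7.0, 11.0, 16.0]

# Same explicit threshold in both fields: the bias shows up in FBIAS.
fbias_explicit = frequency_bias(fcst, obs, 2.5, 2.5)

# Variant (1): the same percentile, computed separately in each field.
fbias_v1 = frequency_bias(
    fcst, obs, sample_percentile(fcst, 75), sample_percentile(obs, 75)
)

# Variant (2): explicit obs threshold, converted to a percentile for the forecast.
obs_pct = percentile_rank(obs, 2.5)
fbias_v2 = frequency_bias(fcst, obs, sample_percentile(fcst, obs_pct), 2.5)

print(fbias_explicit, fbias_v1, fbias_v2)
```

With this sample, the shared explicit threshold gives FBIAS above 1, while both percentile variants bring it back to 1. Tied values in real data can keep the result from being exactly 1.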

Two big issues are:
(1) When applying a percentile threshold, the explicit threshold value will vary based on the smoothing method and masking region applied. So the actual threshold value will need to be computed for each verification task.
(2) How should this threshold information be written out in the 21 header columns in the MET output? by johnhg

dwfncar commented 10 years ago

This was discussed during the GFS/NAM paper review on 6/11/14. Reviewers suggested that we remove the bias entirely before computing statistics. So the priority for this task has increased. by johnhg

dwfncar commented 10 years ago

For the second method, suggest adding a configurable option to the field setting of the config file - something like "debias_flag". For each verification task, this flag can be set for at most one of the forecast and observation fields. Accumulate the matched pairs and apply the threshold for the field where the flag is false. Then call some GSL functions to determine the threshold for the other field. Print a log message stating the resulting bias value and print a warning if that bias does not fall within some range. See attached Rscript for an example of doing this for GFS/NAM. by johnhg
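
A rough Python sketch of that procedure is below. It is not the attached Rscript and does not use GSL. It assumes the flag is set on the forecast field, that events are defined with ">", and that a simple sort is enough to find the threshold. The function name, the logger name, and the default tolerance of 0.05 are hypothetical.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("debias_sketch")  # hypothetical logger name


def debias_fcst_threshold(fcst, obs, obs_thresh, tolerance=0.05):
    """Pick a forecast threshold aiming for FBIAS = 1 given a fixed obs threshold.

    Events are defined as value > threshold. Returns (fcst_thresh, fbias).
    """
    if len(fcst) != len(obs):
        raise ValueError("matched pairs must have equal length")
    n_obs = sum(1 for o in obs if o > obs_thresh)
    if n_obs == 0 or n_obs == len(obs):
        raise ValueError("degenerate sample: no events or all events")

    # Sort forecasts from largest to smallest and place the threshold midway
    # between the n_obs-th and (n_obs + 1)-th largest values.
    ordered = sorted(fcst, reverse=True)
    fcst_thresh = 0.5 * (ordered[n_obs - 1] + ordered[n_obs])

    # Ties in the forecast values can still leave the bias away from 1.
    n_fcst = sum(1 for f in fcst if f > fcst_thresh)
    fbias = n_fcst / n_obs
    log.info("fcst threshold %.4g gives FBIAS = %.3f", fcst_thresh, fbias)
    if abs(fbias - 1.0) > tolerance:
        log.warning("FBIAS %.3f is outside 1 +/- %.2f", fbias, tolerance)
    return fcst_thresh, fbias


# Hypothetical matched pairs.
obs = [0.0, 0.2, 0.5, 1.0, 1.5, 2.0, 3.0, 5.0, 8.0, 12.0]
fcst = [0.1, 0.4, 0.9, 1.6, 2.2, 3.1, 4.5, 7.0, 11.0, 16.0]
thresh, bias = debias_fcst_threshold(fcst, obs, obs_thresh=2.5)
```

Here four obs values exceed 2.5, so the forecast threshold lands between the fourth and fifth largest forecast values.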

dwfncar commented 10 years ago

Example of computing percentile thresholds using the NetCDF matched pair output from Grid-Stat. by johnhg

dwfncar commented 8 years ago

We have a deliverable for SBU to provide this so please charge 775038 (note: check with Tara or Paul to confirm account key - you never know... it might change) by jensen

dwfncar commented 8 years ago

To be paid for by SBU and/or USWRP projects.

by johnhg

dwfncar commented 6 years ago

On 11/05/2018, this was again requested by the regional ensembles group for FY2018. by johnhg

dwfncar commented 5 years ago

Discussion with Lindsay on 2/19/2019 about USWRP.


(1) Let the user choose a percentile threshold independently for the fcst and obs datasets. For example, choose P90, P95, or P99 for both, use them to define events and compute categorical statistics.


Issues/questions:
Compute the percentile over the current set of matched pairs (i.e. the percentile of the sample).
Recommend that the FCST_THRESH and OBS_THRESH columns be formatted as:
"Threshold Type"SP"Percentile Chosen"("Actual Percentile Value")
Where...
 - Threshold Type is <, <=, ==, !=, >=, >, lt, le, eq, ne, ge, gt
 - SP indicates that this is the "Sample Percentile"
 - Percentile Chosen is an integer between 0 and 100
 - Actual Percentile Value is the value of the raw field corresponding to this percentile
For example: >=SP75(3.45)
With this, we'll need to enhance the MET tools to write (STAT tools) and read (Stat-Analysis) this format. We also need to enhance METviewer to parse this during loading.
We should also consider...
 (a) Adding new header columns for FCST_PERC_VALUE and OBS_PERC_VALUE but this seems like overkill.
 (b) Adding a new line type to store the actual percentile values.
              
We would like to enhance METviewer with a way to exclude cases where some minimum threshold was not met. For example, when using a P99 threshold on reflectivity, only include in the aggregation those cases where the obs P99 threshold value was >20. This is new METviewer development.
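
To make the proposed column format concrete, here is a small Python sketch that writes and parses strings like ">=SP75(3.45)" and applies the minimum-value filter described above. It is an illustration only, not how the MET tools or METviewer implement this. The names PercThresh, format_thresh, parse_thresh, and keep_case are hypothetical, and the "%g"-style number formatting is an assumption.

```python
import re
from typing import NamedTuple


class PercThresh(NamedTuple):
    op: str       # threshold type, e.g. ">=" or "ge"
    pct: int      # percentile chosen, 0 to 100
    value: float  # actual value of that percentile in the sample


# Two-character operators come first so "<=" is not read as "<".
_PATTERN = re.compile(
    r"^(<=|>=|==|!=|<|>|lt|le|eq|ne|ge|gt)SP(\d{1,3})\(([^()]+)\)$"
)


def format_thresh(op, pct, value):
    """Build the string written to FCST_THRESH or OBS_THRESH."""
    return f"{op}SP{pct}({value:g})"


def parse_thresh(text):
    """Split a sample-percentile threshold string into its three parts."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a sample-percentile threshold: {text!r}")
    pct = int(match.group(2))
    if not 0 <= pct <= 100:
        raise ValueError(f"percentile out of range: {pct}")
    return PercThresh(match.group(1), pct, float(match.group(3)))


def keep_case(text, min_value):
    """True when the actual percentile value exceeds the minimum."""
    return parse_thresh(text).value > min_value


# Hypothetical OBS_THRESH entries from two cases using P99 on reflectivity.
cases = [">SP99(25.5)", ">SP99(12.5)"]
kept = [c for c in cases if keep_case(c, 20.0)]
print(format_thresh(">=", 75, 3.45), kept)
```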


(2) Let the user choose a real value threshold for the fcst or obs and let MET choose the other threshold which would result in an unbiased comparison. Sounds like this hasn't been discussed for USWRP but we should discuss more thoroughly.
For example in the config file, use something like this:
   fcst = { cat_thresh = [ ==OBS, ==OBS ]; }
   obs = { cat_thresh = [ >=12.7, >=25.4 ]; }


(3) The third variant is defining percentile thresholds separately for each grid point relative to climatology. For example, FCST_THRESH = >=CP75 means...
For each grid point, get the climatological mean and standard deviation. Assuming normality, determine the 75th percentile of that distribution. Use that value to define the threshold (CP75) for that grid point. We likely wouldn't do this work for this JIRA issue, but it's related.
SP = sample percentile
CP = climatology percentile
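
Under the normality assumption, the CP threshold at a grid point is the climatological mean plus a standard normal quantile times the climatological standard deviation. A minimal Python sketch follows. The function name and the flat-list representation of the grid are hypothetical, and the input values are made up.

```python
from statistics import NormalDist


def climo_percentile_thresholds(climo_mean, climo_stdev, pct):
    """Per-grid-point threshold for the given percentile, assuming normality.

    climo_mean and climo_stdev are flat lists with one entry per grid point.
    """
    if not 0 < pct < 100:
        raise ValueError("pct must be strictly between 0 and 100")
    if len(climo_mean) != len(climo_stdev):
        raise ValueError("mean and stdev grids must have the same size")
    # Quantile of the standard normal distribution, about 0.674 for pct = 75.
    z = NormalDist().inv_cdf(pct / 100.0)
    return [m + z * s for m, s in zip(climo_mean, climo_stdev)]


# Hypothetical climatology at two grid points.
cp75 = climo_percentile_thresholds([10.0, 20.0], [2.0, 0.0], 75)
print(cp75)
```

A grid point with zero standard deviation simply gets its climatological mean as the threshold.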


Additional ideas to consider...

dwfncar commented 5 years ago

Notes from meeting on 02/20/2019:


(4) This is really option number 4. Have the user specify both the percentile used and the value to which that corresponds.

dwfncar commented 5 years ago

Email to Burkley:


Thanks for sending the FORTRAN code. It's pretty clear that you've specified a list of dates and run this code to compute percentiles over some period (i.e. season or experiment). Each MET tool is run at a single point in time, so the only data it knows about is the sample for that day. So there are a lot of details to clarify in the area of percentile thresholding. Here are the current variants we've discussed:


(1) Enhance Grid-Stat and Point-Stat to compute percentiles from the current sample of data.
In the config file, support thresholds like: cat_thresh = [ >=SFP75 ];
Where SFP means "Sample-Forecast-Percentile". Write to the MET output ">=SFP75(2.8)" where the value inside the parentheses is the 75th percentile of the forecast values for the current verification task. If you verify over different masking regions or using different smoothing methods, those values would change.
Also support "SOP" for Sample-Observation-Percentile and "SCP" for Sample-Climatology-Percentile, if a climo-mean is provided.


(2) Enhance Grid-Stat and Point-Stat to automatically bias-correct on the fly.
The user picks a real threshold, probably for the obs, like: cat_thresh = [ >12.7 ]; to threshold at 0.5" of precip.
In the other field, they set: cat_thresh = [ FBIAS1 ];
That triggers logic in Grid-Stat and Point-Stat to determine what threshold value should be used for the other field so that we get a frequency bias of 1.
Whatever those actual thresholds are, write them to the FCST_THRESH and OBS_THRESH columns of the output.


(3) Create a new utility in MET to help compute percentiles across multiple files just as you've done in your FORTRAN code. This tool will read one or more input gridded data files, optionally apply a user-specified masking area, and report the user-requested percentile values. Then take those values and specify them in the Grid-Stat and Point-Stat config files as: cat_thresh = [ >=UDP75(3.58) ]; where UDP means User-Defined-Percentile and 3.58 is what that percentile value actually is.
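
As a rough picture of what such a utility would do, the Python sketch below pools values from several grids, optionally applies a mask, and prints the resulting config entry. It is not an existing MET tool. The function names, the nested-list grid representation, and the use of Python's "inclusive" quantile method are assumptions, and the data values are made up.

```python
from statistics import quantiles


def pooled_percentile(grids, pct, mask=None):
    """Percentile of all values pooled across grids, optionally masked.

    grids is a list of 2D nested lists with identical shapes. mask is a 2D
    nested list of booleans (True = keep the point), or None for no masking.
    """
    if not 1 <= pct <= 99:
        raise ValueError("pct must be an integer from 1 to 99")
    pooled = []
    for grid in grids:
        for j, row in enumerate(grid):
            for i, value in enumerate(row):
                if mask is None or mask[j][i]:
                    pooled.append(value)
    if len(pooled) < 2:
        raise ValueError("need at least two values to compute a percentile")
    # quantiles() returns the 99 cut points P1..P99 when n=100.
    return quantiles(pooled, n=100, method="inclusive")[pct - 1]


def config_entry(op, pct, value):
    """Format the user-defined percentile for the Grid-Stat or Point-Stat config."""
    return f"cat_thresh = [ {op}UDP{pct}({value:g}) ];"


# Two hypothetical 2x3 grids (e.g. two valid times) and a mask dropping the last column.
grid1 = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]
grid2 = [[6.0, 7.0, 8.0], [9.0, 10.0, 11.0]]
mask = [[True, True, False], [True, True, False]]

p75 = pooled_percentile([grid1, grid2], 75, mask)
print(config_entry(">=", 75, p75))
```

Running the same data without the mask gives a different P75, which is why the masking area needs to match between the percentile step and the verification step.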


Ultimately, we need to work up the details of (3) into a METplus use-case. Since we're stringing two tools together, we need to put it in a use-case.


We want to make this new formatting convention for the output FCST_THRESH and OBS_THRESH columns consistent so that METviewer can be enhanced to parse this info and do sensible things with it.


(4) There is a 4th variant that is related but slightly different. Rather than computing percentiles summarized over space, compute them separately for each grid point. You could do that either by using the Series-Analysis tool or by specifying a climatological mean and standard deviation. These climatology-based thresholds would vary from grid point to grid point. But this would mostly be useful for global data, where climatology is a lot more relevant. We'd write these thresholds to the FCST_THRESH and OBS_THRESH output as ">CP75" for Climatology-Percentile.


Do you have any comments or feedback about how this would apply to your work?

by johnhg

dwfncar commented 5 years ago

John should charge 277053 (Regional Ensemble) for this work. Charge 56 hours prior to May 3, 2019 and Tatiana should charge 30 hours by May 3, 2019 as well. by johnhg

dwfncar commented 5 years ago

Support the following variants:


In the config file, user requests:
   >SFP75 ... sample forecast percentile ... and write to the output ">SFP75(2.5)".
   >SOP75 ... sample observation percentile ... and write to the output ">SOP75(2.5)".
   >SCP75 ... sample climatology percentile ... and write to the output ">SCP75(2.5)".
   >USP75(2.5) ... user-specified percentile ... and write to the output ">UDP75(2.5)".


Or requests de-biasing:
   obs.cat_thresh = >2.5 ... and write to the output ">2.5".
   fcst.cat_thresh = FBIAS1 ... to debias the forecast and write to the output ... "FBIAS1".


For later...
   >CDP75 ... climatological distribution percentile (differs grid point by grid point)

by johnhg

dwfncar commented 5 years ago

NOAA-GSD requested that we make the percentile warning tolerance configurable. by johnhg

dwfncar commented 5 years ago

We should have Tressa and Randy discuss/clarify the actual percentile method being employed here. by johnhg

dwfncar commented 5 years ago

Merged changes from met_feature_76_perc_thresh into the trunk on 4/15. Changes for gen_vx_mask, point_stat, grid_stat, wavelet_stat, ensemble_stat, mode, and stat_analysis.


Still need to...
(1) Update the documentation.
(2) Add a new unit_perc_thresh.xml. by johnhg

dwfncar commented 5 years ago

Added support in met-8.1 to handle these variants:

SFP75
SOP75
SCP75
USP75(2.5)
==FBIAS1
And added unit tests for this. by johnhg