USGS-R / protoloads

Prototyping and exploring options for broad-scale load forecasting

fig/tbl: contingency table, false alarm ratio, other forecast skill metrics #59

Closed aappling-usgs closed 6 years ago

aappling-usgs commented 6 years ago

The goal is to evaluate the model's ability to correctly forecast threshold exceedance events. We'll need to pick some arbitrary threshold (be consistent with #58).

Maybe show one contingency table, plus many scores broken out by month, high/low flow, or lead-time bin.

Options for metrics: http://www.eumetrain.org/data/4/451/english/msg/ver_categ_forec/uos3/uos3_ko1.htm, https://en.wikipedia.org/wiki/Forecast_skill
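As a concrete illustration of the metrics linked above, here is a minimal sketch (not the project's actual code; function names, toy data, and the threshold value are all made up for illustration) of building a 2x2 contingency table from paired forecasts and observations and computing the false alarm ratio and probability of detection:

```python
def contingency_table(forecast, observed, threshold):
    """Count hits, false alarms, misses, and correct negatives
    for a 'did the value exceed the threshold?' forecast."""
    hits = false_alarms = misses = correct_negs = 0
    for f, o in zip(forecast, observed):
        f_yes, o_yes = f >= threshold, o >= threshold
        if f_yes and o_yes:
            hits += 1
        elif f_yes and not o_yes:
            false_alarms += 1
        elif not f_yes and o_yes:
            misses += 1
        else:
            correct_negs += 1
    return hits, false_alarms, misses, correct_negs

def false_alarm_ratio(hits, false_alarms):
    """FAR = false alarms / all 'yes' forecasts (0 is perfect)."""
    return false_alarms / (hits + false_alarms)

def probability_of_detection(hits, misses):
    """POD = hits / all observed 'yes' events (1 is perfect)."""
    return hits / (hits + misses)

# Toy flux-like values with an arbitrary threshold of 4.0
fcst = [3.1, 5.2, 4.8, 1.0, 6.3, 2.2]
obs = [2.9, 5.5, 3.9, 1.2, 6.0, 4.5]
a, b, c, d = contingency_table(fcst, obs, threshold=4.0)
# a, b, c, d == (2, 1, 1, 2): 2 hits, 1 false alarm, 1 miss, 2 correct negatives
```

The same four counts feed all the categorical scores on the EUMeTrain page, so the table only needs to be built once per month / flow class / lead-time bin.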

Whiteboarded notes: [photo: img_20180504_110535126]

jordansread commented 6 years ago

Cool!

jzwart commented 6 years ago

I'm using the Heidke Skill Score to assess model performance for forecasting a threshold exceedance ('yes' or 'no') that we arbitrarily set for nitrate flux. Some notes on the Heidke Skill Score (HSS):

The HSS measures the fractional improvement of the forecast over a standard (chance) forecast. Like most skill scores, it is normalized by the total range of possible improvement over the standard, which means Heidke Skill Scores can safely be compared across different datasets. The range of the HSS is -∞ to 1. Negative values indicate that the chance forecast is better, 0 means no skill, and a perfect forecast obtains an HSS of 1.
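The definition above can be sketched directly from the contingency-table counts. This is an illustrative implementation of the standard HSS formula, not code from this repo; the count labels (a = hits, b = false alarms, c = misses, d = correct negatives) follow the usual verification convention:

```python
def heidke_skill_score(hits, false_alarms, misses, correct_negs):
    """HSS = 2(ad - bc) / [(a+c)(c+d) + (a+b)(b+d)],
    where a = hits, b = false alarms, c = misses, d = correct negatives.
    1 = perfect forecast, 0 = no skill over chance, negative = worse
    than chance."""
    a, b, c, d = hits, false_alarms, misses, correct_negs
    denom = (a + c) * (c + d) + (a + b) * (b + d)
    return 2 * (a * d - b * c) / denom

print(heidke_skill_score(5, 0, 0, 5))  # perfect forecast -> 1.0
print(heidke_skill_score(2, 2, 2, 2))  # no better than chance -> 0.0
```

The denominator is the chance-normalization term, which is what makes scores comparable across rivers or lead-time bins with different event frequencies.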

For the first two panels, HSS improves as lead time decreases; the last panel shows that HSS is highest at pretty far-out lead times, which is similar to the relative flux error plots in #55. The last panel also has an HSS above 0 for all lead times, which I think indicates that for this river (Mississippi R.) the forecasts have at least some skill beyond chance.

[figure: HSS vs. lead time, three panels]

jordansread commented 6 years ago

I really like this updated way of displaying the model skill. It seems clearer to me than the boxplot version.

aappling-usgs commented 6 years ago

Agreed, this is cool!