Closed — ktindiana closed this issue 9 months ago
Another subset could be the "max" (e.g. highest probability, highest peak flux) that matches an event. When monitoring the scoreboard I tend to pick those out.
Other subsets for exploring correct-rejection and false-alarm performance might tell us what is going on for each model, i.e. the source of and explanation for the metrics.
This is not clear. What kind of subset would give more insight?
This feature request will be considered closed with the implementation of the following metrics:

Implemented a feature in the code to calculate First, Last, Max, and Mean forecast assessments, when appropriate, for different quantities. Decided not to include Min.
The "First" feature applied to All Clear results in a contingency table with one forecast for each observed SEP event that fell inside a model's prediction window. This indicates whether a model hit or missed a given SEP event. This contingency table cannot assess Correct Negatives.
Metrics calculated for First, Last, Max, or Mean are only relevant for times when an SEP event was observed. All forecasts outside of an observed SEP event period are excluded. They should be interpreted as: "When an observed SEP event occurred, the model performed..."
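As a rough sketch of the "First" subsetting logic described above (the class and function names here are hypothetical, not the project's actual code), this keeps only the earliest-issued forecast whose prediction window contains each observed SEP event, then tallies hits and misses:

```python
from dataclasses import dataclass

# Hypothetical minimal forecast record; the real validation code differs.
@dataclass
class Forecast:
    issue_time: float     # forecast issue time (e.g. hours since an epoch)
    window_start: float   # prediction window start
    window_end: float     # prediction window end
    all_clear: bool       # True = model predicted "all clear" (no SEP event)

def first_forecast_per_event(forecasts, event_start_times):
    """For each observed SEP event, keep only the earliest-issued forecast
    whose prediction window contains the event start (the "First" subset)."""
    subset = {}
    for t_event in event_start_times:
        matches = [f for f in forecasts
                   if f.window_start <= t_event <= f.window_end]
        if matches:
            subset[t_event] = min(matches, key=lambda f: f.issue_time)
    return subset

def hits_and_misses(subset):
    """With one forecast per observed event, all_clear == False is a hit
    and all_clear == True is a miss. Correct Negatives and False Alarms
    cannot be evaluated from this subset, since non-event periods are
    excluded by construction."""
    hits = sum(1 for f in subset.values() if not f.all_clear)
    misses = sum(1 for f in subset.values() if f.all_clear)
    return hits, misses
```

For example, two forecasts covering the same event would be reduced to the one with the earlier issue time before filling the contingency table.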
From Leila: Metrics for one forecast per event (earliest and/or best and/or latest) give different information and are more readily understandable regarding performance for hits and misses - I think this is most relevant to what the operator/analyst sees in real time. Other interesting metrics could be found for different types of forecast subsets, such as dropping the "not clear" subset for UMASEP and other ongoing forecasts during events. Other subsets for exploring correct-rejection and false-alarm performance might tell us what is going on for each model, i.e. the source of and explanation for the metrics.