epiforecasts / scoringutils

Utilities for Scoring and Assessing Predictions
https://epiforecasts.io/scoringutils/

Check quantile prediction order helper function #895

Open jonathonmellor opened 2 months ago

jonathonmellor commented 2 months ago

Hello!

I've started exploring the development version of this package (rather than the CRAN release). All going well so far, great work!

Extremely minor helper function request - could the enforcement of the quantile / prediction ordering here be added as a helper function?

In cases where you're not using standard quantiles to generate your intervals and are working with small sample sizes, this condition isn't strictly held, so a user-facing check may be useful.

Very low priority / impact, but hopefully something to consider.

I suppose there might be a question as to why this behaviour is within bias_quantile_single_vector, but that's perhaps another matter. Are the quantiles used in this function only for the interpolation? If so, might the check fit better within that section? In my use case I am giving a median prediction, but the bias calculation fails due to issues at my quantile == 0.1 values, for example.

jonathonmellor commented 2 months ago

This is probably a "Jon your model is bad" type situation, but just flagging the behaviour.

nikosbosse commented 2 months ago

Ah yes, I remember us previously having discussions on whether or not to enforce this and how to deal with it.

bias_quantile() requires increasing quantiles because you're trying to find the lowest/highest quantile that's above/below the true value. If your quantile levels and quantiles look like this:

0.2 - 4
0.4 - 8
0.6 - 7
0.8 - 9

then it's unclear how to compute bias. Note that it doesn't require your quantile levels to be ordered, as long as the corresponding quantiles are increasing. I.e. this is fine:

0.4 - 8
0.2 - 4
0.8 - 9
0.6 - 8.5

(there is this line:

order <- order(quantile_level)
predicted <- predicted[order]

)
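
To make the two cases concrete, here is a quick base-R sketch (just illustrative variable names, nothing from the package internals):

quantile_level <- c(0.4, 0.2, 0.8, 0.6)
predicted <- c(8, 4, 9, 8.5)

# Reordering by quantile level (as the line above does) recovers an
# increasing sequence, so this forecast is fine:
ord <- order(quantile_level)
is.unsorted(predicted[ord])  # FALSE: quantiles are increasing once sorted

# The first example stays non-monotonic even after sorting,
# so the bias is ill-defined:
predicted_bad <- c(4, 8, 7, 9)
quantile_level_bad <- c(0.2, 0.4, 0.6, 0.8)
is.unsorted(predicted_bad[order(quantile_level_bad)])  # TRUE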

Back then we decided to not enforce increasing quantiles generally, and just do it here.

We could (and probably should?) enforce this everywhere. But you can compute the other metrics without that.

How exactly were you thinking about the helper function?

Note: I'd give it like a 30% chance that I misunderstood your intent or misrepresented what's currently happening. I'll dig deeper into this, I'm just currently a bit swamped with things.

jonathonmellor commented 2 months ago

Thanks - that explanation makes a lot of sense as to why it's in there but not applied more generally!

For context, I am exploring different approaches to generating prediction intervals from samples. In some approaches, at small time horizons the estimation of the interval is poor, so ordering isn't guaranteed between the upper or lower bounds of intervals of different sizes. For example, the lower bound of a 90% interval may be higher than the lower bound of an 80% interval (a problem with the method chosen, not the scoring package!).

In this case, where I have poorly ordered quantiles, it would be helpful to flag the issue before hitting bias_quantile(), as for the user that check is quite nested within other scoringutils functions (from memory I would have run checks -> set forecast unit -> score before this warning flags).

If quantile prediction order is not to be enforced for all metrics (which seems sensible, given it's not needed for all of them), it would be useful to pull out a helper function that checks quantile order, perhaps run after setting the forecast unit. At the moment, if I have dodgy quantiles and try to fix them, I either write my own custom checker (fine!) or keep rerunning the scoring to see if it works. Given the algorithms involved, whether there was unwanted overlap was stochastic, which was a bit painful.

For the implementation, I think it would work like check_duplicates where it flags the rows where there are problematic records. My quick and dirty dplyr version was:

scoring_ready_data |>
  # perhaps not needed if using forecast_units?
  dplyr::group_by(date, identifier, response_scale) |>
  # sort within each group so differences are taken across quantile levels
  dplyr::arrange(quantile_level, .by_group = TRUE) |>
  dplyr::mutate(diff = c(NA, diff(predicted))) |>
  # rows where the predicted value drops as the quantile level increases
  dplyr::filter(diff < 0)

One could instead take a lag on the quantile_level, which highlights the previous quantile (the one that's actually problematic), or take the difference with the rows arranged in the opposite order.
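
For example, something along these lines (same made-up column names as above) would show both rows involved in each crossing:

scoring_ready_data |>
  dplyr::group_by(date, identifier, response_scale) |>
  dplyr::arrange(quantile_level, .by_group = TRUE) |>
  dplyr::mutate(
    previous_quantile_level = dplyr::lag(quantile_level),
    previous_predicted = dplyr::lag(predicted)
  ) |>
  # keep rows where the predicted value dropped relative to the previous quantile level
  dplyr::filter(predicted < previous_predicted)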

In summary, completely makes sense why the condition is in there for bias_quantile, would be great to check forecasting units for this via the package before running scoring.

nikosbosse commented 2 months ago

To me it would make sense to throw a warning when this happens as we check the forecast object (probably as part of assert_forecast.forecast_quantile()?). It feels like users should be made aware of the fact that their quantile forecasts are malformed.

If you're willing to give it a go, we'd be very happy about a PR, @jonathonmellor.

If we feel the warning becomes unwieldy, we could also use a warning message similar to this one: https://github.com/epiforecasts/scoringutils/blob/c20e2aa7c2e287a1f6867ef3914f597cdfd3c7ba/R/forecast.R#L496
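
For reference, a rough sketch of what such a check could look like (purely illustrative, not the actual package internals; check_quantile_order and the warning text are made up here):

# Hypothetical helper: return the rows where the predicted value decreases
# as the quantile level increases within a forecast unit.
check_quantile_order <- function(forecast, forecast_unit) {
  forecast <- data.table::as.data.table(forecast)
  data.table::setorderv(forecast, c(forecast_unit, "quantile_level"))
  issues <- forecast[, .SD[c(FALSE, diff(predicted) < 0)], by = forecast_unit]
  if (nrow(issues) > 0) {
    warning("Predicted values are not increasing with quantile_level for some forecasts.")
  }
  issues[]
}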

jonathonmellor commented 2 months ago

I will pencil in some time next week to have a go! Fair warning: while I'm confident with R in general, I'm not experienced with R object classes or data.table.

seabbs commented 2 months ago

To me it would make sense to throw a warning when this happens when we check the forecast object (probably as part of assert_forecast.forecast_quantile()?)

I think this sounds like a good idea. This being a stochastic issue sounds very very annoying to me.

Thanks for the report @jonathonmellor!