epiforecasts / scoringutils

Utilities for Scoring and Assessing Predictions
https://epiforecasts.io/scoringutils/

Move computation of p-values out of `get_pairwise_comparisons()`? #750

Open nikosbosse opened 5 months ago

nikosbosse commented 5 months ago

Currently, three functions exist that do something related to pairwise comparisons:

Should the calculation of p-values and mean score ratios/relative skill scores be done by the same function?

Pro:

Contra:

In terms of currently suggested workflows we have the following:

For getting relative skill scores, you call `as_forecast(data) |> score() |> add_relative_skill()`.

For visualising mean score ratios, you call

```r
pairwise <- example_quantile |>
  as_forecast() |>
  score() |>
  get_pairwise_comparisons()

plot_pairwise_comparisons(pairwise)
```

For visualising p-values, you call

```r
plot_pairwise_comparisons(pairwise, type = "pval")
```
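For anyone less familiar with the two quantities under discussion, here is a minimal base R sketch of what a single pairwise comparison yields. The toy scores and variable names are made up for illustration. The paired Wilcoxon signed-rank test is an assumption about the kind of test behind the p-value, not a copy of the package internals.

```r
# Toy scores for two hypothetical models on the same five forecast targets
# (lower is better, as with most proper scoring rules).
scores_a <- c(2.4, 0.9, 3.7, 1.2, 2.05)
scores_b <- c(2.0, 1.0, 3.0, 1.0, 2.0)

# Mean score ratio: mean score of model A divided by mean score of model B.
# A value above 1 means model A scored worse on average.
mean_score_ratio <- mean(scores_a) / mean(scores_b)

# p-value from a paired test on the per-target scores
# (assumed test; other choices such as a permutation test are possible).
p_value <- wilcox.test(scores_a, scores_b, paired = TRUE)$p.value

print(mean_score_ratio)
print(p_value)
```

The two numbers come from the same pair of score vectors but answer different questions (how large the difference is versus whether it could be noise), which is why it is debatable whether one function should return both.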

We previously even had a nice plot that showed both p-values and mean score ratios in a single plot (using the upper and lower triangle), but that broke and we ditched it a while ago.


Options:

  1. leave everything as is for now.
  2. remove computation of p-values for now. Maybe rename `get_pairwise_comparisons()` to `get_score_ratios()`. Re-introduce functionality later.
  3. do a rewrite with two different workflows before the next CRAN release of version 2.0.0
  4. other

nikosbosse commented 5 months ago

@sbfnk @seabbs @nickreich @elray1 maybe you have thoughts or preferences as well?

sbfnk commented 5 months ago

  1. Could `get_pairwise_comparisons()` have a `metric` (or the like) option which could be `mean_score_ratio` (default) or `p_value`? The plot function could then plot whichever is there.
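
A rough base R sketch of what such an option could look like for a single pair of models. The helper `compare_pair()` and its `what` argument are hypothetical and not part of scoringutils. The paired Wilcoxon test is likewise only an assumed choice of test.

```r
# Hypothetical helper (not a scoringutils function): returns exactly one
# quantity per call, selected through the `what` argument.
compare_pair <- function(scores_x, scores_y,
                         what = c("mean_score_ratio", "p_value")) {
  # match.arg() picks the first choice as default and rejects unknown values
  what <- match.arg(what)
  switch(
    what,
    mean_score_ratio = mean(scores_x) / mean(scores_y),
    p_value = wilcox.test(scores_x, scores_y, paired = TRUE)$p.value
  )
}

# Toy scores for two hypothetical models on the same targets
x <- c(2.4, 0.9, 3.7, 1.2, 2.05)
y <- c(2.0, 1.0, 3.0, 1.0, 2.0)

ratio <- compare_pair(x, y)                    # default: mean score ratio
pval  <- compare_pair(x, y, what = "p_value")  # explicit request for p-value
```

With this design a plotting function would only ever see one kind of value, at the cost of calling the comparison twice when both are wanted.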

nikosbosse commented 5 months ago

> 1. Could `get_pairwise_comparisons()` have a `metric` (or the like) option which could be `mean_score_ratio` (default) or `p_value`? The plot function could then plot whichever is there.

It would get two metric arguments then :). My intuition is to prefer the status quo over that proposal for the following reasons:

seabbs commented 5 months ago

I think my preference is 1 (i.e. do nothing).

nikosbosse commented 4 months ago

ok. Moving this to a later release then.