epiforecasts / scoringutils

Utilities for Scoring and Assessing Predictions
https://epiforecasts.io/scoringutils/
Other
47 stars 19 forks source link

Decide on whether to keep the digits argument in `get_correlations()` #821

Closed nikosbosse closed 1 month ago

nikosbosse commented 1 month ago

As discussed in #791: Should we keep the digits argument in get_correlations()?

image

full function code:

#' @title Calculate correlation between metrics
#'
#' @description
#' Calculate the correlation between different metrics for a data.frame of
#' scores as produced by [score()].
#'
#' @param metrics A character vector with the metrics to show. If set to
#'   `NULL` (default), all metrics present in `scores` will be shown.
#' @param digits A number indicating how many decimal places the result should
#'   be rounded to. By default (`digits = NULL`) no rounding takes place.
#' @inheritParams get_pairwise_comparisons
#' @param ... Additional arguments to pass down to [cor()].
#' @return
#' An object of class `scores` (a data.table with an additional
#' attribute `metrics` holding the names of the scores) with correlations
#' between different metrics
#' @importFrom data.table setDT
#' @importFrom stats cor na.omit
#' @importFrom cli cli_warn
#' @importFrom checkmate assert_subset
#' @export
#' @keywords scoring
#' @examples
#' scores <- score(as_forecast(example_quantile))
#' get_correlations(scores, digits = 2)
get_correlations <- function(scores,
                             metrics = get_metrics(scores),
                             digits = NULL,
                             ...) {
  scores <- ensure_data.table(scores)
  assert_subset(metrics, colnames(scores), empty.ok = FALSE)
  df <- scores[, .SD, .SDcols = names(scores) %in% metrics]

  # define correlation matrix
  cor_mat <- cor(as.matrix(df), ...)

  if (!is.null(digits)) {
    cor_mat <- round(cor_mat, digits)
  }

  correlations <- new_scores(
    as.data.frame((cor_mat)),
    metrics = metrics,
    keep.rownames = TRUE
  )
  correlations <- copy(correlations)[, metric := rn][, rn := NULL]

  return(correlations[])
}

# define function to obtain upper triangle of matrix
get_lower_tri <- function(cormat) {
  cormat[lower.tri(cormat)] <- NA
  return(cormat)
}
nikosbosse commented 1 month ago

@seabbs @sbfnk @kathsherratt @toshiakiasakura @nickreich what do you think about this?

seabbs commented 1 month ago

I think I would vote for killing this

nikosbosse commented 1 month ago
image

@sbfnk what do you think?

sbfnk commented 1 month ago

It should go.

nikosbosse commented 1 month ago

No one likes the poor users here at scoringutils...

The people have spoken then!

seabbs commented 1 month ago

scoringutils being designed for a cat really explains a lot 😉

nikosbosse commented 1 month ago

Would anyone of you be in favour though to add the digits argument back to the plot_correlations function? I think ultimately you'd always want to plot rounded values. (Ironically, I think we originally had it there and then moved it to correlations())

toshiakiasakura commented 1 month ago

... (I will vote for getting it away.)

sbfnk commented 1 month ago

Would anyone of you be in favour though to add the digits argument back to the plot_correlations function? I think ultimately you'd always want to plot rounded values. (Ironically, I think we originally had it there and then moved it to correlations())

I think there's a better rationale there.