epiforecasts / scoringutils

Utilities for Scoring and Assessing Predictions
https://epiforecasts.io/scoringutils/
Other
48 stars 20 forks source link

pairwise_comparison() outputs both NA and NaN #140

Closed Bisaloo closed 2 years ago

Bisaloo commented 2 years ago

Reprex

d <- tempfile()

download.file(
  "https://gist.githubusercontent.com/Bisaloo/62615a3821da71b8f542f5d5bc096327/raw/1415d04901e8168ceed2ca1f554b8799cf3ba6ac/reprex_NA_NaN_scoringutils.txt",
  d
)

df <- dget(d)

library(tidyverse)
library(scoringutils)
#> Note: The definition of the weighted interval score has slightly changed in version 0.1.5. If you want to use the old definition, use the argument `count_median_twice = TRUE` in the function `eval_forecasts()`
#> 
#> Attaching package: 'scoringutils'
#> The following object is masked from 'package:purrr':
#> 
#>     update_list

suppressWarnings({
  df %>%
    filter(n_quantiles == 23) %>%
    select(model, target_variable, horizon, location, location_name,
           forecast_date, interval_score = wis) %>%
    pairwise_comparison(
      metric = "interval_score",
      baseline = "EuroCOVIDhub-baseline",
      by = c("model", "target_variable", "forecast_date", "horizon",
             "location", "location_name"),
      summarise_by = c("model", "target_variable", "horizon", "location")
    ) %>%
    select(model, target_variable, horizon, location, rel_wis = scaled_rel_skill) %>%
    distinct()
})
#>                    model target_variable horizon location rel_wis
#> 1: EuroCOVIDhub-baseline       inc death       3       IS     NaN
#> 2: EuroCOVIDhub-ensemble       inc death       3       IS      NA
#> 3: EuroCOVIDhub-ensemble       inc death       3       IS     NaN
#> 4:               ILM-EKF       inc death       3       IS     NaN
#> 5:            MUNI-ARIMA       inc death       3       IS     NaN
#> 6:    RobertWalraven-ESG       inc death       3       IS     NaN
#> 7:       UMass-MechBayes       inc death       3       IS     NaN
#> 8:         USC-SIkJalpha       inc death       3       IS     NaN
#> 9:  epiforecasts-EpiNow2       inc death       3       IS       0

Created on 2021-11-10 by the reprex package (v2.0.1.9000)

As you can see here, we have two almost identical lines for EuroCOVIDhub-ensemble excepted that one is with NA and the other with NaN. The fact that a model can have more than one row cause issues in downstream analyses in our cases.

Description of the problem

It looks like pairwise_comparison() sometimes returns NA and sometimes NaN when it cannot compute the value. This leads to confusion because both NA and NaN indicate almost the same thing but they have strange incompatibilites:

identical(NA, NaN)
#> [1] FALSE

NA + NaN
#> [1] NA

NaN + NA
#> [1] NaN

Created on 2021-11-10 by the reprex package (v2.0.1.9000)

Proposed solution

NaN is rarely used and confusing (IMO). They often appear because the function doesn't control the output for errors/impossible computations. And they can cause serious issues in downstream analyses (such as in our case). Tomas Kalibera gives a good overview of the hell that is NA vs NaN:

The result of binary operations involving NA and NaN is hardware dependent (the propagation of NaN payload) - on some hardware, it actually works the way we would like - NA is returned - but on some hardware you get NaN or sometimes NA and sometimes NaN. Also there are C compiler optimizations re-ordering code, as mentioned in ?NaN. Then there are also external numerical libraries that do not distinguish NA from NaN (NA is an R concept)

So we should stick to only one of these. As far as I know, NA always propagates as NA while f(NaN) can return NA or NaN depending on f() so a conscious choice to always output NA would be much better / clearer IMO.

Bisaloo commented 2 years ago

Ah, I was so slow to post this issue that there is already a fix: https://github.com/epiforecasts/scoringutils/pull/139.

sbfnk commented 2 years ago

Good to have the problem laid out in detail, though - I chose to make everything NaN but it sounds like we're better of with NA.

For what it's worth the problem in our case was:

> dt <- data.table(id = c(rep(1, 3), rep(2, 3), rep(3, 3)), a = c(as.vector(replicate(3, c(NA_real_, rep(rnorm(1), 2))))))
> dt[, a := unique(na.omit(a)), by = id]
> unique(dt)
   id          a
1:  1 0.05785241
2:  2 0.05029081
3:  3 0.57708069
> dt <- data.table(id = c(rep(1, 3), rep(2, 3), rep(3, 3)), a = c(as.vector(replicate(2, c(NA_real_, rep(rnorm(1), 2)))), c(NA_real_, rep(NaN, 2))))
> dt[, a := unique(na.omit(a)), by = id]
> unique(dt)
   id           a
1:  1  0.09825976
2:  2 -1.06038560
3:  3          NA
4:  3         NaN

When saving to csv the second data table leads to a duplicated row.

nikosbosse commented 2 years ago

Hm wouldn't it be best to assign NA_real to theta in #139?

Bisaloo commented 2 years ago

It has been changed after (but directly to master): https://github.com/epiforecasts/scoringutils/commit/24b1efc083ea0d4096855f1b27dba9c1d7a98052

nikosbosse commented 2 years ago

ah. perfect!