excessive warning for integrated metrics using iterative search and racing methods

topepo commented 10 months ago

`fit_resamples()`, `last_fit()`, and the `tune_*()` functions all compute performance metrics for models. For survival analysis, there is the added complication that some metrics are dynamic; they are computed over 1+ evaluation times. Metric sets can include static metrics (no eval times), dynamic metrics (one metric for each evaluation time), and integrated metrics (one metric computed but requires 2+ times as input).

If a user specifies a metric set that does not contain dynamic or integrated metrics but passes evaluation times, they should get this warning:

> Evaluation times are only required when dynmanic or integrated metrics are selected as the primary metric (and will be ignored).

(they do)

However, to complicate things more, some of the tuning functions (Bayes, sim annealing, and racing) use a single metric as an objective function to optimize. The convention is that the first metric in a metric set is the one being optimized, and if the metric is dynamic, the first evaluation time defines the objective function. Other evaluation times can still be passed. What happens depends on the first metric:

- **static**: eval times are not used for optimization.
- **integrated**: we use them all (as usual).
- **dynamic**: we compute the other eval time results, but only the metric associated with the first time is used for optimization.
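To make the three cases concrete, here is a minimal base R sketch of the convention described above. The helper `objective_eval_times()` is hypothetical and not part of tune; the class name `"dynamic_survival_metric"` is assumed by analogy with the two class names that appear in the tune code quoted later in this issue.

``` r
# Hypothetical helper (not part of tune): which evaluation times feed the
# objective function, given the class of the first metric in the metric set.
objective_eval_times <- function(first_metric_class, eval_time) {
  eval_time <- unique(eval_time)
  switch(
    first_metric_class,
    # static: eval times are not used for optimization
    static_survival_metric = numeric(0),
    # integrated: all eval times are used
    integrated_survival_metric = eval_time,
    # dynamic: only the first eval time defines the objective
    dynamic_survival_metric = eval_time[1],
    stop("Unknown metric class: ", first_metric_class)
  )
}

time_points <- c(10, 1, 5, 15)
objective_eval_times("static_survival_metric", time_points)
objective_eval_times("integrated_survival_metric", time_points)
objective_eval_times("dynamic_survival_metric", time_points)
```

Note that "first" means first as supplied, not the smallest time, so with the `time_points` used in the reprex below the dynamic case would optimize at time 10.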

When an integrated metric is used for optimization, we are incorrectly issuing the warning:

> Evaluation times are only required when dynmanic or integrated metrics are selected as the primary metric (and will be ignored).

(Also, the message misspells "dynamic" as "dynmanic" :rolls-eyes:)

reprex after running `pak::pak(paste0("tidymodels/", c("tune", "censored", "yardstick")), ask = FALSE)`:

``` r
library(tidymodels)
library(censored)
#> Loading required package: survival
library(prodlim)
# also required glmnet

tidymodels_prefer()
theme_set(theme_bw())
options(pillar.advice = FALSE, pillar.min_title_chars = Inf)

set.seed(1)
sim_dat <- prodlim::SimSurv(500) %>%
  mutate(event_time = Surv(time, event)) %>%
  select(event_time, X1, X2)

set.seed(2)
split <- initial_split(sim_dat)
sim_tr <- training(split)
sim_te <- testing(split)
sim_rs <- vfold_cv(sim_tr)

time_points <- c(10, 1, 5, 15)

mod_spec <-
  proportional_hazards(penalty = tune()) %>%
  set_engine("glmnet") %>%
  set_mode("censored regression")

sint_mtrc <- metric_set(brier_survival_integrated)

set.seed(2193)
bayes_integrated_res <-
  mod_spec %>%
  tune_bayes(
    event_time ~ X1 + X2,
    resamples = sim_rs,
    iter = 2,
    metrics = sint_mtrc,
    eval_time = time_points,
    initial = 4
  )
#> Warning: Evaluation times are only required when dynmanic or integrated metrics are
#> selected as the primary metric (and will be ignored).
```

Created on 2024-01-09 with reprex v2.0.2

Session info

``` r
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.2 (2023-10-31)
#>  os       macOS Sonoma 14.2.1
#>  system   aarch64, darwin20
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       America/New_York
#>  date     2024-01-09
#>  pandoc   3.1.11 @ /opt/homebrew/bin/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package      * version    date (UTC) lib source
#>  backports      1.4.1      2021-12-13 [1] CRAN (R 4.3.0)
#>  broom        * 1.0.5      2023-06-09 [1] CRAN (R 4.3.0)
#>  cachem         1.0.8      2023-05-01 [1] CRAN (R 4.3.0)
#>  censored     * 0.2.0.9001 2024-01-09 [1] Github (tidymodels/censored@6f38052)
#>  class          7.3-22     2023-05-03 [2] CRAN (R 4.3.2)
#>  cli            3.6.2      2023-12-11 [1] CRAN (R 4.3.1)
#>  codetools      0.2-19     2023-02-01 [2] CRAN (R 4.3.2)
#>  colorspace     2.1-0      2023-01-23 [1] CRAN (R 4.3.0)
#>  conflicted     1.2.0      2023-02-01 [1] CRAN (R 4.3.0)
#>  data.table     1.14.10    2023-12-08 [1] CRAN (R 4.3.1)
#>  dials        * 1.2.0      2023-04-03 [1] CRAN (R 4.3.0)
#>  DiceDesign     1.10       2023-12-07 [1] CRAN (R 4.3.2)
#>  digest         0.6.33     2023-07-07 [1] CRAN (R 4.3.0)
#>  dplyr        * 1.1.4      2023-11-17 [1] CRAN (R 4.3.1)
#>  evaluate       0.23       2023-11-01 [1] CRAN (R 4.3.1)
#>  fansi          1.0.6      2023-12-08 [1] CRAN (R 4.3.1)
#>  fastmap        1.1.1      2023-02-24 [1] CRAN (R 4.3.0)
#>  foreach        1.5.2      2022-02-02 [1] CRAN (R 4.3.0)
#>  fs             1.6.3      2023-07-20 [1] CRAN (R 4.3.0)
#>  furrr          0.3.1      2022-08-15 [1] CRAN (R 4.3.0)
#>  future         1.33.1     2023-12-22 [1] CRAN (R 4.3.1)
#>  future.apply   1.11.1     2023-12-21 [1] CRAN (R 4.3.1)
#>  generics       0.1.3      2022-07-05 [1] CRAN (R 4.3.0)
#>  ggplot2      * 3.4.4      2023-10-12 [1] CRAN (R 4.3.1)
#>  glmnet       * 4.1-8      2023-08-22 [1] CRAN (R 4.3.0)
#>  globals        0.16.2     2022-11-21 [1] CRAN (R 4.3.0)
#>  glue           1.6.2      2022-02-24 [1] CRAN (R 4.3.0)
#>  gower          1.0.1      2022-12-22 [1] CRAN (R 4.3.0)
#>  GPfit          1.0-8      2019-02-08 [1] CRAN (R 4.3.0)
#>  gtable         0.3.4      2023-08-21 [1] CRAN (R 4.3.0)
#>  hardhat        1.3.0      2023-03-30 [1] CRAN (R 4.3.0)
#>  htmltools      0.5.7      2023-11-03 [1] CRAN (R 4.3.1)
#>  infer        * 1.0.5      2023-09-06 [1] CRAN (R 4.3.0)
#>  ipred          0.9-14     2023-03-09 [1] CRAN (R 4.3.0)
#>  iterators      1.0.14     2022-02-05 [1] CRAN (R 4.3.0)
#>  knitr          1.45       2023-10-30 [1] CRAN (R 4.3.1)
#>  lattice        0.22-5     2023-10-24 [1] CRAN (R 4.3.1)
#>  lava           1.7.3      2023-11-04 [1] CRAN (R 4.3.1)
#>  lhs            1.1.6      2022-12-17 [1] CRAN (R 4.3.0)
#>  lifecycle      1.0.4      2023-11-07 [1] CRAN (R 4.3.1)
#>  listenv        0.9.0      2022-12-16 [1] CRAN (R 4.3.0)
#>  lubridate      1.9.3      2023-09-27 [1] CRAN (R 4.3.1)
#>  magrittr       2.0.3      2022-03-30 [1] CRAN (R 4.3.0)
#>  MASS           7.3-60     2023-05-04 [2] CRAN (R 4.3.2)
#>  Matrix       * 1.6-4      2023-11-30 [1] CRAN (R 4.3.2)
#>  memoise        2.0.1      2021-11-26 [1] CRAN (R 4.3.0)
#>  modeldata    * 1.2.0.9000 2023-12-21 [1] local
#>  modelenv       0.1.1      2023-03-08 [1] CRAN (R 4.3.0)
#>  munsell        0.5.0      2018-06-12 [1] CRAN (R 4.3.0)
#>  nnet           7.3-19     2023-05-03 [2] CRAN (R 4.3.2)
#>  parallelly     1.36.0     2023-05-26 [1] CRAN (R 4.3.0)
#>  parsnip      * 1.1.1.9007 2023-12-17 [1] Github (tidymodels/parsnip@8f13c1c)
#>  pillar         1.9.0      2023-03-22 [1] CRAN (R 4.3.0)
#>  pkgconfig      2.0.3      2019-09-22 [1] CRAN (R 4.3.0)
#>  prodlim      * 2023.08.28 2023-08-28 [1] CRAN (R 4.3.0)
#>  purrr        * 1.0.2      2023-08-10 [1] CRAN (R 4.3.0)
#>  R.cache        0.16.0     2022-07-21 [1] CRAN (R 4.3.0)
#>  R.methodsS3    1.8.2      2022-06-13 [1] CRAN (R 4.3.0)
#>  R.oo           1.25.0     2022-06-12 [1] CRAN (R 4.3.0)
#>  R.utils        2.12.3     2023-11-18 [1] CRAN (R 4.3.1)
#>  R6             2.5.1      2021-08-19 [1] CRAN (R 4.3.0)
#>  Rcpp           1.0.11     2023-07-06 [1] CRAN (R 4.3.0)
#>  recipes      * 1.0.9      2023-12-13 [1] CRAN (R 4.3.1)
#>  reprex         2.0.2      2022-08-17 [1] CRAN (R 4.3.0)
#>  rlang          1.1.2      2023-11-04 [1] CRAN (R 4.3.1)
#>  rmarkdown      2.25       2023-09-18 [1] CRAN (R 4.3.1)
#>  rpart          4.1.23     2023-12-05 [1] CRAN (R 4.3.1)
#>  rsample      * 1.2.0      2023-08-23 [1] CRAN (R 4.3.0)
#>  rstudioapi     0.15.0     2023-07-07 [1] CRAN (R 4.3.0)
#>  scales       * 1.3.0      2023-11-28 [1] CRAN (R 4.3.1)
#>  sessioninfo    1.2.2      2021-12-06 [1] CRAN (R 4.3.0)
#>  shape          1.4.6      2021-05-19 [1] CRAN (R 4.3.0)
#>  styler         1.10.2     2023-08-29 [1] CRAN (R 4.3.0)
#>  survival     * 3.5-7      2023-08-14 [2] CRAN (R 4.3.2)
#>  tibble       * 3.2.1      2023-03-20 [1] CRAN (R 4.3.0)
#>  tidymodels   * 1.1.1      2023-08-24 [1] CRAN (R 4.3.0)
#>  tidyr        * 1.3.0      2023-01-24 [1] CRAN (R 4.3.0)
#>  tidyselect     1.2.0      2022-10-10 [1] CRAN (R 4.3.0)
#>  timechange     0.2.0      2023-01-11 [1] CRAN (R 4.3.0)
#>  timeDate       4032.109   2023-12-14 [1] CRAN (R 4.3.1)
#>  tune         * 1.1.2.9008 2024-01-09 [1] Github (tidymodels/tune@1ccb2d7)
#>  utf8           1.2.4      2023-10-22 [1] CRAN (R 4.3.1)
#>  vctrs          0.6.5      2023-12-01 [1] CRAN (R 4.3.1)
#>  withr          2.5.2      2023-10-30 [1] CRAN (R 4.3.1)
#>  workflows    * 1.1.3      2023-02-22 [1] CRAN (R 4.3.0)
#>  workflowsets * 1.0.1      2023-04-06 [1] CRAN (R 4.3.0)
#>  xfun           0.41       2023-11-01 [1] CRAN (R 4.3.1)
#>  yaml           2.3.8      2023-12-11 [1] CRAN (R 4.3.1)
#>  yardstick    * 1.2.0.9002 2024-01-05 [1] Github (tidymodels/yardstick@53e1f1c)
#>
#>  [1] /Users/max/Library/R/arm64/4.3/library
#>  [2] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
```

topepo commented 10 months ago

Examples:

https://github.com/tidymodels/extratests/pull/158#discussion_r1443308352

https://github.com/tidymodels/extratests/pull/158#discussion_r1445080287

https://github.com/tidymodels/extratests/pull/158#discussion_r1445107097

topepo commented 10 months ago

We'll need to adjust these lines:

``` r
  # Not a metric that requires an eval_time
  no_time_req <- c("static_survival_metric", "integrated_survival_metric")
  if (mtr_info$class %in% no_time_req) {
    if (num_times > 0) {
      cli::cli_warn("Evaluation times are only required when dynmanic or
                     integrated metrics are selected as the primary metric
                     (and will be ignored).")
    }
    return(NULL)
  }
```
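One possible adjustment, sketched in base R under stated assumptions: keep the warning for static metrics only, and return `NULL` silently for integrated metrics. The function `first_eval_time_sketch()` is a hypothetical standalone stand-in, not tune's actual code; it uses `warning()` instead of `cli::cli_warn()` so that it runs without extra packages, and it assumes any other class is a dynamic metric.

``` r
# Hypothetical stand-in for the check quoted above; not tune's actual code.
first_eval_time_sketch <- function(metric_class, eval_time = NULL) {
  num_times <- length(unique(eval_time))

  if (metric_class == "static_survival_metric") {
    # Static metrics never use evaluation times, so warn if any were given.
    if (num_times > 0) {
      warning(
        "Evaluation times are only required when dynamic or integrated ",
        "metrics are selected as the primary metric (and will be ignored).",
        call. = FALSE
      )
    }
    return(NULL)
  }

  if (metric_class == "integrated_survival_metric") {
    # Integrated metrics use every evaluation time, so no single time is
    # selected and no warning is issued.
    return(NULL)
  }

  # Assumed to be a dynamic metric: the first time defines the objective.
  eval_time[1]
}

# With the reprex's inputs, the integrated case no longer warns.
first_eval_time_sketch("integrated_survival_metric", c(10, 1, 5, 15))
```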

topepo commented 10 months ago

Actually, I believe that we can remove this warning and keep `return(NULL)`. This function is called to decide what the first metric (and eval time) is so that it can be used for `select_best()` and when an objective function is needed (e.g. Bayesian optimization, racing, etc).

We do check to make sure that the original input argument has correctly specified times (if needed) in `check_eval_time_arg()` with:

``` r
  if (max_times_req > num_times) {
    cli::cli_abort("At least {max_times_req} evaluation time{?s} {?is/are}
                   required for the metric type(s) requested: {.val {uni_cls}}.
                   Only {num_times} unique time{?s} {?was/were} given.",
                   call = call)
  }
```
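For illustration, a self-contained base R sketch of that kind of check. The function name `check_eval_time_sketch()` and the `times_req` lookup are hypothetical; the minimum counts (0, 1, 2) follow the description at the top of this issue, and `"dynamic_survival_metric"` is an assumed class name. It uses `stop()` rather than `cli::cli_abort()` to avoid a package dependency, so the message formatting is simpler than the real one.

``` r
# Hypothetical, base R stand-in for the check quoted above.
check_eval_time_sketch <- function(metric_classes, eval_time = NULL) {
  # Minimum number of evaluation times per metric type (assumed mapping).
  times_req <- c(
    static_survival_metric = 0,
    dynamic_survival_metric = 1,
    integrated_survival_metric = 2
  )

  uni_cls <- unique(metric_classes)
  unknown <- setdiff(uni_cls, names(times_req))
  if (length(unknown) > 0) {
    stop("Unknown metric class: ", paste(unknown, collapse = ", "), call. = FALSE)
  }

  max_times_req <- max(times_req[uni_cls])
  num_times <- length(unique(eval_time))

  if (max_times_req > num_times) {
    stop(
      sprintf(
        "At least %d evaluation time(s) required for the metric type(s) requested: %s. Only %d unique time(s) given.",
        max_times_req, paste(uni_cls, collapse = ", "), num_times
      ),
      call. = FALSE
    )
  }

  invisible(eval_time)
}

# Passes: an integrated metric with four unique times.
check_eval_time_sketch("integrated_survival_metric", c(10, 1, 5, 15))

# Would error: an integrated metric needs at least two unique times.
# check_eval_time_sketch("integrated_survival_metric", 10)
```

Because this up-front check already rejects too few times, the later step that picks the first metric and eval time does not need to warn again.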


tidymodels / tune

excessive warning for integrated metrics using iterative search and racing methods #802