tidymodels / yardstick

Tidy methods for measuring model performance
https://yardstick.tidymodels.org/

using concordance_survival with aorsf #475

Closed TSI-PTG closed 10 months ago

TSI-PTG commented 10 months ago

I'm experiencing a compatibility issue with aorsf and concordance_survival.

library(tidymodels)
library(censored)
#> Loading required package: survival

tidymodels_prefer()

data(cancer)

lung <- lung %>%
    drop_na() %>%
    tibble() %>%
    mutate(surv = Surv(time, status))

lung_train <- lung %>% vfold_cv(strata = status, v = 10)

mod_spec <- rand_forest(mtry = tune(), trees = tune(), min_n = tune()) %>%
    set_engine("aorsf") %>%
    set_mode("censored regression")

recipe_lung <- recipe(surv ~ ., data = lung) %>%
    update_role(c(inst, time, status), new_role = "ID")

workflow_lung <- workflow() %>%
    add_recipe(recipe_lung) %>%
    add_model(mod_spec)

params_finalized <- extract_parameter_set_dials(workflow_lung) %>%
    recipes::update(mtry = finalize(mtry(), lung %>% dplyr::select(-surv, -inst, -time, -status)))

tune_lung <- workflow_lung %>%
    tune_grid(
        lung_train,
        param_info = params_finalized,
        eval_time = 365,
        metrics = metric_set(concordance_survival)
    )
#> Warning: Evaluation times are only required when dynmanic or integrated metrics are used
#> (and will be ignored here).
#> → A | error:   No time prediction method available for this model.
#>                • Value for `type` should be one of: 'survival'
#> There were issues with some computations   A: x1
#> There were issues with some computations   A: x16
#> There were issues with some computations   A: x34
#> There were issues with some computations   A: x54
#> There were issues with some computations   A: x73
#> There were issues with some computations   A: x92
#> There were issues with some computations   A: x100
#> 
#> Warning: All models failed. Run `show_notes(.Last.tune.result)` for more
#> information.

Created on 2024-01-06 with reprex v2.0.2

EmilHvitfeldt commented 10 months ago

This is happening because concordance_survival() uses the signature

concordance_survival(
  data = ,
  truth = surv_obj,
  estimate = .pred_time
)

so it needs a predicted survival time (.pred_time) for each observation. Since the aorsf engine can only do predict(, type = "survival"), we are met with an error: https://github.com/tidymodels/censored/blob/6f38052b1cb5468fdbaf0bdea0042b6c81c00722/README.md?plain=1#L55
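
For illustration, here is a minimal sketch of the call shape described above. It assumes a yardstick version that ships the `lung_surv` example data (which already contains a `surv_obj` column and a `.pred_time` column); it does not use aorsf, and the name `c_index` is only for illustration.

```r
library(yardstick)

# lung_surv is example data bundled with yardstick (assumed available in the
# installed version). It holds the observed outcome as a Surv object
# (surv_obj) and a predicted survival time for each row (.pred_time).
c_index <- concordance_survival(
  data = lung_surv,
  truth = surv_obj,
  estimate = .pred_time
)

# A one-row tibble: .metric, .estimator, .estimate
c_index
```

The `estimate` column is the piece a model has to supply through a time prediction, which is what the error message above reports as unavailable for this engine.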

hfrick commented 10 months ago

FYI I've made a feature request for this to the maintainer of aorsf 🤞

EmilHvitfeldt commented 10 months ago

Thank you @hfrick! I'm closing this issue as it isn't a yardstick issue anymore

bcjaeger commented 9 months ago

Hi, @TSI-PTG,

Thanks for raising this! I am working on getting survival time prediction incorporated into aorsf. While I was reviewing this issue, I noticed you had specified eval_time = 365 in your call to tune_grid(). Since you are interested in a specific evaluation time, is it possible you intended to use roc_auc_survival instead of concordance_survival as your metric? I believe roc_auc_survival computes a time-dependent C-statistic (i.e., for predicted survival probability at eval_time = 365). On the other hand, concordance_survival computes Harrell's C-statistic, which does not depend on eval_time.

A reprex is below. I used dev versions of tune and yardstick to run it:

library(tidymodels)
library(censored)
#> Warning: package 'censored' was built under R version 4.3.2
#> Loading required package: survival

tidymodels_prefer()

data(cancer)

lung <- lung %>%
 drop_na() %>%
 tibble() %>%
 mutate(surv = Surv(time, status))

lung_train <- lung %>% vfold_cv(strata = status, v = 10)

mod_spec <- rand_forest(mtry = tune(), trees = tune(), min_n = tune()) %>%
 set_engine("aorsf") %>%
 set_mode("censored regression")

recipe_lung <- recipe(surv ~ ., data = lung) %>%
 update_role(c(inst, time, status), new_role = "ID")

workflow_lung <- workflow() %>%
 add_recipe(recipe_lung) %>%
 add_model(mod_spec)

params_finalized <- extract_parameter_set_dials(workflow_lung) %>%
 recipes::update(mtry = finalize(mtry(), lung %>% dplyr::select(-surv, -inst, -time, -status)))

tune_lung <- workflow_lung %>%
 tune_grid(
  lung_train,
  param_info = params_finalized,
  eval_time = 365,
  metrics = metric_set(roc_auc_survival)
 )

tune_lung
#> # Tuning results
#> # 10-fold cross-validation using stratification 
#> # A tibble: 10 × 4
#>    splits           id     .metrics          .notes          
#>    <list>           <chr>  <list>            <list>          
#>  1 <split [150/17]> Fold01 <tibble [10 × 8]> <tibble [0 × 3]>
#>  2 <split [150/17]> Fold02 <tibble [10 × 8]> <tibble [0 × 3]>
#>  3 <split [150/17]> Fold03 <tibble [10 × 8]> <tibble [0 × 3]>
#>  4 <split [150/17]> Fold04 <tibble [10 × 8]> <tibble [0 × 3]>
#>  5 <split [150/17]> Fold05 <tibble [10 × 8]> <tibble [0 × 3]>
#>  6 <split [150/17]> Fold06 <tibble [10 × 8]> <tibble [0 × 3]>
#>  7 <split [150/17]> Fold07 <tibble [10 × 8]> <tibble [0 × 3]>
#>  8 <split [151/16]> Fold08 <tibble [10 × 8]> <tibble [0 × 3]>
#>  9 <split [151/16]> Fold09 <tibble [10 × 8]> <tibble [0 × 3]>
#> 10 <split [151/16]> Fold10 <tibble [10 × 8]> <tibble [0 × 3]>

Created on 2024-01-14 with reprex v2.0.2
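
To make the difference between the two metrics concrete, the sketch below computes both on yardstick's bundled `lung_surv` example data rather than on the tuning results above. It assumes a yardstick version that includes `lung_surv` and `roc_auc_survival()`; the object names `harrell_c` and `auc_by_time` are only for illustration.

```r
library(yardstick)

# Static metric: uses the predicted survival time (.pred_time) and returns a
# single estimate with no evaluation time attached.
harrell_c <- concordance_survival(
  lung_surv,
  truth = surv_obj,
  estimate = .pred_time
)

# Dynamic metric: uses the list column .pred, which holds predicted survival
# probabilities at each evaluation time, and returns one row per .eval_time.
auc_by_time <- roc_auc_survival(
  lung_surv,
  truth = surv_obj,
  .pred
)

harrell_c
auc_by_time
```

Only the second result carries an `.eval_time` column, which is why `eval_time` is ignored when `concordance_survival` is the sole metric.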

TSI-PTG commented 9 months ago

Hi @bcjaeger,

Thank you for all your work on this! I was testing it with concordance_survival; the eval_time argument was just carried over from some earlier work of mine. Sorry for the confusion.

github-actions[bot] commented 9 months ago

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.