larmarange / broom.helpers

A set of functions to facilitate manipulation of tibbles produced by broom
https://larmarange.github.io/broom.helpers/
GNU General Public License v3.0

Support for survival::cch model? #242

Closed jalavery closed 7 months ago

jalavery commented 7 months ago

Is it possible to add support for a survival::cch model?

Please see below for an example model and attempt to tidy it:

library(survival)
#> Warning: package 'survival' was built under R version 4.1.3
library(gtsummary)
#> Warning: package 'gtsummary' was built under R version 4.1.3

# case-cohort model using the survival::cch function
# example from survival::cch()

## The complete Wilms Tumor Data 
## (Breslow and Chatterjee, Applied Statistics, 1999)
## subcohort selected by simple random sampling.
##

subcoh <- nwtco$in.subcohort
selccoh <- with(nwtco, rel==1|subcoh==1)
ccoh.data <- nwtco[selccoh,]
ccoh.data$subcohort <- subcoh[selccoh]
## central-lab histology 
ccoh.data$histol <- factor(ccoh.data$histol,labels=c("FH","UH"))
## tumour stage
ccoh.data$stage <- factor(ccoh.data$stage,labels=c("I","II","III","IV"))
ccoh.data$age <- ccoh.data$age/12 # Age in years

##
## Standard case-cohort analysis: simple random subcohort 
##

fit.ccP <- cch(Surv(edrel, rel) ~ stage + histol + age, data =ccoh.data,
               subcoh = ~subcohort, id=~seqno, cohort.size=4028)

fit.ccP
#> Case-cohort analysis,x$method, Prentice 
#>  with subcohort of 668 from cohort of 4028 
#> 
#> Call: cch(formula = Surv(edrel, rel) ~ stage + histol + age, data = ccoh.data, 
#>     subcoh = ~subcohort, id = ~seqno, cohort.size = 4028)
#> 
#> Coefficients:
#>               Value         SE        Z            p
#> stageII  0.73457084 0.16849620 4.359569 1.303187e-05
#> stageIII 0.59708356 0.17345094 3.442377 5.766257e-04
#> stageIV  1.38413197 0.20481982 6.757803 1.400990e-11
#> histolUH 1.49806307 0.15970515 9.380180 0.000000e+00
#> age      0.04326787 0.02373086 1.823274 6.826184e-02

# tidy model output: success
broom::tidy(fit.ccP)
#> # A tibble: 5 x 7
#>   term     estimate std.error statistic  p.value conf.low conf.high
#>   <chr>       <dbl>     <dbl>     <dbl>    <dbl>    <dbl>     <dbl>
#> 1 stageII    0.735     0.168       4.36 1.30e- 5  0.404      1.06  
#> 2 stageIII   0.597     0.173       3.44 5.77e- 4  0.257      0.937 
#> 3 stageIV    1.38      0.205       6.76 1.40e-11  0.983      1.79  
#> 4 histolUH   1.50      0.160       9.38 0         1.19       1.81  
#> 5 age        0.0433    0.0237      1.82 6.83e- 2 -0.00324    0.0898

# this fails
broom.helpers::tidy_plus_plus(fit.ccP)
#> Warning: The `exponentiate` argument is not supported in the `tidy()` method
#> for `cch` objects and will be ignored.
#> ! `broom::tidy()` failed to tidy the model.
#> Warning: Some model terms could not be found in model data.
#>   You probably need to load the data into the environment.
#> v `tidy_parameters()` used instead.
#> i Add `tidy_fun = broom.helpers::tidy_parameters` to quiet these messages.
#> # A tibble: 0 x 19
#> # ... with 19 variables: term <chr>, variable <chr>, var_label <chr>,
#> #   var_class <chr>, var_type <chr>, var_nlevels <int>, contrasts <chr>,
#> #   contrasts_type <chr>, reference_row <lgl>, label <chr>, estimate <dbl>,
#> #   std.error <dbl>, conf.level <dbl>, conf.low <dbl>, conf.high <dbl>,
#> #   statistic <dbl>, df.error <dbl>, p.value <dbl>, n <dbl>

# originally discovered issue trying to summarize model via tbl_regression
gtsummary::tbl_regression(fit.ccP)
#> Warning: The `exponentiate` argument is not supported in the `tidy()` method
#> for `cch` objects and will be ignored.
#> ! `broom::tidy()` failed to tidy the model.
#> Warning: Some model terms could not be found in model data.
#>   You probably need to load the data into the environment.
#> v `tidy_parameters()` used instead.
#> i Add `tidy_fun = broom.helpers::tidy_parameters` to quiet these messages.
#> Error in if (result$label %in% c("Beta", "exp(Beta)")) {: argument is of length zero

Created on 2024-01-23 with reprex v2.0.2
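
The final `argument is of length zero` error looks like a downstream symptom of the zero-row tibble returned above: a column taken from an empty table has length zero, and `if ()` cannot branch on a zero-length condition. A minimal base-R sketch of that mechanism, where `result` is a hypothetical stand-in and not gtsummary's internal object:

```r
# Hypothetical stand-in for a tidier result that came back with zero rows
result <- data.frame(label = character(0), stringsAsFactors = FALSE)

# %in% applied to a zero-length vector returns logical(0)
cond <- result$label %in% c("Beta", "exp(Beta)")
length(cond)

# if () requires a length-one condition, so evaluating it raises an error;
# tryCatch() captures the message instead of stopping the script
msg <- tryCatch(
  if (cond) "matched" else "not matched",
  error = function(e) conditionMessage(e)
)
msg
```

If this reading is right, the error should disappear once the model is tidied into a non-empty tibble.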

Session info

``` r
sessioninfo::session_info()
#> - Session info ---------------------------------------------------------------
#>  setting  value
#>  version  R version 4.1.2 (2021-11-01)
#>  os       Windows 10 x64 (build 19045)
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language (EN)
#>  collate  English_United States.1252
#>  ctype    English_United States.1252
#>  tz       America/New_York
#>  date     2024-01-23
#>  pandoc   3.1.1 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#>
#> - Packages -------------------------------------------------------------------
#>  package       * version  date (UTC) lib source
#>  backports       1.4.1    2021-12-13 [1] CRAN (R 4.1.2)
#>  bayestestR      0.13.0   2022-09-18 [1] CRAN (R 4.1.3)
#>  broom           1.0.3    2023-01-25 [1] CRAN (R 4.1.3)
#>  broom.helpers   1.12.0   2023-02-09 [1] CRAN (R 4.1.3)
#>  cli             3.4.1    2022-09-23 [1] CRAN (R 4.1.3)
#>  coda            0.19-4   2020-09-30 [1] CRAN (R 4.0.3)
#>  codetools       0.2-18   2020-11-04 [1] CRAN (R 4.1.2)
#>  colorspace      2.1-0    2023-01-23 [1] CRAN (R 4.1.3)
#>  datawizard      0.6.4    2022-11-19 [1] CRAN (R 4.1.3)
#>  digest          0.6.30   2022-10-18 [1] CRAN (R 4.1.3)
#>  dplyr           1.1.0    2023-01-29 [1] CRAN (R 4.1.3)
#>  effectsize      0.8.2    2022-10-31 [1] CRAN (R 4.1.3)
#>  ellipsis        0.3.2    2021-04-29 [1] CRAN (R 4.0.5)
#>  emmeans         1.8.2    2022-10-27 [1] CRAN (R 4.1.3)
#>  estimability    1.4.1    2022-08-05 [1] CRAN (R 4.1.3)
#>  evaluate        0.20     2023-01-17 [1] CRAN (R 4.1.3)
#>  fansi           1.0.4    2023-01-22 [1] CRAN (R 4.1.3)
#>  fastmap         1.1.0    2021-01-25 [1] CRAN (R 4.0.5)
#>  forcats         1.0.0    2023-01-29 [1] CRAN (R 4.1.3)
#>  fs              1.5.2    2021-12-08 [1] CRAN (R 4.1.3)
#>  generics        0.1.3    2022-07-05 [1] CRAN (R 4.1.3)
#>  ggplot2         3.4.1    2023-02-10 [1] CRAN (R 4.1.3)
#>  glue            1.6.2    2022-02-24 [1] CRAN (R 4.1.2)
#>  gt              0.8.0    2022-11-16 [1] CRAN (R 4.1.3)
#>  gtable          0.3.1    2022-09-01 [1] CRAN (R 4.1.3)
#>  gtsummary     * 1.7.0    2023-01-13 [1] CRAN (R 4.1.3)
#>  haven           2.5.2    2023-02-28 [1] CRAN (R 4.1.3)
#>  hms             1.1.2    2022-08-19 [1] CRAN (R 4.1.3)
#>  htmltools       0.5.3    2022-07-18 [1] CRAN (R 4.1.3)
#>  insight         0.18.8   2022-11-24 [1] CRAN (R 4.1.3)
#>  knitr           1.42     2023-01-25 [1] CRAN (R 4.1.3)
#>  labelled        2.10.0   2022-09-14 [1] CRAN (R 4.1.3)
#>  lattice         0.20-45  2021-09-22 [1] CRAN (R 4.1.2)
#>  lifecycle       1.0.3    2022-10-07 [1] CRAN (R 4.1.3)
#>  magrittr        2.0.3    2022-03-30 [1] CRAN (R 4.1.3)
#>  MASS            7.3-58.1 2022-08-03 [1] CRAN (R 4.1.3)
#>  Matrix          1.5-3    2022-11-11 [1] CRAN (R 4.1.3)
#>  multcomp        1.4-20   2022-08-07 [1] CRAN (R 4.1.3)
#>  munsell         0.5.0    2018-06-12 [1] CRAN (R 4.0.0)
#>  mvtnorm         1.1-3    2021-10-08 [1] CRAN (R 4.1.1)
#>  parameters      0.20.0   2022-11-21 [1] CRAN (R 4.1.3)
#>  pillar          1.8.1    2022-08-19 [1] CRAN (R 4.1.3)
#>  pkgconfig       2.0.3    2019-09-22 [1] CRAN (R 4.0.0)
#>  purrr           1.0.1    2023-01-10 [1] CRAN (R 4.1.3)
#>  R.cache         0.16.0   2022-07-21 [1] CRAN (R 4.1.3)
#>  R.methodsS3     1.8.2    2022-06-13 [1] CRAN (R 4.1.3)
#>  R.oo            1.25.0   2022-06-12 [1] CRAN (R 4.1.3)
#>  R.utils         2.12.2   2022-11-11 [1] CRAN (R 4.1.3)
#>  R6              2.5.1    2021-08-19 [1] CRAN (R 4.1.1)
#>  reprex          2.0.2    2022-08-17 [1] CRAN (R 4.1.3)
#>  rlang           1.1.1    2023-04-28 [1] CRAN (R 4.1.2)
#>  rmarkdown       2.23     2023-07-01 [1] CRAN (R 4.1.2)
#>  rstudioapi      0.14     2022-08-22 [1] CRAN (R 4.1.3)
#>  sandwich        3.0-2    2022-06-15 [1] CRAN (R 4.1.3)
#>  scales          1.2.1    2022-08-20 [1] CRAN (R 4.1.3)
#>  sessioninfo     1.2.2    2021-12-06 [1] CRAN (R 4.1.3)
#>  stringi         1.7.6    2021-11-29 [1] CRAN (R 4.1.2)
#>  stringr         1.5.0    2022-12-02 [1] CRAN (R 4.1.3)
#>  styler          1.8.1    2022-11-07 [1] CRAN (R 4.1.3)
#>  survival      * 3.4-0    2022-08-09 [1] CRAN (R 4.1.3)
#>  TH.data         1.1-1    2022-04-26 [1] CRAN (R 4.1.3)
#>  tibble          3.1.8    2022-07-22 [1] CRAN (R 4.1.3)
#>  tidyr           1.3.0    2023-01-24 [1] CRAN (R 4.1.3)
#>  tidyselect      1.2.0    2022-10-10 [1] CRAN (R 4.1.2)
#>  utf8            1.2.3    2023-01-31 [1] CRAN (R 4.1.3)
#>  vctrs           0.6.3    2023-06-14 [1] CRAN (R 4.1.2)
#>  withr           2.5.0    2022-03-03 [1] CRAN (R 4.1.2)
#>  xfun            0.37     2023-01-31 [1] CRAN (R 4.1.3)
#>  xtable          1.8-4    2019-04-21 [1] CRAN (R 4.0.0)
#>  yaml            2.3.7    2023-01-23 [1] CRAN (R 4.1.3)
#>  zoo             1.8-11   2022-09-17 [1] CRAN (R 4.1.3)
#>
#>  [1] C:/Program Files/R/R-4.1.2/library
#>
#> ------------------------------------------------------------------------------
```

larmarange commented 7 months ago

Experimental support has been added in PR #243.

Please note that you need to specify exponentiate = TRUE.

jalavery commented 7 months ago

Fantastic, thank you for such a quick update @larmarange!

jalavery commented 7 months ago

I re-ran the test example, and it looks like even with exponentiate = TRUE, the table still shows the un-exponentiated values. Would you mind looking into this when you have a chance? Thank you!

library(survival)
#> Warning: package 'survival' was built under R version 4.1.3
library(gtsummary)
#> Warning: package 'gtsummary' was built under R version 4.1.3

# case-cohort model using the survival::cch function
# example from survival::cch()

## The complete Wilms Tumor Data
## (Breslow and Chatterjee, Applied Statistics, 1999)
## subcohort selected by simple random sampling.
##

subcoh <- nwtco$in.subcohort
selccoh <- with(nwtco, rel==1|subcoh==1)
ccoh.data <- nwtco[selccoh,]
ccoh.data$subcohort <- subcoh[selccoh]
## central-lab histology
ccoh.data$histol <- factor(ccoh.data$histol,labels=c("FH","UH"))
## tumour stage
ccoh.data$stage <- factor(ccoh.data$stage,labels=c("I","II","III","IV"))
ccoh.data$age <- ccoh.data$age/12 # Age in years

##
## Standard case-cohort analysis: simple random subcohort
##

fit.ccP <- cch(Surv(edrel, rel) ~ stage + histol + age, data =ccoh.data,
               subcoh = ~subcohort, id=~seqno, cohort.size=4028)

summary(fit.ccP)
#> Case-cohort analysis,x$method, Prentice 
#>  with subcohort of 668 from cohort of 4028 
#> 
#> Call: cch(formula = Surv(edrel, rel) ~ stage + histol + age, data = ccoh.data, 
#>     subcoh = ~subcohort, id = ~seqno, cohort.size = 4028)
#> 
#> Coefficients:
#>           Coef    HR  (95%   CI)     p
#> stageII  0.735 2.085 1.498 2.900 0.000
#> stageIII 0.597 1.817 1.293 2.552 0.001
#> stageIV  1.384 3.991 2.672 5.963 0.000
#> histolUH 1.498 4.473 3.271 6.117 0.000
#> age      0.043 1.044 0.997 1.094 0.068

# tidy model output: success
broom::tidy(fit.ccP)
#> # A tibble: 5 x 7
#>   term     estimate std.error statistic  p.value conf.low conf.high
#>   <chr>       <dbl>     <dbl>     <dbl>    <dbl>    <dbl>     <dbl>
#> 1 stageII    0.735     0.168       4.36 1.30e- 5  0.404      1.06  
#> 2 stageIII   0.597     0.173       3.44 5.77e- 4  0.257      0.937 
#> 3 stageIV    1.38      0.205       6.76 1.40e-11  0.983      1.79  
#> 4 histolUH   1.50      0.160       9.38 0         1.19       1.81  
#> 5 age        0.0433    0.0237      1.82 6.83e- 2 -0.00324    0.0898

# this now runs
broom.helpers::tidy_plus_plus(fit.ccP, exponentiate = TRUE) %>% 
  select(term, label, estimate)
#> # A tibble: 7 x 3
#>   term     label estimate
#>   <chr>    <chr>    <dbl>
#> 1 stageI   I       1     
#> 2 stageII  II      0.735 
#> 3 stageIII III     0.597 
#> 4 stageIV  IV      1.38  
#> 5 histolFH FH      1     
#> 6 histolUH UH      1.50  
#> 7 age      age     0.0433

# however, the log(HR) is displayed as the HR; the exponentiation does not seem to be applied
# HR for stage II should be exp(0.735) = 2.085 based on summary(fit.ccP)
# gtsummary::tbl_regression(fit.ccP, exponentiate = TRUE)

Created on 2024-01-24 with reprex v2.0.2
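
The expected hazard ratios can be cross-checked by hand from the coefficients and standard errors printed earlier in this thread. The sketch below uses only base R and copies those printed values into a data frame (`tidy_log` and `tidy_hr` are illustrative names, and a Wald-type normal interval is assumed); it is not how broom.helpers computes them internally.

```r
# Log hazard ratios and standard errors copied from the printed cch output
tidy_log <- data.frame(
  term      = c("stageII", "stageIII", "stageIV", "histolUH", "age"),
  estimate  = c(0.73457084, 0.59708356, 1.38413197, 1.49806307, 0.04326787),
  std.error = c(0.16849620, 0.17345094, 0.20481982, 0.15970515, 0.02373086),
  stringsAsFactors = FALSE
)

# Wald 95% interval on the log scale, then exponentiate to the HR scale
z <- qnorm(0.975)
tidy_hr <- data.frame(
  term      = tidy_log$term,
  hr        = exp(tidy_log$estimate),
  conf.low  = exp(tidy_log$estimate - z * tidy_log$std.error),
  conf.high = exp(tidy_log$estimate + z * tidy_log$std.error),
  stringsAsFactors = FALSE
)

# Rounded values for comparison with the HR and CI columns of summary(fit.ccP)
hr_rounded <- round(tidy_hr[, c("hr", "conf.low", "conf.high")], 3)
print(cbind(term = tidy_hr$term, hr_rounded))
```

A correctly exponentiated `tidy_plus_plus()` result should show estimates close to the `hr` column here, not to the raw `estimate` values.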

Session info

``` r
sessioninfo::session_info()
#> - Session info ---------------------------------------------------------------
#>  setting  value
#>  version  R version 4.1.2 (2021-11-01)
#>  os       Windows 10 x64 (build 19045)
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language (EN)
#>  collate  English_United States.1252
#>  ctype    English_United States.1252
#>  tz       America/New_York
#>  date     2024-01-24
#>  pandoc   3.1.1 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#>
#> - Packages -------------------------------------------------------------------
#>  package       * version     date (UTC) lib source
#>  backports       1.4.1       2021-12-13 [1] CRAN (R 4.1.2)
#>  broom           1.0.5       2023-06-09 [1] CRAN (R 4.1.2)
#>  broom.helpers   1.14.0.9000 2024-01-24 [1] local
#>  cli             3.4.1       2022-09-23 [1] CRAN (R 4.1.3)
#>  colorspace      2.1-0       2023-01-23 [1] CRAN (R 4.1.3)
#>  digest          0.6.30      2022-10-18 [1] CRAN (R 4.1.3)
#>  dplyr           1.1.0       2023-01-29 [1] CRAN (R 4.1.3)
#>  ellipsis        0.3.2       2021-04-29 [1] CRAN (R 4.0.5)
#>  evaluate        0.20        2023-01-17 [1] CRAN (R 4.1.3)
#>  fansi           1.0.4       2023-01-22 [1] CRAN (R 4.1.3)
#>  fastmap         1.1.0       2021-01-25 [1] CRAN (R 4.0.5)
#>  forcats         1.0.0       2023-01-29 [1] CRAN (R 4.1.3)
#>  fs              1.5.2       2021-12-08 [1] CRAN (R 4.1.3)
#>  generics        0.1.3       2022-07-05 [1] CRAN (R 4.1.3)
#>  ggplot2         3.4.1       2023-02-10 [1] CRAN (R 4.1.3)
#>  glue            1.6.2       2022-02-24 [1] CRAN (R 4.1.2)
#>  gt              0.8.0       2022-11-16 [1] CRAN (R 4.1.3)
#>  gtable          0.3.1       2022-09-01 [1] CRAN (R 4.1.3)
#>  gtsummary     * 1.7.0       2023-01-13 [1] CRAN (R 4.1.3)
#>  haven           2.5.2       2023-02-28 [1] CRAN (R 4.1.3)
#>  hms             1.1.2       2022-08-19 [1] CRAN (R 4.1.3)
#>  htmltools       0.5.3       2022-07-18 [1] CRAN (R 4.1.3)
#>  knitr           1.42        2023-01-25 [1] CRAN (R 4.1.3)
#>  labelled        2.10.0      2022-09-14 [1] CRAN (R 4.1.3)
#>  lattice         0.20-45     2021-09-22 [1] CRAN (R 4.1.2)
#>  lifecycle       1.0.3       2022-10-07 [1] CRAN (R 4.1.3)
#>  magrittr        2.0.3       2022-03-30 [1] CRAN (R 4.1.3)
#>  Matrix          1.5-3       2022-11-11 [1] CRAN (R 4.1.3)
#>  munsell         0.5.0       2018-06-12 [1] CRAN (R 4.0.0)
#>  pillar          1.8.1       2022-08-19 [1] CRAN (R 4.1.3)
#>  pkgconfig       2.0.3       2019-09-22 [1] CRAN (R 4.0.0)
#>  purrr           1.0.1       2023-01-10 [1] CRAN (R 4.1.3)
#>  R.cache         0.16.0      2022-07-21 [1] CRAN (R 4.1.3)
#>  R.methodsS3     1.8.2       2022-06-13 [1] CRAN (R 4.1.3)
#>  R.oo            1.25.0      2022-06-12 [1] CRAN (R 4.1.3)
#>  R.utils         2.12.2      2022-11-11 [1] CRAN (R 4.1.3)
#>  R6              2.5.1       2021-08-19 [1] CRAN (R 4.1.1)
#>  reprex          2.0.2       2022-08-17 [1] CRAN (R 4.1.3)
#>  rlang           1.1.1       2023-04-28 [1] CRAN (R 4.1.2)
#>  rmarkdown       2.23        2023-07-01 [1] CRAN (R 4.1.2)
#>  rstudioapi      0.14        2022-08-22 [1] CRAN (R 4.1.3)
#>  scales          1.2.1       2022-08-20 [1] CRAN (R 4.1.3)
#>  sessioninfo     1.2.2       2021-12-06 [1] CRAN (R 4.1.3)
#>  stringi         1.7.6       2021-11-29 [1] CRAN (R 4.1.2)
#>  stringr         1.5.0       2022-12-02 [1] CRAN (R 4.1.3)
#>  styler          1.8.1       2022-11-07 [1] CRAN (R 4.1.3)
#>  survival      * 3.4-0       2022-08-09 [1] CRAN (R 4.1.3)
#>  tibble          3.1.8       2022-07-22 [1] CRAN (R 4.1.3)
#>  tidyr           1.3.0       2023-01-24 [1] CRAN (R 4.1.3)
#>  tidyselect      1.2.0       2022-10-10 [1] CRAN (R 4.1.2)
#>  utf8            1.2.3       2023-01-31 [1] CRAN (R 4.1.3)
#>  vctrs           0.6.3       2023-06-14 [1] CRAN (R 4.1.2)
#>  withr           2.5.0       2022-03-03 [1] CRAN (R 4.1.2)
#>  xfun            0.37        2023-01-31 [1] CRAN (R 4.1.3)
#>  yaml            2.3.7       2023-01-23 [1] CRAN (R 4.1.3)
#>
#>  [1] C:/Program Files/R/R-4.1.2/library
#>
#> ------------------------------------------------------------------------------
```

larmarange commented 7 months ago

Sorry, there was an error in the implementation. Could you check the latest version? The exponentiate argument is now optional.

jalavery commented 7 months ago

Success! Thank you very much.