ewenharrison / finalfit

Quickly create elegant regression results tables and plots when modelling in R
https://finalfit.org/
Other
268 stars 29 forks source link

Restricted cubic splines in finalfit #63

Closed martinschlund closed 3 years ago

martinschlund commented 3 years ago

Hi

Once again, thank you for a great package! A question: is there a way to include splines like with the rcs() from the rms-package with finalfit? Below is an example of what I tried, the output from finalfit is not ideal though, so a suggestion for a workaround is much appreciated.

library(tidyverse)
library(rms)
library(survival)
library(riskRegression)
library(sjPlot)

data("Melanoma")

Melanoma <- Melanoma %>% 
  mutate(statusDSS = if_else(status == 1, 1, 0),#create disease-specific survival
         ageSpline = rcs(age) #create splines in age-variabel
         )

#do usual cox-regression with splines

coxAgeUni <- cph(Surv(time, statusDSS)~rcs(age), data = Melanoma)
coxSexUni <- cph(Surv(time, statusDSS)~sex, data = Melanoma)
coxMulti <- cph(Surv(time, statusDSS)~rcs(age)+sex, data = Melanoma)

#combine to tabel with tab_model from sjPlot: 
tab_model(coxAgeUni, coxSexUni, coxMulti) #this however does not do one column 
#for all coef from univariable analysis, which is annoying

#Try finalfit-way: 

#define explanatory and dependent

explanatory = c('ageSpline', 'sex')
dependent = 'Surv(time, statusDSS)'

#plug into finalfit:
Melanoma %>% 
  finalfit(dependent, explanatory) #the coefs here are the same as from the 'usual' regressions 
#so finalfit seems to do the right thing, the output is not great though...

Best regards, Martin

ewenharrison commented 3 years ago

Thanks Martin.

The short answer is that splines are not currently supported.

finalfit takes a summary table and joins it to a regression output. So it falls down when a summary table doesn't match the regression model output.

However, you could do this.

library(tidyverse)
library(finalfit)
library(riskRegression)
library(rms)

# Get data
data("Melanoma")

# Recode variables
Melanoma <- Melanoma %>% 
  mutate(statusDSS = if_else(status == 1, 1, 0),#create disease-specific survival
         age = ff_label(age, "Age (years)"),
         sex = ff_label(sex, "Sex")
  )

# Define regression parameters
explanatory = c("rcs(age)", 'sex')
dependent = 'Surv(time, statusDSS)'

# Uni
coxUni <- Melanoma %>% 
  coxphuni(dependent, explanatory) %>% 
  fit2df(estimate_suffix = " (univariable)")

# Multi
coxMulti <- Melanoma %>% 
  coxphmulti(dependent, explanatory) %>% 
  fit2df(estimate_suffix = " (multivariable)")

# Bring together
full_join(coxUni, coxMulti) %>% 
  dependent_label(Melanoma, dependent)

Joining, by = "explanatory"
    explanatory            HR (univariable)         HR (multivariable)
    rcs(age)age   0.98 (0.92-1.05, p=0.575)  0.99 (0.92-1.06, p=0.697)
   rcs(age)age'   1.12 (0.87-1.45, p=0.379)  1.09 (0.84-1.42, p=0.520)
  rcs(age)age''   0.47 (0.09-2.40, p=0.363)  0.57 (0.11-3.00, p=0.504)
 rcs(age)age''' 5.31 (0.23-121.49, p=0.296) 3.60 (0.15-86.72, p=0.429)
        sexMale   1.94 (1.15-3.26, p=0.013)  1.70 (0.99-2.90, p=0.052)

You may be happy to adjust this dataframe directly and export in the standard manner.

The following is convoluted but may give you some ideas of what you could do. You need to run it row by row and edit to apply it to your own data.

Melanoma %>% 
  summary_factorlist(dependent, explanatory, fit_id = TRUE) %>% # summary table
  slice(-n()) %>%                                               # remove the empty rcs term which last row
  add_row(
    label = c("age", "age'", "age''", "age'''"),                # add labels for the splines as you wish
    fit_id = coxUni$explanatory[1:4],                           # include the regression labels for the spline terms
    .after = 1
    ) %>% 
  mutate(index = 1:n()) %>%                                     # Adjust row ordering variable
  ff_merge(coxUni) %>% 
  ff_merge(coxMulti, last_merge = TRUE) %>% 
  mutate_at(vars(levels, all), ~ifelse(is.na(.), "", .)) %>%    # Hack to change missing values to empty cells. 
  dependent_label(Melanoma, dependent)                          # Add dependent variable as header if desired

 Dependent: Surv(time, statusDSS)                   all            HR (univariable)         HR (multivariable)
                      Age (years) Mean (SD) 52.5 (16.7)                           -                          -
                              age                         0.98 (0.92-1.05, p=0.575)  0.99 (0.92-1.06, p=0.697)
                             age'                         1.12 (0.87-1.45, p=0.379)  1.09 (0.84-1.42, p=0.520)
                            age''                         0.47 (0.09-2.40, p=0.363)  0.57 (0.11-3.00, p=0.504)
                           age'''                       5.31 (0.23-121.49, p=0.296) 3.60 (0.15-86.72, p=0.429)
                              Sex    Female  126 (61.5)                           -                          -
                                       Male   79 (38.5)   1.94 (1.15-3.26, p=0.013)  1.70 (0.99-2.90, p=0.052)
martinschlund commented 3 years ago

Thank you, that is great! Exactly what I was looking for.

Best regards

Martin