jwb133 / smcfcs

R package implementing Substantive Model Compatible Fully Conditional Specification Multiple Imputation
11 stars 7 forks source link

support for coxph with strata #16

Open vanAmsterdam opened 3 years ago

vanAmsterdam commented 3 years ago

Hi, thanks for making this code public

I need to work with a coxph model with stratified baseline hazard Due to the calculation of the log-hazard ratio inside smcfcs this goes wrong when strata are involved. The strata do not contribute to the log-hazard ratio, but the current code assumes that all terms do and throws a non-conformable arguments error in the matrix multiplication

I've come up with two solutions:

  1. write a function that detects strata arguments in the formula definition and throws them out at the right place
  2. use the built in predict.coxph method using type='lp'

I think option 2. is more general and I couldn't find a reason not to use this

library(smcfcs)
library(survival)
data1=data.frame(
  time = abs(rnorm(100)),
  event = rbinom(100, size=1, prob=0.5),
  x = rnorm(100),
  s = rbinom(100, size=1, prob=0.5)
)

coxph(Surv(time,event)~x+strata(s), data=data1)

data2 <- data1
data2[1:10, 'x'] <- NA

smcfcs(data2, smtype = 'coxph', smformula = 'Surv(time,event)~x+strata(s)',
       method=c('', '', 'norm', ''))

I've implemented this solution and created a PR

vanAmsterdam commented 3 years ago

BTW this is solution 1.

filter_strata <- function(smformula) {
  rhs <- gsub(x = smformula, pattern = ".*~", replacement = "")

  smformula_matrix <- as.formula(paste0("~ +", rhs))

  # Check if there is stratification - if so remove from model matrix
  # (stratification means different baseline hazards, coefficients still same)
  if (grepl(x = rhs, pattern = "strata")) {
    strata_var <- stringr::str_replace_all(rhs, pattern = ".*\\(|\\).*", replacement = "")
    rm_strata <- as.formula(paste0("~ . - strata(", strata_var, ")"))
    smformula_matrix <- update(smformula_matrix, rm_strata)
  }

  return(smformula_matrix)
}

Then at the 3 places where this is relevant:

smformula_matrix <- filter_strata(smformula)
outmodxb <-  model.matrix(as.formula(smformula_matrix),imputations[[imp]])
outmodxb <- as.matrix(outmodxb[,2:dim(outmodxb)[2]]) %*% as.matrix(outcomeModBeta)