therneau / survival

Survival package for R
390 stars 106 forks source link

Incompatibility between predict.coxph and Hmisc::fit.mult.impute #179

Closed JackoBill closed 2 years ago

JackoBill commented 2 years ago

I'm working with an incomplete survival data which I impute using Hmisc::aregImpute and then combine inferences using Hmisc::fit.mult.impute. The latter one returns object of class coxph which raised my hopes of comfortable use of predict.coxph function but seems not to be the case: Error in predict.coxph(fit_spline, newdata, se = T) : Data is not the same size as it was in the original fit occurs.

# Artificial data

set.seed(432908)
time<-(rnorm(n=100,mean=10,sd=4))^2
status<-1*rbinom(n=100,size=1,prob=time/max(time))
bmi<-runif(n=100,min=18+3*status,max=40+3*status)
age<-runif(n=100,min=30-5*status,max=80-5*status)
hospital<-c(rep("bad hospital",50),rep("good hospital",50))

bmi[1:30]<-NA # Some missing values
age[20:40]<-NA

data<-data.frame(time,status,bmi,age,hospital)

library(Hmisc)
# library(survival)

# Imputation

imp<-aregImpute(~time+status+bmi+age+hospital,data=data,n.impute=5,nk=3,x=T)

# Combining inferences + complete-case analysis

fit_imp<-fit.mult.impute(Surv(time,status)~age+pspline(bmi,df=3.5)+strata(hospital),coxph,imp,data=data)
fit_cc<-coxph(Surv(time,status)~age+pspline(bmi,df=3.5)+strata(hospital),data=data)

newd<-data.frame(age=50,bmi=30,hospital="bad hospital")

predict(fit_imp,newd,se=T) # Not working
# Error in predict.coxph(fit_spline, newdata, se = T) : 
#   Data is not the same size as it was in the original fit

predict(fit_cc,newd,se=T) # Working as usual

class(fit_imp)
# [1] "fit.mult.impute" "coxph.penal"     "coxph"

class(fit_cc)
# [1] "coxph.penal" "coxph"

# Non-matching objects

setdiff(names(fit_cc),names(fit_imp))
# [1] "na.action"
setdiff(names(fit_imp),names(fit_cc))
# [1] "variance.inflation.impute" "missingInfo" "dfm"

My ultimate aim is to plot hazard ratio against BMI similar way like here and termplot gave me this error. However, the problem has nothing to do with splines (one could have bmi as itself in the model and error would occur anyway).

I guess this isn't a real failure of predict.coxph (why would two random functions work together nicely?) but I hope there is something to be done.

JackoBill commented 2 years ago

I read some more and a similar problem seem to concern survival curves: https://stats.stackexchange.com/questions/323310/survival-curve-for-cox-regression-of-multiple-imputated-data. It says that 'basically this can't be done directly because the multiply imputed fit is different from a single fit as presented in a normal Cox model'.

So, maybe I'm asking for a solution that does not exist. I had already considered making a grid of BMIs, thinking every point as a parameter and then combining them. I guess I will pursue that direction.

therneau commented 2 years ago
  1. This is not a question about the survival package, but about Hmisc. I would suggest email to the Hmisc author, not me; and email rather than a github issue.
  2. I don't know Hmisc code in detail, and have neither interest nor time to dig deep.
  3. Last, if you want an answer, then take the time to create readable code --- use you space bar and return key!