jkropko / coxed

Duration-Based Quantities of Interest and Simulation Methods for the Cox Proportional Hazards Model

sim.survdata() with censoring #9

Open jjharden opened 3 years ago

jjharden commented 3 years ago

From a user:

When I generate data with your sim.survdata function, why does the estimated survival function (Cox) change once I introduce censoring? I believe that the true values, i.e. the survival function, should be unaffected by censoring. I have set up a small example below where I only use a binary regressor, time-constant coefficients and a constant baseline hazard. I have estimated the coefficients using a Cox model and then predicted the survival function for X=0.

rm(list = ls())
library(coxed)     # sim.survdata()
library(survival)  # Surv(), coxph(), survfit()
set.seed(12345)

# Constant baseline hazard, one binary covariate, time-constant coefficient
hazfun <- function(t){0.01}
T <- 1000
n <- 10000
x <- rbinom(n, size=1, prob=0.5)
b <- .5

# Same design twice: without censoring and with censor=0.3
data <- sim.survdata(T=T, N=n, type="none", beta=b, X=as.data.frame(x), num.data.frames=1, censor=0, hazard.fun=hazfun)
data_c <- sim.survdata(T=T, N=n, type="none", beta=b, X=as.data.frame(x), num.data.frames=1, censor=0.3, hazard.fun=hazfun)

# Fit a Cox model to each data set and predict the survival function at X=0
predx <- as.data.frame(t(c(0)))
cox1 <- coxph(Surv(y, failed) ~ x, data=data$data)
cox2 <- coxph(Surv(y, failed) ~ x, data=data_c$data)
names(predx) <- names(cox1$coef)
cox_fit1 <- survfit(cox1, newdata=predx)
cox_fit2 <- survfit(cox2, newdata=predx)

# Compare the simulated baseline survivor functions with the Cox estimates
plot(data$baseline$time, data$baseline$survivor, type="l", col="black")
lines(data_c$baseline$time, data_c$baseline$survivor, col="blue")
lines(cox_fit1$time, cox_fit1$surv, col="red")
lines(cox_fit2$time, cox_fit2$surv, col="orange")
legend("topright", legend=c("Base SF: No C", "Base SF: C", "SF: No C (X=0)", "SF: C (X=0)"), col=c("black", "blue", "red", "orange"), lty=1, lwd=2)
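For reference, the expectation in the question can be written out. This is only a sketch under assumptions not confirmed in the thread: time is treated as continuous, and hazfun is taken literally as the baseline hazard h_0(t) = 0.01 with beta = 0.5 (sim.survdata may discretize or rescale the hazard internally). Under those assumptions the censoring mechanism does not appear anywhere in the true survival function:

```latex
\begin{align*}
h(t \mid x) &= h_0(t)\, e^{\beta x} = 0.01\, e^{0.5 x} \\
S(t \mid x) &= \exp\!\left( -\int_0^t h(u \mid x)\, du \right)
             = \exp\!\left( -0.01\, t\, e^{0.5 x} \right) \\
S(t \mid x = 0) &= \exp(-0.01\, t)
\end{align*}
```

So for X=0 the target curve would be exp(-0.01 t) in both the censored and uncensored runs, provided censoring is independent of the event time.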

jjharden commented 3 years ago

My initial thought is that the difference you see is due to more observations surviving (by virtue of being randomly selected for censoring) in data_c compared to data. The two curves are the same shape, but the one with censoring is shifted up, indicating a higher rate of survival. All of the censored cases survive through the whole observation period.
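One way to separate the two possibilities is a check that does not involve sim.survdata at all. The sketch below is an assumption-laden stand-in, not the package's method: it draws event times directly with rexp() using the same hazard (0.01) and coefficient (0.5) as the example, and it applies independent exponential censoring at a hypothetical rate of 0.005, which censors roughly 30% of cases. It says nothing about how coxed itself assigns censoring. If the Cox estimate of S(t | X=0) is close to the truth in both fits here, the shift in the original plot would point to how censored cases are constructed in the simulated data rather than to censoring as such.

```r
library(survival)

set.seed(12345)
n  <- 10000
x  <- rbinom(n, size = 1, prob = 0.5)
b  <- 0.5
h0 <- 0.01  # constant baseline hazard, same value as hazfun in the example

# Event times from an exponential model with hazard h0 * exp(b * x)
t_event <- rexp(n, rate = h0 * exp(b * x))

# Independent censoring times (assumption: exponential, unrelated to x)
t_cens <- rexp(n, rate = 0.005)

# Uncensored data, and data where we observe min(event, censoring)
d_full <- data.frame(y = t_event, failed = 1, x = x)
d_cens <- data.frame(y = pmin(t_event, t_cens),
                     failed = as.numeric(t_event <= t_cens),
                     x = x)

fit_full <- coxph(Surv(y, failed) ~ x, data = d_full)
fit_cens <- coxph(Surv(y, failed) ~ x, data = d_cens)

sf_full <- survfit(fit_full, newdata = data.frame(x = 0))
sf_cens <- survfit(fit_cens, newdata = data.frame(x = 0))

# Estimated S(t | x = 0) at a single time point (helper name is illustrative)
s_at <- function(sf, t) as.numeric(summary(sf, times = t)$surv)

# Both estimates at t = 50 next to the true value exp(-h0 * 50)
print(c(full = s_at(sf_full, 50),
        censored = s_at(sf_cens, 50),
        truth = exp(-h0 * 50)))
```

The same comparison can be repeated at other time points, or plotted with lines(sf_cens$time, sf_cens$surv) against curve(exp(-0.01 * x), add = TRUE).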