Jesse-Islam closed 5 years ago
The following is my attempt at coding a Breslow estimator that can handle any number of variables. It is still buggy, judging by the output.
library(survival)
library(casebase)
#> See example usage at http://sahirbhatnagar.com/casebase/
library(glmnet)
#> Warning: package 'glmnet' was built under R version 3.6.1
#> Loading required package: Matrix
#> Loading required package: foreach
#> Warning: package 'foreach' was built under R version 3.6.1
#> Loaded glmnet 2.0-18
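# Process for creating a manual absolute risk function
data <- ERSPC
u <- Surv(time = data$Follow.Up.Time, event = data$DeadOfPrCa)
xa <- as.data.frame(data[, 1])
# Two pure-noise covariates, only there to exercise more than one variable
xa$normal <- rnorm(nrow(xa), mean = 0, sd = 6)
xa$wernormal <- rnorm(nrow(xa), mean = 5, sd = 5)
x <- as.matrix(xa)
coxfit <- cv.glmnet(x = x, y = u, family = "cox", alpha = 0)
coxcoef <- coef(coxfit)
# Dense conversion keeps zero coefficients; coxcoef@x would silently drop them
betaHat <- as.matrix(coxcoef)[, 1]
tab <- data.frame(table(data[data$DeadOfPrCa == 1, "Follow.Up.Time"]))
y <- as.numeric(levels(tab[, 1]))[tab[, 1]] # ordered distinct event times
d <- tab[, 2] # number of events at each event time
h0 <- rep(NA, length(y))
for (l in seq_along(y)) {
  # Breslow increment: events at y[l] over the summed risk scores of those still at risk
  h0[l] <- d[l] / sum(exp(x[data$Follow.Up.Time >= y[l], , drop = FALSE] %*% betaHat))
}
plot(y, h0, type = "l") # hazard increments against event time, not index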
Created on 2019-08-11 by the reprex package (v0.3.0)
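One way to check a hand-rolled Breslow estimator is to run it on an unpenalized Cox fit and compare the cumulative sum against survival::basehaz. The following is only a sketch under some assumptions: `breslow_increments` is a hypothetical helper name, the survival::lung data stands in for ERSPC, and the fit uses ties = "breslow" so that both sides use the same tie handling.

```r
library(survival)

# Hypothetical helper: Breslow baseline hazard increments for a given
# coefficient vector `beta`, covariate matrix `x`, and right-censored outcome.
breslow_increments <- function(time, status, x, beta) {
  risk <- exp(drop(x %*% beta))                    # exp(x_i' beta) per subject
  event_times <- sort(unique(time[status == 1]))   # ordered distinct event times
  d <- vapply(event_times,
              function(t) sum(status == 1 & time == t), numeric(1))
  at_risk <- vapply(event_times,
                    function(t) sum(risk[time >= t]), numeric(1))
  data.frame(time = event_times, hazard = d / at_risk)
}

# Unpenalized Cox fit on survival::lung (status == 2 means death)
dat <- na.omit(lung[, c("time", "status", "age", "ph.ecog")])
dat$event <- as.numeric(dat$status == 2)
fit <- coxph(Surv(time, event) ~ age + ph.ecog, data = dat, ties = "breslow")

x <- as.matrix(dat[, c("age", "ph.ecog")])
h0 <- breslow_increments(dat$time, dat$event, x, coef(fit))
H0 <- cumsum(h0$hazard)                            # cumulative baseline hazard

# Reference: baseline hazard at covariates equal to zero
ref <- basehaz(fit, centered = FALSE)
ref_at_events <- ref$hazard[match(h0$time, ref$time)]
all.equal(H0, ref_at_events)
```

If the two agree here, the same helper can be fed the glmnet coefficients instead of coef(fit), which is what the loop above is trying to do.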
I think we can use the survival::survfit function to get the survival function as follows:
library(survival)
library(glmnet)
#> Loading required package: Matrix
#> Loading required package: foreach
#> Loaded glmnet 2.0-18
load(paste0(find.package("glmnet"), "/data/CoxExample.RData"))
fit <- cv.glmnet(x, y, family = "cox")
nonzero_covariate <- predict(fit, type = "nonzero", s = "lambda.1se")
nonzero_coef <- coef(fit, s = "lambda.1se")
# Fit Cox with selected covariates
fit_cox <- coxph(Surv(time, status) ~ .,
                 data = as.data.frame(cbind(y, x[, nonzero_covariate$X1])))
# Change coefficients from fit_cox to those of fit
fit_cox2 <- fit_cox
fit_cox2$coefficients <- nonzero_coef@x
# Plot survival functions
plot(survfit(fit_cox))
lines(survfit(fit_cox2), col = 'blue',
      conf.int = FALSE)
Created on 2019-08-13 by the reprex package (v0.3.0)
Essentially, we fit coxnet, then get the selected covariates and their corresponding coefficient estimates. Using only those covariates, we fit a Cox model and then replace its estimated coefficients with their glmnet estimates. In the graph above, the Cox regression curve is in black (with its 95% confidence interval) and the Coxnet curve is in blue.
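However, we can't really plot a confidence interval for the Coxnet curve, because fit_cox2 still carries the covariance matrix of the unpenalized Cox fit, which is the "wrong" one for the glmnet coefficients.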
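For comparison, the Coxnet curve can also be computed without going through a coxph object at all, by plugging the glmnet coefficients into the Breslow estimator and using S(t | x) = exp(-H0(t) * exp(x' beta)). This is a rough sketch, not a tested replacement for the survfit approach: the data are simulated purely for illustration (`true_beta`, `x_new` and the other variable names are made up here), and it gives a point estimate only, with no confidence interval.

```r
library(survival)
library(glmnet)

# Simulated right-censored data, for illustration only
set.seed(1)
n <- 300
p <- 10
x <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("V", 1:p)))
true_beta <- c(1, -1, rep(0, p - 2))
event_time <- rexp(n, rate = exp(drop(x %*% true_beta)))
cens_time <- rexp(n, rate = 0.5)
y <- cbind(time = pmin(event_time, cens_time),
           status = as.numeric(event_time <= cens_time))

fit <- cv.glmnet(x, y, family = "cox")
# Dense coefficient vector: zeros are kept, so it lines up with the columns of x
beta <- as.matrix(coef(fit, s = "lambda.1se"))[, 1]

# Breslow cumulative baseline hazard using the penalized coefficients
risk <- exp(drop(x %*% beta))
event_times <- sort(unique(y[y[, "status"] == 1, "time"]))
d <- vapply(event_times,
            function(t) sum(y[, "time"] == t & y[, "status"] == 1), numeric(1))
at_risk <- vapply(event_times,
                  function(t) sum(risk[y[, "time"] >= t]), numeric(1))
H0 <- cumsum(d / at_risk)

# Survival curve for one covariate profile (here: the column means of x)
x_new <- colMeans(x)
surv <- exp(-H0 * exp(sum(x_new * beta)))
plot(event_times, surv, type = "s", xlab = "Time", ylab = "Survival probability")
```

The absolute risk at each event time is then 1 - surv. Because nothing here depends on a covariance matrix, the question of which one is "right" does not come up, but getting an interval would need something extra, such as a bootstrap.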
The following is where I am at in trying to get a direct comparison out of the survival package. I can use glmnet on a Surv object, effectively fitting the hazard with glmnet; getting an absolute risk out of it, however, is a less obvious task. A line of #'s marks each of the two main points of interest; the rest is just the setup they need.
Created on 2019-08-11 by the reprex package (v0.3.0)