lucharo closed this issue 3 years ago
This is one of those "I'll answer it tomorrow" questions that I then forgot. The synopsis:
Let's have a conversation. therneau@mayo.edu
A few more details.
Penalized models often have a lot of variables, in fact 100s of variables is one of the common motivations. Any code for general penalized should expect this case. It is also true that the design matrix X will be normally be sparse. This means that sparse matrix computation should be built in from the start, it is critical for decent speed.
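As a toy illustration of the point (my sketch, using the Matrix package; nothing here reflects the survival/coxme internals):

library(Matrix)
n <- 10000; p <- 500
# design matrix with ~2% nonzero entries, in compressed sparse form
X <- rsparsematrix(n, p, density = 0.02)
beta <- rnorm(p)
# the linear predictor costs O(nonzeros) instead of O(n * p)
eta <- as.numeric(X %*% beta)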
For a penalized Cox model, there are efficient ways to get the linear predictor X beta, the log partial likelihood (LPL), and the first derivative of the LPL using sparse methods. But not the second derivative; that one is a bear.
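For reference, the standard Breslow-form expressions make the asymmetry visible (textbook formulas, not taken from the package source):

\[
U(\beta) = \sum_{i:\,\delta_i = 1} \bigl( x_i - \bar{x}(t_i) \bigr),
\qquad
\bar{x}(t_i) = \frac{\sum_{j \in R(t_i)} e^{x_j^\top \beta}\, x_j}
                    {\sum_{j \in R(t_i)} e^{x_j^\top \beta}}
\]
\[
\mathcal{I}(\beta) = \sum_{i:\,\delta_i = 1} \left(
  \frac{\sum_{j \in R(t_i)} e^{x_j^\top \beta}\, x_j x_j^\top}
       {\sum_{j \in R(t_i)} e^{x_j^\top \beta}}
  - \bar{x}(t_i)\, \bar{x}(t_i)^\top \right)
\]

Each term of the score is a sparse matrix-vector operation, but the outer products x_j x_j^T in the information matrix fill in, so the Hessian is dense even when X is not.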
The current code has a subtle method for fast second derivatives, but it only works for one particular case (a random intercept per group). That leads to somewhat schizophrenic code: for some problems coxme is blazingly fast, for most it is deathly slow. The trick will not extend to complex models.
One promising path forward is to replace Newton-Raphson iteration with a method that only needs first derivatives. My vote would be Hamiltonian MCMC. Cox models have a very quadratic loglik function and I think it would work very well. Where can we borrow/steal a well worked out Hamiltonian library? Contrary to popular belief, you don't need Bayesian priors for Hamiltonian MCMC.
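To make the "first derivatives only" point concrete, here is a toy leapfrog step for Hamiltonian MC in R (a sketch on a stand-in quadratic loglik, not a proposal for the package code):

# stand-ins for the Cox partial likelihood and its gradient
loglik <- function(beta) -0.5 * sum(beta^2)
grad   <- function(beta) -beta

# one HMC transition: leapfrog integration uses only grad(), never a Hessian
hmc_step <- function(beta, eps = 0.1, L = 20) {
  p0 <- rnorm(length(beta))               # fresh Gaussian momentum
  b <- beta; p <- p0
  p <- p + eps / 2 * grad(b)              # half step for momentum
  for (l in seq_len(L)) {
    b <- b + eps * p                      # full step for position
    if (l < L) p <- p + eps * grad(b)     # full step for momentum
  }
  p <- p + eps / 2 * grad(b)              # final half step
  # Metropolis accept/reject keeps the draws exact
  logacc <- (loglik(b) - 0.5 * sum(p^2)) - (loglik(beta) - 0.5 * sum(p0^2))
  if (log(runif(1)) < logacc) b else beta
}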
Thank you for your message @therneau! I'll be in touch by e-mail.
On this topic, may I ask why there is an implementation of ridge regularisation but not lasso? I have tried modifying the ridge function by swapping out (and similarly for the case with scale)
list(penalty = sum(coef^2) * theta / 2,
     first   = theta * coef,
     second  = theta,
     flag    = FALSE)
for
# take the sign of each coefficient as the subgradient of |coef|
sign_coef <- sign(coef)
sign_coef[sign_coef == 0] <- 1
# assign a huge second derivative at exactly zero, to pin those
# coefficients in place
delta_coef <- 1e10 * as.numeric(coef == 0)
list(penalty = sum(abs(coef)) * theta,
     first   = theta * sign_coef,
     second  = delta_coef,
     flag    = FALSE)
but this does not seem to push some of the coefficients towards 0, and certainly not as effectively as the glmnet package. I am, however, limited when using glmnet, as I would like to cluster individuals, which is possible in the survival package.
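For reference, the built-in penalty that these snippets modify is invoked like this (standard survival usage; lung is an example dataset shipped with the package):

library(survival)
# ridge() marks terms to be penalised; theta is the fixed penalty weight
fit <- coxph(Surv(time, status) ~ ridge(age, ph.ecog, theta = 1),
             data = lung)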
The code for penalized models in coxph depends completely on Newton-Raphson maximization, and that approach simply does not work for the lasso. Again, coxme is the place to add this, but I have another year's iteration ahead on the multi-state material in coxph. Only then will I again turn my attention to penalized code.
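For readers wondering how glmnet gets away with it: it avoids Newton-Raphson on the penalty entirely, using coordinate descent with soft-thresholding, which handles the kink of |beta| at zero exactly. A sketch of that operator (mine, not the glmnet source):

# soft-thresholding: the proximal operator of lambda * |beta|
soft <- function(z, lambda) sign(z) * pmax(abs(z) - lambda, 0)
# anything in [-lambda, lambda] is mapped exactly to zero, which a
# Newton step on a subgradient can never achieve
soft(c(-2, -0.5, 0.3, 1.7), lambda = 1)   # -1.0  0.0  0.0  0.7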
TL;DR
I would be up for taking on the challenge of enabling Cox regularisation and submitting a pull request for this feature, though I may need some help with interpretation of the C code (I have experience in C and C++, but I would need someone to tell me where to look to implement regularisation).
Long version
In one of my latest projects, I wanted to use Cox regression with regularisation penalties (LASSO, Ridge or Elastic Net). In my particular application, I wanted to model survival given a start and a stop vector inside the Surv object; because of this I could not use the cv.glmnet function from the glmnet package (which only accepts Surv objects with a single time vector). Unfortunately, I could not add a regularisation penalty with the survival package and had to resort to the penalized package. That package gave me most of the functionality of the survival package, but its implementation was sometimes different: e.g. when modelling the effect of categorical variables and their interactions with time, the estimates would differ between the penalized and the survival coxph() implementations.
In the end, I performed cross-validation over different values of the regularisation parameter lambda with the penalized package, then used the coefficients estimated for the optimal choice of lambda to train a model with the coxph function, and finally evaluated performance with the concordance metric. This was good enough for my task but very inefficient (I find coxph to be one of the fastest implementations of Cox regression and definitely the most versatile). The only thing missing for me is the ability to add regularisation penalties and perhaps a cross-validation function.
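For anyone reading along, the start/stop setup I mean looks schematically like this (mydata and the column names are placeholders, not a real dataset):

library(survival)
# counting-process form of Surv, plus per-individual clustering
fit <- coxph(Surv(tstart, tstop, status) ~ x1 + x2 + cluster(id),
             data = mydata)
concordance(fit)   # the performance metric mentioned above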
I would be up for taking on the challenge of enabling Cox regularisation and submitting a pull request for this feature, though I may need some help with interpretation of the C code (I have experience in C and C++, but I would need someone to tell me where to look to implement regularisation).