theislab / ehrapy

Electronic Health Record Analysis with Python.
https://ehrapy.readthedocs.io/
Apache License 2.0

Multivariate and regularized CoxPH models #744

Open mhaist94 opened 5 months ago

mhaist94 commented 5 months ago

Description of feature

Hi all,

thanks for taking the time to dig through my requests. Another important point (at least for clinicians; I will get to some more biology-focused features later) would be the implementation of multivariate CoxPH models, which take into account multiple variables that might affect your time-to-event endpoint. One such implementation is described in the survivalAnalysis package in R (analyse_multivariate function, see https://cran.r-project.org/web/packages/survivalAnalysis/vignettes/multivariate.html), or again in the survminer package in R (here it's a coxph function, if I recall correctly). This approach is limited in terms of model fitting accuracy, though, and thus should not be employed with more than 5 variables at a time.

To fit more parameters into a multivariate model (which is rarely done), one usually employs regularized CoxPH models, for example LASSO-regularized models, which are implemented in the glmnet package in R (see https://glmnet.stanford.edu/articles/glmnet.html).

This is not a must-have, but given that the utility of ehrapy lies particularly in the dissection of big heterogeneous datasets, regularized models might be an idea worth considering.
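
To make the request a bit more concrete, here is a minimal NumPy sketch of what a multivariate CoxPH fit with an optional LASSO penalty could look like. It is an illustration only, not ehrapy's API and not a port of glmnet: the function names (`cox_loss_and_grad`, `fit_penalized_cox`), the simulated data, the step size, and the use of plain proximal gradient descent are all assumptions made for this example. In practice one would more likely wrap an existing implementation, such as `CoxPHFitter` in lifelines (which takes `penalizer` and `l1_ratio` arguments) or `CoxnetSurvivalAnalysis` in scikit-survival.

```python
import numpy as np


def cox_loss_and_grad(beta, X, event, at_risk):
    """Mean negative Cox partial log-likelihood (Breslow ties) and its gradient."""
    n = X.shape[0]
    eta = X @ beta
    shift = eta.max()                      # subtract max for numerical stability
    w = np.exp(eta - shift)
    risk_sum = at_risk @ w                 # sum of weights over each risk set
    risk_mean_x = (at_risk @ (w[:, None] * X)) / risk_sum[:, None]
    loss = -np.sum(event * (eta - shift - np.log(risk_sum))) / n
    grad = -np.sum(event[:, None] * (X - risk_mean_x), axis=0) / n
    return loss, grad


def fit_penalized_cox(X, time, event, lam=0.0, step=0.1, n_iter=2000):
    """Proximal gradient descent; lam=0 gives the unpenalized multivariate fit."""
    # at_risk[i, j] is 1 when subject j is still at risk at subject i's time.
    at_risk = (time[None, :] >= time[:, None]).astype(float)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        _, grad = cox_loss_and_grad(beta, X, event, at_risk)
        z = beta - step * grad
        # Soft-thresholding is the proximal operator of the L1 (LASSO) penalty.
        beta = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return beta


# Hypothetical simulated cohort: only the first two covariates affect the hazard.
rng = np.random.default_rng(0)
n, p = 300, 5
X = rng.standard_normal((n, p))
true_beta = np.array([1.0, -1.0, 0.0, 0.0, 0.0])
t_event = rng.exponential(1.0 / np.exp(X @ true_beta))
t_cens = rng.exponential(2.0, size=n)
time = np.minimum(t_event, t_cens)
event = (t_event <= t_cens).astype(float)  # 1 = event observed, 0 = censored

beta_unpenalized = fit_penalized_cox(X, time, event, lam=0.0)
beta_lasso = fit_penalized_cox(X, time, event, lam=0.05)
print("unpenalized:", np.round(beta_unpenalized, 3))
print("LASSO:      ", np.round(beta_lasso, 3))
```

With `lam=0.0` this reduces to an ordinary multivariate Cox fit; increasing `lam` shrinks the coefficients and eventually sets some of them exactly to zero, which is the property that makes LASSO attractive when many candidate variables are available. The penalty strength used here is arbitrary; glmnet-style workflows usually choose it by cross-validation.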