thbuerg opened this issue 4 years ago
Hi,
many thanks for implementing this easy-to-use and flexible package. I have a short question regarding the implementation of the partial likelihood function of DeepSurv (`cox_ph_loss_sorted`). In the paper there is a cumulative sum over the risk sets; in your implementation, however, this is approximated by taking the sum over all ranked samples. Could you comment on why the approximation in `cox_ph_loss_sorted` is legitimate? What is the reasoning behind it? This would help a lot! Thanks!
So, the main reason we did it this way was to follow the DeepSurv implementation. It is not really the partial log-likelihood, but an approximation that is simpler to implement. The approximation is exact as long as there are no tied observation times (ties are a general problem in Cox regression), and with few tied events it should be fine. When there are tied observations, only one of them gets the correct likelihood contribution. Although there is no formal theoretical justification for this approximation, the fact that we compute it over random subsets (the batches used for SGD) makes the ordering of the tied observations random, which should reduce the bias. In empirical studies we have also found that the approximation has little effect on predictions (though we have not really studied the estimated parameters). A formal study of this would be an interesting contribution to the literature on the Cox partial likelihood with tied event times.
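To make this concrete, here is a minimal sketch of the two computations (my own illustration, not the package's code; the function names and the O(n²) risk-set construction are assumptions for readability):

```python
import torch

def cox_ph_loss_sorted_sketch(log_h, events):
    """Approximate negative partial log-likelihood via a cumulative sum.
    Assumes inputs are sorted by *descending* duration, so the running
    logcumsumexp at position i covers everyone with duration >= t_i.
    With tied times, each tied observation gets a nested (arbitrary-order)
    risk set, so only one of them receives the exact contribution."""
    log_h = log_h.view(-1)
    events = events.view(-1).float()
    log_risk = torch.logcumsumexp(log_h, dim=0)
    return -((log_h - log_risk) * events).sum() / events.sum()

def cox_ph_loss_exact_sketch(log_h, durations, events):
    """Exact negative partial log-likelihood: every event's denominator
    sums over the full risk set {j : t_j >= t_i}, ties included.
    O(n^2) memory; for illustration only."""
    log_h = log_h.view(-1)
    durations = durations.view(-1)
    events = events.view(-1).float()
    # at_risk[i, j] is True when subject j is still at risk at t_i
    at_risk = durations.unsqueeze(0) >= durations.unsqueeze(1)
    scores = log_h.unsqueeze(0).expand(len(log_h), -1)
    scores = scores.masked_fill(~at_risk, float("-inf"))
    log_risk = torch.logsumexp(scores, dim=1)
    return -((log_h - log_risk) * events).sum() / events.sum()
```

Without ties the two losses agree exactly; with tied times they differ only in the denominators of the tied events, which is the approximation discussed above.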
Thank you for the question; I think it is an important one, and I hope the answer was satisfactory.
I think a good contribution to this package would be to add loss functions for the Cox partial log-likelihood with the Breslow and Efron approximations for tied events.
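For anyone picking this up, a readable sketch of what an Efron-corrected loss could look like (again my own illustration, not an existing function in the package; dropping the `(l / d) * tied_sum` term gives the Breslow version):

```python
import torch

def cox_ph_loss_efron_sketch(log_h, durations, events):
    """Negative Cox partial log-likelihood with Efron's correction for
    tied event times. Written for clarity, not speed (loops over the
    distinct event times)."""
    log_h = log_h.view(-1)
    durations = durations.view(-1)
    events = events.view(-1).bool()
    h = log_h.exp()
    loss = log_h.new_zeros(())
    for t in durations[events].unique():
        tied = events & (durations == t)      # the d events tied at time t
        d = int(tied.sum())
        risk_sum = h[durations >= t].sum()    # full risk set at time t
        tied_sum = h[tied].sum()
        # numerator: log-hazards of all tied events
        loss = loss - log_h[tied].sum()
        # Efron denominators: the l-th tied event sees the risk set
        # minus a fraction l/d of the tied events' mass
        for l in range(d):
            loss = loss + torch.log(risk_sum - (l / d) * tied_sum)
    return loss / events.sum()
```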
Thanks for your quick (and satisfactory) reply. Basically this means that as long as the rate of ties is low, everything should be fine. Should I reimplement the loss with the Breslow/Efron approximations, I will definitely let you know and/or open a PR.
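If it helps, a quick way to gauge that tie rate up front might be something like this (a hypothetical helper, not part of the package):

```python
import numpy as np

def event_tie_rate(durations, events):
    """Fraction of observed events that share their event time with
    at least one other event (hypothetical helper for illustration)."""
    t = np.asarray(durations)[np.asarray(events).astype(bool)]
    if t.size == 0:
        return 0.0
    _, counts = np.unique(t, return_counts=True)
    return counts[counts > 1].sum() / t.size
```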
Thanks!