Closed: rjjanse closed this issue 4 months ago
Frankly, I've never found a use for predict(fit, type='surv'). I should perhaps remove that option, as you are not the first person to be confused. BTW, in your data set subject 127 is censored on day 92; the two types of prediction do not agree for this subject either.
When predicting survival probabilities using predict() on a fitted Cox model, the predicted survival probabilities for individuals with the event of interest are wrong.
Individual linear predictors are all correct. Given that the baseline hazard cannot differ between individuals, it is curious that the predicted risks are correct for individuals without the event but incorrect for individuals with the event.
The details section of predict.coxph states that the survival probability is calculated not from the linear predictor but from the expected number of events: The survival probability for a subject is equal to exp(-expected).
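The relationship quoted from the help page can be checked directly. This is a minimal sketch, assuming the `lung` data set shipped with {survival} and an age-only model; the data and model actually used for this report are not shown here, so both are assumptions.

```r
library(survival)

# Assumed stand-in model: age-only Cox model on the built-in lung data
fit <- coxph(Surv(time, status) ~ age, data = lung)

# Expected number of events for each subject, given covariates and
# that subject's own follow-up time
expected <- predict(fit, type = "expected")

# Survival probability as returned by predict.coxph
surv <- predict(fit, type = "survival")

# Per the help page, the two should be related by surv = exp(-expected)
identity_holds <- isTRUE(all.equal(surv, exp(-expected)))
print(identity_holds)
```

Because `expected` is evaluated at each subject's own follow-up time, the resulting survival probability is also tied to that time rather than to a common horizon.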
I have an example based on a simple model with only age. I check the predictions manually and with the {riskRegression} package:
The first 10 rows of the comparison are as follows:
Here, predict is the predict.coxph method from {survival}, predictCox comes from {riskRegression}, and manual represents my calculations by hand. The linear predictor in lp_predict comes from predict.coxph again and lp_manual was calculated by hand.
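For reference, the by-hand columns could be computed along the following lines. This is a hedged sketch, not the code used for the table: it assumes the `lung` data, an age-only model, and a step function built from `basehaz()`. The names `H0_at` and `comparison` are illustrative, and the {riskRegression} column is left out.

```r
library(survival)

# Assumed stand-in model: age-only Cox model on the built-in lung data
fit <- coxph(Surv(time, status) ~ age, data = lung)

# Linear predictor by hand: coefficient times the mean-centred covariate
lp_manual  <- unname(coef(fit)["age"]) * (lung$age - mean(lung$age))
lp_predict <- unname(predict(fit, type = "lp"))

# Baseline cumulative hazard at the mean covariate value, as a
# right-continuous step function of time
bh    <- basehaz(fit, centered = TRUE)
H0_at <- stepfun(bh$time, c(0, bh$hazard))

# Survival at each subject's own follow-up time: exp(-H0(t_i) * exp(lp_i))
manual <- exp(-H0_at(lung$time) * exp(lp_manual))

comparison <- data.frame(
  time       = lung$time,
  status     = lung$status,
  predict    = unname(predict(fit, type = "survival")),
  manual     = manual,
  lp_predict = lp_predict,
  lp_manual  = lp_manual
)
head(comparison, 10)
```

The sketch makes no claim about whether `predict` and `manual` agree row by row; that is what the comparison is meant to reveal.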
We can see that predict does not match predictCox and manual when status == "dead", but they are all equal while status == "censored". The linear predictor is correct regardless of the event.

For information, my sessionInfo():