CamDavidsonPilon / lifelines

Survival analysis in Python
lifelines.readthedocs.org
MIT License

Plot survival with time-varying covariates #530

Open arugola opened 5 years ago

arugola commented 5 years ago

When using CoxTimeVaryingFitter, is there a function for plotting the survival function (given the values of the covariates)? If not, what would be the easiest way to obtain it?

CamDavidsonPilon commented 5 years ago

Hi @arugola,

It's not clear what the survival function should be plotting, given time-varying (tv) covariates. Consider that the survival function measures points after some baseline measurement, so we have two options:

1) we know the covariates for the subject in the future, so we can modify the survival function - but if we know the covariates of the subject in the future, we know if they died or not!

2) we don't know the covariates in the future, so we don't modify the survival function, but this is incomplete because we know the covariates should change.
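For readers who want the two options in symbols, here is a short sketch in standard Cox-model notation. The symbols (baseline hazard h_0, coefficients β, covariate path x(·)) are notation introduced here, not taken from the lifelines source. The first line is the model, the second is option 1 (the whole covariate path is known), and the third is option 2 (covariates frozen at their baseline value). The second expression is usually read as a survival probability only when the covariates are external, that is, their path does not itself depend on the subject still being alive.

```latex
h(t \mid x(t)) = h_0(t)\,\exp\!\big(\beta^\top x(t)\big)

S(t \mid x(\cdot)) = \exp\!\Big(-\int_0^t h_0(s)\,\exp\!\big(\beta^\top x(s)\big)\,ds\Big)

S(t \mid x(0)) = \exp\!\Big(-\exp\!\big(\beta^\top x(0)\big)\int_0^t h_0(s)\,ds\Big)
```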

CamDavidsonPilon commented 5 years ago

Suffice it to say, there are fundamental problems with time-varying covariates and prediction.

arugola commented 5 years ago

Thanks for the prompt reply. What I meant to say is the following: let's say we have collected the data up to time t, where the covariates may change over time. Wouldn't it be possible to plot the survival curve for all subjects that are alive at time t?
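A minimal sketch of the conditioning step this question implies, assuming a cumulative hazard H(u) for the subject is already available on a time grid (which in turn requires some assumption about the covariates after time t). The function name and inputs are hypothetical and not part of lifelines. It uses the identity S(u | T > t) = S(u) / S(t) = exp(-(H(u) - H(t))).

```python
import numpy as np

def conditional_survival(times, cumulative_hazard, t_now):
    """Return (future_times, S(u | T > t_now)) for grid points u >= t_now.

    `times` and `cumulative_hazard` are hypothetical inputs: an increasing
    time grid and a subject's cumulative hazard H(u) evaluated on that grid.
    """
    times = np.asarray(times, dtype=float)
    cumulative_hazard = np.asarray(cumulative_hazard, dtype=float)
    # Hazard already accumulated by t_now (step-function lookup on the grid).
    idx = np.searchsorted(times, t_now, side="right") - 1
    h_now = cumulative_hazard[idx] if idx >= 0 else 0.0
    future = times >= t_now
    # S(u | T > t_now) = exp(-(H(u) - H(t_now)))
    return times[future], np.exp(-(cumulative_hazard[future] - h_now))

# Toy numbers for illustration only.
times = np.array([1.0, 2.0, 3.0, 4.0])
H = np.array([0.1, 0.3, 0.6, 1.0])
future_times, cond_surv = conditional_survival(times, H, t_now=2.0)
```

The conditional curve starts at 1 at t_now by construction, so curves for all subjects still alive at t can be drawn from a common origin.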

GCBallesteros commented 3 years ago

Hi,

I'm working on a problem where I would be able to predict future values of the covariates. Of course, I wouldn't know if the event has happened. In particular, I want to add weather data (outside temperature) as a covariate. I can predict this to a reasonable level of accuracy by just using the average values from previous years.

The question is then: can CoxTimeVaryingFitter be used for prediction if we somehow have access to an oracle that gives us future values of the time-dependent covariates?

Another use case would be to build survival curves based on hypothetical future scenarios.

I have tried the following but I'm pretty sure I'm doing something wrong. If somebody has any ideas on how to proceed I'll be forever grateful.

import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import interp1d

# Xs contains all the time-ordered observations for a subject, including the
# time-dependent covariate observations generated by the oracle at times `ts`.
# Flatten to a 1-D array so it multiplies elementwise with the baseline below
# (the return type of predict_partial_hazard may differ between lifelines versions).
partial_hazards = np.asarray(ctv.predict_partial_hazard(Xs)).ravel()

# Next we need baseline_hazard. A simple interpolant is used to
# get "in between" values if required.
#
# Also take np.diff because the model only contains the baseline_cumulative_hazard
# and we need the baseline_hazard.

# ad hoc cheat to make sure the baseline at time zero is equal to zero
aux_baseline = np.zeros(len(ctv.baseline_cumulative_hazard_))
aux_baseline[1:] = np.diff(ctv.baseline_cumulative_hazard_["baseline hazard"].to_numpy())

baseline_hazard = interp1d(
    ctv.baseline_cumulative_hazard_.index - 1,  # -1 because index starts at 1 and we want to force b(t=0) = 0
    aux_baseline,
    fill_value="extrapolate",
)

# Finally we can plot the survival function by integrating up to each time
# t via cumsum (this assumes `ts` is evenly spaced)
dt = ts[1] - ts[0]
plt.plot(
    ts,
    np.exp(-np.cumsum(baseline_hazard(ts) * partial_hazards) * dt),
)

edit: ad hoc idea to force b(t=0) = 0
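As a point of comparison, here is a minimal sketch of an alternative that skips the interpolation step. It treats the baseline cumulative hazard as a step function and accumulates each jump weighted by the partial hazard at that time: H(t) = Σ ΔH0(t_j) · exp(β · x(t_j)). The function `survival_under_scenario` and its arguments are hypothetical and not a lifelines API. With a fitted model, the inputs would presumably come from `ctv.baseline_cumulative_hazard_` and `ctv.params_`. The sketch assumes the baseline is expressed on the same covariate scale as x(t), which is worth checking for your lifelines version.

```python
import numpy as np

def survival_under_scenario(event_times, baseline_cum_hazard, beta, covariate_path):
    """Survival curve for one assumed covariate trajectory.

    event_times:         increasing times at which the baseline estimate jumps
    baseline_cum_hazard: baseline cumulative hazard H0 at those times
    beta:                fitted coefficients, shape (p,)
    covariate_path:      callable t -> covariate vector of length p
                         (the "oracle" or what-if scenario; hypothetical)
    """
    event_times = np.asarray(event_times, dtype=float)
    H0 = np.asarray(baseline_cum_hazard, dtype=float)
    beta = np.asarray(beta, dtype=float)
    # Jumps of the baseline cumulative hazard; the first jump is H0[0] itself.
    dH0 = np.diff(H0, prepend=0.0)
    # Relative risk exp(beta . x(t_j)) at each jump time.
    X = np.array([covariate_path(t) for t in event_times], dtype=float)
    partial_hazard = np.exp(X @ beta)
    # H(t) = sum over t_j <= t of dH0(t_j) * exp(beta . x(t_j))
    cum_hazard = np.cumsum(dH0 * partial_hazard)
    return np.exp(-cum_hazard)

# Toy what-if comparison with made-up numbers: one covariate (temperature).
event_times = [1.0, 2.0, 3.0]
H0 = [0.1, 0.2, 0.4]
beta = [0.05]
surv_cold = survival_under_scenario(event_times, H0, beta, lambda t: [0.0])
surv_warm = survival_under_scenario(event_times, H0, beta, lambda t: [20.0])
```

Because the estimated baseline only changes at observed event times, summing the jumps avoids choosing a `dt` and extrapolating the hazard. Passing a different `covariate_path` gives one curve per hypothetical scenario.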

physics39 commented 3 years ago

Hello, I have the same problem as you! I also want to get the survival probability from CoxTimeVaryingFitter. Could you please tell me how you solved the problem? Thank you very much.

SimplyOm commented 1 year ago

Hi, I have the same use case as @GCBallesteros. Did the above approach work for you, or what was your solution?

Although I understand @CamDavidsonPilon's point that using time-varying covariates for prediction leads to survivorship bias, there are still use cases where we want to run a what-if analysis on projected future values of the covariates. Is there any plan to include this in a future release, or is this use case beyond the scope of the library? Otherwise I would have to switch to the R survival package for this reason.