CamDavidsonPilon / lifelines

Survival analysis in Python
lifelines.readthedocs.org
MIT License

Option for cumulative failure #288

Closed: camdencheek closed this issue 7 years ago

camdencheek commented 7 years ago

When failure rates are low, it is often more useful to plot cumulative failure than the survival curve in a Kaplan-Meier plot. Although this is not the standard presentation, it has become common for outcomes such as arthroplasty revision rates, and it would be convenient to have that option.

CamDavidsonPilon commented 7 years ago

Hey @ccheek21, how do you define cumulative failure?

camdencheek commented 7 years ago

Sorry it took so long to respond; I didn't see the notification. Really, it is just the survival plot flipped upside down: one minus each of the data points, i.e. the cumulative distribution function of failure. I attempted to do it myself, and it sort of worked, but I keep running into a confidence interval issue. I'll see if I can get it working and then submit a pull request, but it should be a pretty simple option to add.
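To make the definition concrete, here is a minimal pure-Python sketch of a Kaplan-Meier estimate and its complement, F(t) = 1 - S(t). It is not lifelines code: the function names `kaplan_meier` and `cumulative_failure` and the toy data are hypothetical, chosen only for illustration.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(time, survival)] at each distinct event time (right-censored data)."""
    # Number of observed failures at each time.
    deaths = Counter(t for t, e in zip(durations, observed) if e)
    # Every subject (failed or censored) leaves the risk set at its own time.
    exits = Counter(durations)
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


def cumulative_failure(curve):
    """Flip a survival curve: F(t) = 1 - S(t)."""
    return [(t, 1.0 - s) for t, s in curve]


# Hypothetical toy data: observed=False marks a right-censored subject.
durations = [1, 2, 2, 3, 4, 5]
observed = [True, False, True, False, True, False]

km = kaplan_meier(durations, observed)
failure = cumulative_failure(km)

for (t, s), (_, f) in zip(km, failure):
    print(f"t={t}: survival={s:.3f}, cumulative failure={f:.3f}")
```

The two curves carry the same information; the flipped version simply uses the vertical axis better when most subjects never fail.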

camdencheek commented 7 years ago

Okay, so looking further into the code, I found that the cumulative_density_ estimate name is behind the left_censorship flag. However, I'm interested in plotting the cumulative density with right censorship. Is there a reason I'm missing for why that isn't an option?

CamDavidsonPilon commented 7 years ago

Does this achieve it?

1 - kmf.survival_function_

Of course, this won't give you the CI. It is left off as a property because 1) it's trivial to implement, 2) it's not needed as often as the survival function, and 3) it adds more to the API than necessary.
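On the CI point: because x -> 1 - x is a decreasing map, a pointwise interval on the survival function can be complemented in the same way, with the bounds trading places. A minimal sketch in plain Python, where the helper `flip_interval` and the numbers are hypothetical and not part of lifelines:

```python
def flip_interval(lower, upper):
    """Map a survival interval [lower, upper] to the cumulative-failure interval."""
    # 1 - upper is the smallest plausible failure probability, 1 - lower the largest.
    return 1.0 - upper, 1.0 - lower


# Hypothetical pointwise bounds on S(t) at three time points.
survival_ci = [(0.90, 0.99), (0.80, 0.95), (0.60, 0.85)]
failure_ci = [flip_interval(lo, hi) for lo, hi in survival_ci]

for (s_lo, s_hi), (f_lo, f_hi) in zip(survival_ci, failure_ci):
    print(f"S in [{s_lo:.2f}, {s_hi:.2f}] -> F in [{f_lo:.2f}, {f_hi:.2f}]")
```

If the bounds are only subtracted from one without being swapped, the values are still correct, but whatever was labelled "upper" now holds the lower bound and vice versa.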

camdencheek commented 7 years ago

Yes, that does what I was looking for. It took me some work to figure out how to preserve the plotting capabilities of lifelines rather than doing the plotting manually, but then I realized I could just do the operation on the dataframe itself. I've included the code for anyone looking for it in the future.


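# kmf is an already-fitted KaplanMeierFitter; flip its estimates in place so kmf.plot() draws cumulative failure.
kmf.survival_function_ = 1 - kmf.survival_function_
# After 1 - x, the interval columns labelled "upper" and "lower" hold the opposite bounds.
kmf.confidence_interval_ = 1 - kmf.confidence_interval_
kmf.plot()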