derrynknife / SurPyval

A Python package for survival analysis. The most flexible survival analysis package available. SurPyval can work with arbitrary combinations of observed, censored, and truncated data. SurPyval can also fit distributions with 'offsets' with ease, for example the three parameter Weibull distribution.
https://surpyval.readthedocs.io/en/latest/index.html
MIT License
actuarial-science churn-prediction non-parametric parametric-distribution parametric-methods probability-plot reliability reliability-analysis reliability-engineering risk-analysis survival-analysis weibull

SurPyval - Survival Analysis in Python

Yet another Python survival analysis tool.

This is another pure-Python survival analysis tool, so why was it needed? The intent of this package is to mimic the scipy API as closely as possible, with a simple .fit() method for any type of distribution (parametric or non-parametric); other survival analysis packages don't completely mimic that API. Further, there is currently (at the time of writing) no other package that can take an arbitrary combination of observed, censored, and truncated data. Finally, SurPyval is unique in that it can be used with multiple parametric estimation methods. This allows an analyst to estimate the parameters of a distribution with another method if one method fails. The parametric methods available are Maximum Likelihood Estimation (MLE), Probability Plotting (MPP), Mean Square Error (MSE), Method of Moments (MOM), and Maximum Product of Spacing (MPS). For each type of estimator, SurPyval can take the following types of input data:

Method              Para/Non-Para   Observed  Censored    Truncated
MLE                 Parametric      Yes       Yes         Yes
MPP                 Parametric      Yes       Yes         Limited
MSE                 Parametric      Yes       Yes         Limited
MOM                 Parametric      Yes       No          No
MPS                 Parametric      Yes       Yes         No
Kaplan-Meier        Non-Parametric  Yes       Right only  Left only
Nelson-Aalen        Non-Parametric  Yes       Right only  Left only
Fleming-Harrington  Non-Parametric  Yes       Right only  Left only
Turnbull            Non-Parametric  Yes       Yes         Yes
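
To make the "Right only" entries in the table concrete, here is a minimal pure-Python sketch of a Kaplan-Meier estimate from observed and right-censored data. It is not SurPyval's implementation. The function name kaplan_meier is hypothetical, and the flag encoding (c = 0 for observed, c = 1 for right censored, n for counts) is an assumption about the input convention, so check it against the documentation before relying on it.

```python
from collections import defaultdict


def kaplan_meier(x, c=None, n=None):
    """Return (times, survival) for observed and right-censored data.

    x: values; c: 0 = observed, 1 = right censored (assumed encoding);
    n: number of items at each x. Defaults: all observed, one item each.
    """
    c = [0] * len(x) if c is None else c
    n = [1] * len(x) if n is None else n

    deaths = defaultdict(int)   # failures observed at each time
    removed = defaultdict(int)  # failures plus censorings leaving the risk set
    for xi, ci, ni in zip(x, c, n):
        if ci not in (0, 1):
            raise ValueError("sketch handles observed and right-censored data only")
        if ci == 0:
            deaths[xi] += ni
        removed[xi] += ni

    at_risk = sum(n)
    survival = 1.0
    times, estimates = [], []
    for t in sorted(removed):
        if deaths[t] > 0:
            # Product-limit step: multiply by the fraction surviving time t
            survival *= 1.0 - deaths[t] / at_risk
            times.append(t)
            estimates.append(survival)
        at_risk -= removed[t]
    return times, estimates


# Hypothetical data: items at 2 and 5 were still running when observation stopped
times, estimates = kaplan_meier(x=[1, 2, 3, 4, 5], c=[0, 1, 0, 0, 1])
print(times, estimates)
```

Censored items never produce a step in the curve, but they do shrink the risk set, which is why the steps after a censoring time are larger than they would be with fully observed data.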

SurPyval also offers many different distributions, and because of the flexible implementation, adding new distributions is easy. Further, the power of SurPyval lies in its robust parameter estimation; as such, distributions that are supported on the half real line can be offset to make a three- or four-parameter version. The currently available distributions are:

Distribution           Offsetable
Weibull                Yes
Normal                 No
LogNormal              Yes
Gamma                  Yes
Beta                   No
Uniform                No
Exponential            Yes
Exponentiated Weibull  Yes
Gumbel                 No
Logistic               No
LogLogistic            Yes
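
As an illustration of what an offset means, the sketch below writes out the survival function of a three-parameter Weibull distribution in plain Python. The parameter names alpha (scale), beta (shape), and gamma (offset) and the function name weibull_sf are assumptions made for this example, not a statement of SurPyval's internals.

```python
import math


def weibull_sf(x, alpha, beta, gamma=0.0):
    """Survival function of a Weibull distribution shifted right by gamma."""
    if x <= gamma:
        return 1.0  # no failures can occur at or before the offset
    return math.exp(-(((x - gamma) / alpha) ** beta))


# Hypothetical parameters chosen only for illustration
alpha, beta, gamma = 10.0, 2.0, 5.0
print(weibull_sf(4.0, alpha, beta, gamma))   # below the offset
print(weibull_sf(15.0, alpha, beta, gamma))  # one scale unit past the offset
```

With gamma = 0 this reduces to the usual two-parameter Weibull, so the offset simply moves the start of the support from zero to gamma.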

This project spawned from a reliability engineering project. Reliability engineers have a long history of estimating parameters from a probability plot, and SurPyval continues this tradition by ensuring that the estimate of any parametric distribution can be plotted on a probability plot. These visualisations enable an analyst to get a sense of the goodness of fit of the parametric distribution against the non-parametric estimate.
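
The idea behind estimating parameters from a probability plot can be sketched in a few lines of plain Python. The example assumes complete (uncensored) data, Bernard's approximation for the plotting positions, and an ordinary least-squares line with the transformed probability regressed on ln(x). These choices and the function name weibull_probability_fit are illustrative assumptions and may differ from what SurPyval does internally.

```python
import math


def weibull_probability_fit(x):
    """Estimate (alpha, beta) by least squares on Weibull plot axes."""
    xs = sorted(x)
    m = len(xs)
    # Bernard's approximation to the median rank of the i-th ordered failure
    F = [(i - 0.3) / (m + 0.4) for i in range(1, m + 1)]
    # Linearised CDF: ln(-ln(1 - F)) = beta * ln(x) - beta * ln(alpha)
    u = [math.log(value) for value in xs]
    y = [math.log(-math.log(1.0 - f)) for f in F]
    u_bar = sum(u) / m
    y_bar = sum(y) / m
    s_uy = sum((ui - u_bar) * (yi - y_bar) for ui, yi in zip(u, y))
    s_uu = sum((ui - u_bar) ** 2 for ui in u)
    beta = s_uy / s_uu                  # slope of the fitted line
    intercept = y_bar - beta * u_bar
    alpha = math.exp(-intercept / beta)
    return alpha, beta


# Hypothetical failure times, for illustration only
failure_times = [16.0, 34.0, 53.0, 75.0, 93.0, 120.0]
alpha, beta = weibull_probability_fit(failure_times)
print(f"alpha = {alpha:.2f}, beta = {beta:.2f}")
```

If the points fall close to the fitted straight line on these axes, the Weibull distribution is a plausible description of the data, which is the visual check the probability plot provides.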

Install and Quick Intro

SurPyval can be installed via pip from the PyPI repository:

pip install surpyval

If you're familiar with survival analysis and Weibull plotting, the following is a quick start.

from surpyval import Weibull
from surpyval.datasets import BoforsSteel

# Fetch some data that comes with SurPyval
data = BoforsSteel.df

x = data['x']
n = data['n']

model = Weibull.fit(x=x, n=n, offset=True)
model.plot();

[Figure: Weibull Data and Distribution]
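
The quick start passes both x and n to the fit. Assuming n holds the number of items recorded at each value of x, which is how the count argument is understood here, the pair is a compact encoding of a longer list of individual observations. The helper expand_counts below is hypothetical and only illustrates that encoding; the numbers are made up and are not the Bofors steel data.

```python
def expand_counts(x, n):
    """Repeat each value x[i] exactly n[i] times."""
    expanded = []
    for value, count in zip(x, n):
        expanded.extend([value] * count)
    return expanded


# Hypothetical values and counts, for illustration only
x = [10.0, 12.5, 15.0]
n = [2, 1, 3]
print(expand_counts(x, n))  # [10.0, 10.0, 12.5, 15.0, 15.0, 15.0]
```

Under that assumption, fitting to (x, n) and fitting to the expanded list describe the same data; the count form just avoids repeating identical values.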

Documentation

SurPyval is well documented, and the main documentation (https://surpyval.readthedocs.io/en/latest/index.html) continues to improve.

Development

Dependencies

pip install -r requirements_dev.txt

Testing

Run the testing suite by simply executing:

pytest

or use coverage to get a coverage report:

coverage run -m pytest  # Run pytest under coverage's watch
coverage report         # Print coverage report
coverage html           # Make a html coverage report (really useful), open htmlcov/index.html

Pre-commit

Contact

Email derryn if you want any features or to see how SurPyval can be used for you.