Yet another Python survival analysis tool.
This is another pure Python survival analysis tool, so why was it needed? The intent of this package is to mimic the scipy API as closely as possible, with a simple .fit() method for any type of distribution (parametric or non-parametric); other survival analysis packages don't completely mimic that API. Further, there is currently (at the time of writing) no package that can take an arbitrary combination of observed, censored, and truncated data. Finally, SurPyval is unique in that it can be used with multiple parametric estimation methods. This allows an analyst to estimate the parameters of a distribution with another method if one method fails. The parametric methods available are Maximum Likelihood Estimation (MLE), Method of Probability Plotting (MPP), Mean Square Error (MSE), Method of Moments (MOM), and Maximum Product of Spacing (MPS). For each type of estimator, SurPyval can take the following types of input data:
Method | Para/Non-Para | Observed | Censored | Truncated |
---|---|---|---|---|
MLE | Parametric | Yes | Yes | Yes |
MPP | Parametric | Yes | Yes | Limited |
MSE | Parametric | Yes | Yes | Limited |
MOM | Parametric | Yes | No | No |
MPS | Parametric | Yes | Yes | No |
Kaplan-Meier | Non-Parametric | Yes | Right only | Left only |
Nelson-Aalen | Non-Parametric | Yes | Right only | Left only |
Fleming-Harrington | Non-Parametric | Yes | Right only | Left only |
Turnbull | Non-Parametric | Yes | Yes | Yes |
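
To make the "Observed" and "Right only" columns concrete, here is a minimal pure-Python sketch of a Kaplan-Meier estimate from times, censoring flags, and counts. It does not call SurPyval; the function name `kaplan_meier` and the flag convention (0 for observed, 1 for right-censored) are assumptions made for this illustration.

```python
from collections import defaultdict


def kaplan_meier(x, c=None, n=None):
    """Kaplan-Meier survival estimate (illustrative sketch, not SurPyval code).

    x: event or censoring times
    c: flags, 0 = observed failure, 1 = right-censored (assumed convention)
    n: number of items at each entry (defaults to 1 each)
    Returns (times, R): the distinct failure times and the survival
    estimate just after each of them.
    """
    c = [0] * len(x) if c is None else c
    n = [1] * len(x) if n is None else n

    deaths = defaultdict(int)   # failures observed at each time
    removed = defaultdict(int)  # everything leaving the risk set at each time
    for xi, ci, ni in zip(x, c, n):
        if ci == 0:
            deaths[xi] += ni
        elif ci != 1:
            raise ValueError("this sketch only handles c = 0 or c = 1")
        removed[xi] += ni

    at_risk = sum(n)
    surv = 1.0
    times, R = [], []
    for t in sorted(removed):
        d = deaths.get(t, 0)
        if d > 0:
            # Product-limit step: multiply by the fraction surviving time t
            surv *= 1 - d / at_risk
            times.append(t)
            R.append(surv)
        # Failures and censored items both leave the risk set after t
        at_risk -= removed[t]
    return times, R


times, R = kaplan_meier([1, 2, 3, 4, 5], c=[0, 1, 0, 0, 1])
```

Censored items never produce a step in the estimate, but they shrink the risk set, which is why the later steps are larger than they would be with fully observed data.
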
SurPyval also offers many different distributions, and because of the flexible implementation, adding new distributions is easy. Further, the power of SurPyval lies in its robust parameter estimation; as such, distributions that are supported on the half real line can be offset to make a three- or four-parameter version. The currently available distributions are:
Distribution | Offsetable |
---|---|
Weibull | Yes |
Normal | No |
LogNormal | Yes |
Gamma | Yes |
Beta | No |
Uniform | No |
Exponential | Yes |
Exponentiated Weibull | Yes |
Gumbel | No |
Logistic | No |
LogLogistic | Yes |
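
As a sketch of what "offset" means here, the function below evaluates the survival function of a three-parameter Weibull, where the offset shifts the start of the support away from zero. This is plain Python rather than SurPyval's implementation, and the names `weibull_sf`, `alpha`, `beta`, and `gamma` are assumptions chosen for the illustration.

```python
import math


def weibull_sf(x, alpha, beta, gamma=0.0):
    """Survival function of a Weibull with scale alpha, shape beta,
    and offset gamma (illustrative sketch, not SurPyval code)."""
    if x <= gamma:
        # No failures can occur before the offset
        return 1.0
    return math.exp(-(((x - gamma) / alpha) ** beta))


# With gamma = 0 this is the usual two-parameter Weibull;
# a positive gamma shifts the whole curve to the right.
two_param = weibull_sf(2.0, alpha=2.0, beta=1.5)
three_param = weibull_sf(7.0, alpha=2.0, beta=1.5, gamma=5.0)
```

Both calls evaluate the curve one scale length past the start of the support, so they return the same value.
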
This project spawned from a Reliability Engineering project. Reliability engineers have a long history of estimating parameters from a probability plot, and SurPyval continues this tradition by ensuring that any parametric distribution can have its estimate plotted on a probability plot. These visualisations enable an analyst to get a sense of the goodness of fit of the parametric distribution against the non-parametric estimate.
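
For readers unfamiliar with probability plotting, the sketch below shows the idea for a Weibull with complete (uncensored) data: the points are transformed so that a Weibull falls on a straight line, and the parameters are read off a least-squares fit. It is pure Python and not SurPyval's implementation; the function name `weibull_probability_plot_fit` and the choice of Benard's median-rank approximation for the plotting positions are assumptions.

```python
import math


def weibull_probability_plot_fit(x):
    """Estimate Weibull (alpha, beta) from a probability plot of
    complete data (illustrative sketch, not SurPyval code)."""
    xs = sorted(x)
    n = len(xs)
    # Plotting positions: Benard's approximation to the median ranks
    F = [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]

    # Weibull linearisation: ln(-ln(1 - F)) = beta * ln(x) - beta * ln(alpha)
    u = [math.log(xi) for xi in xs]
    v = [math.log(-math.log(1 - Fi)) for Fi in F]

    # Ordinary least squares of v on u
    u_bar = sum(u) / n
    v_bar = sum(v) / n
    s_uu = sum((ui - u_bar) ** 2 for ui in u)
    s_uv = sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v))
    beta = s_uv / s_uu
    intercept = v_bar - beta * u_bar
    alpha = math.exp(-intercept / beta)
    return alpha, beta


alpha, beta = weibull_probability_plot_fit([12.0, 25.0, 31.0, 47.0, 60.0, 88.0])
```

How closely the transformed points follow the fitted line is the visual goodness-of-fit check described above.
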
SurPyval can be installed via pip using the PyPI repository:

```bash
pip install surpyval
```
If you're familiar with survival analysis, and Weibull plotting, the following is a quick start.
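```python
from surpyval import Weibull
from surpyval.datasets import BoforsSteel

# Fetch some data that comes with SurPyval
data = BoforsSteel.df

x = data['x']  # observed values
n = data['n']  # number of observations at each value

# Fit an offset (three-parameter) Weibull and show it on a probability plot
model = Weibull.fit(x=x, n=n, offset=True)
model.plot();
```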
SurPyval is well documented, and the documentation is still improving; see the main documentation.
To run the tests, first install the development requirements:

```bash
pip install -r requirements_dev.txt
```
Run the testing suite by simply executing:

```bash
pytest
```

or use coverage to get a coverage report:

```bash
coverage run -m pytest  # Run pytest under coverage's watch
coverage report         # Print coverage report
coverage html           # Make an HTML coverage report (really useful), open htmlcov/index.html
```
- `pre-commit` (it's in `requirements_dev.txt` anyways)
- `pre-commit install`, which sets up the git hook scripts
- `pre-commit run --all-files` to run the hooks on all files

Email derryn if you want any features or to see how SurPyval can be used for you.