CamDavidsonPilon closed this issue 3 years ago.
I've created tests for the Nelson-Aalen, Kaplan-Meier, and Fleming-Harrington, and was working on the Turnbull, and my first step was to plagiarise lifelines.
I did some looking and I can't convince myself that the NPMLE implementation in lifelines is correct. Am I missing something?
If we take the midpoint for the test example in lifelines and treat it as a non-interval-censored example, we get substantially different results:
import numpy as np
import surpyval as surv

left = [1, 8, 8, 7, 7, 17, 37, 46, 46, 45]
right = [7, 8, 10, 16, 14, np.inf, 44, np.inf, np.inf, np.inf]

x = np.vstack([left, right])
# right-censored wherever the right endpoint is infinite
c = (~np.isfinite(x[1, :])).astype(int)
# midpoint of each finite interval; left endpoint for the right-censored points
x = np.where(np.isfinite(x.mean(axis=0)), x.mean(axis=0), x[0, :])
print(surv.KaplanMeier.fit(x=x, c=c).R)
"""
[0.9 0.8 0.7 0.6 0.5 0.5 0.375 0.375 0.375]
"""
The lifelines test compares this to:
[0.16667016, 0.33332984, 0.125, 0.375]
And its complement (one minus each value) is:
[0.83332984, 0.66667016, 0.875 , 0.625 ]
The solution that lifelines compares against doesn't align with the Kaplan-Meier solution when the midpoints are used. The lifelines npmle also seems to report, for the last value, what looks like the survival function's value as the CDF. Am I getting something wrong?
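For reference, here is how I'm reading those numbers: a minimal sketch assuming they are probability masses listed in increasing interval order, so the survival curve would be one minus their cumulative sum:

import numpy as np

# probability masses over the Turnbull intervals, from the lifelines test
p = np.array([0.16667016, 0.33332984, 0.125, 0.375])

# survival function at the right edge of each interval, assuming the
# masses are ordered by interval
print(1 - np.cumsum(p))
# [0.83332984 0.5        0.375      0.        ]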
Can you share the lifelines code used?
This is the test in lifelines:
def test_npmle():
    left, right = [1, 8, 8, 7, 7, 17, 37, 46, 46, 45], [7, 8, 10, 16, 14, np.inf, 44, np.inf, np.inf, np.inf]
    npt.assert_allclose(npmle(left, right, verbose=True)[0], np.array([0.16667016, 0.33332984, 0.125, 0.375]), rtol=1e-4)
So npmle is a low-level function, and it needs to be paired with reconstruct_survival_function:
from lifelines.fitters.npmle import *

# npmle returns the probability masses and the corresponding Turnbull intervals
results = npmle(left, right)
sf = reconstruct_survival_function(*results)
This is equivalent to the (high-level API) KaplanMeierFitter().fit_interval_censoring(left, right)
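For instance, a minimal sketch of the high-level route on the same left/right lists from the test above:

from lifelines import KaplanMeierFitter

kmf = KaplanMeierFitter()
kmf.fit_interval_censoring(left, right)
print(kmf.survival_function_)  # built from the same NPMLE masses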
Thanks, that makes more sense. Sorry to be a pest, but I've implemented surpyval differently to avoid the following scenario:
import numpy as np
import matplotlib.pyplot as plt
from lifelines import KaplanMeierFitter

alpha = 20
beta = 15
x = np.random.weibull(beta, 100) * alpha

# bin the observations into 20 intervals; every point is interval censored
n, xx = np.histogram(x, bins=20)
x = np.vstack([xx[0:-1], xx[1:]])

kmf = KaplanMeierFitter()
kmf.fit_interval_censoring(x[0, :] + 1e-12, x[1, :]).plot()
ax = plt.gca()
kmf.fit_interval_censoring(x[0, :], x[1, :]).plot(color='red', ax=ax)
This results in catastrophic differences between the estimates. It seems that the left edge of some intervals is treated as failing in the previous interval. This is behaviour I wouldn't expect, and it's why comparisons between lifelines and surpyval differ.
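To make the edge behaviour concrete, here is a tiny probe (a sketch; the data are made up purely for illustration):

from lifelines import KaplanMeierFitter

# two touching intervals: does the shared edge at 1.0 get treated as a
# failure in the earlier interval?
kmf = KaplanMeierFitter()
kmf.fit_interval_censoring([0.0, 1.0], [1.0, 2.0])
print(kmf.survival_function_)

# the same data with the left edges nudged off the shared boundary
kmf.fit_interval_censoring([0.0 + 1e-12, 1.0 + 1e-12], [1.0, 2.0])
print(kmf.survival_function_)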
Are you sure the Turnbull intervals in lifelines are correct?
Tests for Kaplan-Meier, Nelson-Aalen, and Fleming-Harrington included in f3a264a955aed0fe563dc912c07e81ceb2b12666.
Still working on Turnbull tests as per the thread above.
Are you sure the Turnbull intervals in lifelines are correct?
No lol - I only know they are correct up to the unit tests in lifelines.
The data you presented is an interesting case I'll have to explore more. I would suggest comparing results to icenReg in R if you can. (Seeing your written tests now - I suggest comparing against R's implementations as well as lifelines.)
Tests for Turnbull complete against R's icenReg. This completes the nonparametric tests. See 4086c12b8a6ccc225cfab3eee4450f8a549a6510
No lol - I only know they are correct up to the unit tests in lifelines.
No probs. SurPyval doesn't actually find the Turnbull intervals; it just creates an interval between every observed value and every truncated value and brute-forces the solution. This implementation results in a few extra calculations, but it makes the code more readable and was easier to develop. Roughly, the idea is something like the sketch below.
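A minimal sketch of that brute-force idea (illustrative only, not surpyval's actual code): lay candidate intervals between every pair of adjacent endpoints and run the usual EM-style self-consistency iteration over their masses. The function name and details here are made up.

import numpy as np

def turnbull_em_sketch(left, right, tol=1e-8, max_iter=1000):
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    # candidate intervals between every pair of adjacent endpoints
    # (a deliberately naive construction)
    cuts = np.unique(np.concatenate([left, right]))
    lo, hi = cuts[:-1], cuts[1:]
    # alpha[i, j] = True if candidate interval j sits inside observation i's interval
    alpha = (left[:, None] <= lo[None, :]) & (hi[None, :] <= right[:, None])
    p = np.full(lo.size, 1.0 / lo.size)
    for _ in range(max_iter):
        # E-step: expected membership of each observation in each candidate interval
        w = alpha * p
        w = w / w.sum(axis=1, keepdims=True)
        # M-step: the new mass on each interval is the average membership
        p_new = w.mean(axis=0)
        if np.abs(p_new - p).max() < tol:
            p = p_new
            break
        p = p_new
    return lo, hi, p

# exact observations (left == right) would need degenerate candidate
# intervals, which this sketch omits
lo, hi, p = turnbull_em_sketch([1, 7, 7, 17], [7, 16, 14, np.inf])
print(np.c_[lo, hi, p])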
The KM is by far the most widely used, and most important, non-parametric model. I'd like to see sufficient tests for it included.
https://github.com/openjournals/joss-reviews/issues/3484