Suggestion: Multiple Kaplan Meier confidence interval options

Greenwood CI for KM estimator does not work very well with highly censored data and also has asymptotic properties, so doesn't work well for small sample sizes. After playing around with lifelines and getting many NaNs for CIs of highly censored data, I ended up using "Beta Product Confidence Procedure for Right Censored Data" from the following paper: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3769999/

There is an R package to calculate this (which also includes four other CIs): https://cran.r-project.org/web/packages/bpcp/index.html

I'm a Python user so I wrapped this in rpy2. I understand that introduction of new libraries, wrappers or statistical methods may not be appealing, but just checking if there is any interest to incorporate this or any other CIs into lifelines?

Here's a sample of code that can create these CIs:

from rpy2.robjects import pandas2ri,r
import rpy2.robjects.packages as rpackages
from rpy2.robjects.vectors import StrVector
from rpy2.rinterface import RRuntimeError

utils = rpackages.importr('utils')
utils.chooseCRANmirror(ind=1);
packnames = ['bpcp']
names_to_install = [x for x in packnames if not rpackages.isinstalled(x)]
if len(names_to_install) > 0:
    utils.install_packages(StrVector(names_to_install))

pandas2ri.activate()

bpcp = rpackages.importr('bpcp')

def bpcp_survival(df,days_to_succeed = 5*365.): 
    try:
        bfit = bpcp.bpcp(df['TherapyToLastDay'],df['Vital Status'],Delta=0);
    except RRuntimeError:
        return pd.Series(index=['survival','lower','upper'])
    surv_results = pandas2ri.ri2py(bpcp.StCI(bfit,days_to_succeed))
    return surv_results[['survival','lower','upper']].iloc[0]

CamDavidsonPilon / lifelines

Suggestion: Multiple Kaplan Meier confidence interval options #424