Closed sisiphos2018 closed 5 years ago
👋 hi there
So lifelines previously had significance stars - the last version that had them was v0.14.6. So you could rollback to that version, but lots of improvements had happened since then so this is my least favourite option. pip install lifelines==v0.14.6
You could write some python code to add them. For example, rewrite the entire print_summary
. (check the stars in significance_code
below since they may not be convention):
def significance_code(p):
if p < 0.001:
return '***'
elif p < 0.01:
return '**'
elif p < 0.05:
return '*'
elif p < 0.1:
return '.'
else:
return ' '
def print_summary(cph, decimals=2, **kwargs): """ Print summary statistics describing the fit, the coefficients, and the error bounds.
Parameters
-----------
decimals: int, optional (default=2)
specify the number of decimal places to show
kwargs:
print additional metadata in the output (useful to provide model names, dataset names, etc.) when comparing
multiple outputs.
"""
# Print information about data first
justify = string_justify(18)
print(cph)
print("{} = '{}'".format(justify("duration col"), cph.duration_col))
if cph.event_col:
print("{} = '{}'".format(justify("event col"), cph.event_col))
if cph.weights_col:
print("{} = '{}'".format(justify("weights col"), cph.weights_col))
if cph.cluster_col:
print("{} = '{}'".format(justify("cluster col"), cph.cluster_col))
if cph.robust or cph.cluster_col:
print("{} = {}".format(justify("robust variance"), True))
if cph.strata:
print("{} = {}".format(justify("strata"), cph.strata))
if cph.penalizer > 0:
print("{} = {}".format(justify("penalizer"), cph.penalizer))
print("{} = {}".format(justify("number of subjects"), cph._n_examples))
print("{} = {}".format(justify("number of events"), cph.event_observed.sum()))
print("{} = {:.{prec}f}".format(justify("log-likelihood"), cph._log_likelihood, prec=decimals))
print("{} = {}".format(justify("time fit was run"), cph._time_fit_was_called))
for k, v in kwargs.items():
print("{} = {}\n".format(justify(k), v))
print(end="\n")
print("---")
df = cph.summary
# Significance codes as last column
df[""] = [significance_code(p) for p in df['p']]
print(df.to_string(float_format=format_floats(decimals), formatters={"p": format_p_value(decimals)}))
# Significance code explanation
print("---")
print("Concordance = {:.{prec}f}".format(cph.score_, prec=decimals))
print(
"Log-likelihood ratio test = {:.{prec}f} on {} df, -log2(p)={:.{prec}f}".format(
*cph._compute_likelihood_ratio_test(), prec=decimals
)
)
3. My preferred solution is to think of another way of expressing significance altogether, without stars. I removed the stars for a reason: https://twitter.com/Cmrn_DP/status/1072951755028103168. I've started including the log2(p) values since they are easier to reason about: https://lesslikely.com/statistics/s-values/
I am trying to set nice regression tables. Several Cox regressions next to each other with significance stars. In Stata I was using estout. In Python I am experimenting with summary_col in statsmodels for OLS.
Is there a good solution for lifelines Cox Regression?
Thank you