CamDavidsonPilon / lifelines

Survival analysis in Python
lifelines.readthedocs.org
MIT License

KMF legend labels do not play nice with LaTeX #55

Closed spacecowboy closed 10 years ago

spacecowboy commented 10 years ago

The labels "_upper_0.95" and "_lower_0.95" break plotting if LaTeX is enabled:

import matplotlib as mpl
mpl.rcParams['text.usetex'] = True

# Create a Kaplan-Meier fitter, then plot...
kmf.plot(ax=ax, c="#A60628", ci_force_lines=True)

The problem is that LaTeX rejects unescaped underscores outside math mode. The text might also look bad if matplotlib typesets the legend in math mode (not sure if it does). One could imagine something like this:

import matplotlib as mpl

# No more underscores
label = "{} upper 0.95".format(actual_label)
# If TeX, wrap in a text environment (raw string, so "\t" is not read as a tab)
if mpl.rcParams['text.usetex']:
    label = r"\text{%s}" % label
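
An alternative is to escape the underscores instead of removing them, so the label text stays the same. This is only a minimal sketch: `tex_safe_label` and its `usetex` argument are hypothetical names for illustration, not part of lifelines or matplotlib. In practice the flag would presumably come from `mpl.rcParams['text.usetex']`.

```python
import re


def tex_safe_label(label, usetex):
    """Escape bare underscores in `label` when TeX rendering is enabled.

    Hypothetical helper for illustration only.
    """
    if not usetex:
        # Without TeX, matplotlib draws underscores literally.
        return label
    # Put a backslash before every underscore that is not already escaped.
    return re.sub(r"(?<!\\)_", r"\\_", label)


print(tex_safe_label("KM_upper_0.95", usetex=True))
print(tex_safe_label("KM_upper_0.95", usetex=False))
```

The negative lookbehind leaves labels that are already escaped untouched, so calling the helper twice is harmless.
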
CamDavidsonPilon commented 10 years ago

Hm, is this also a pandas problem? That is, if I have LaTeX enabled and underscores in a DataFrame's column name, does that break pandas' .plot too?

spacecowboy commented 10 years ago

Yes, it would seem so. Loading a CSV file with pandas and calling plot on the DataFrame results in:

/usr/local/lib/python3.4/site-packages/IPython/core/formatters.py:239: FormatterWarning: Exception in image/png formatter: LaTeX was not able to process the following string: 'study_id,id'

But the difference is that I can change the column names in the dataframe. The "_upper_0.95" is not as easily changed.
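
For the pandas case, the column-name workaround could look roughly like the sketch below. `tex_safe_columns` is a hypothetical helper, and the column names are the ones from the warning above. It only builds the rename mapping and does not need pandas itself.

```python
def tex_safe_columns(columns):
    """Map each column name to a copy with underscores replaced by spaces.

    Hypothetical helper for illustration only.
    """
    return {name: name.replace("_", " ") for name in columns}


mapping = tex_safe_columns(["study_id", "id"])
print(mapping)  # {'study_id': 'study id', 'id': 'id'}
```

The mapping can then be passed to `df.rename(columns=mapping)` before calling `.plot()`.
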

CamDavidsonPilon commented 10 years ago

True, that's a change I can (and will) make. Thanks!

spacecowboy commented 10 years ago

Great!

CamDavidsonPilon commented 10 years ago

@spacecowboy, I added an API to change the labels:

    kmf = KaplanMeierFitter()
    ci_labels = ['upper', 'lower']
    kmf.fit(T, ci_labels=ci_labels)
    print(kmf.confidence_interval_.columns)
    # ['upper', 'lower']

Run

    pip install --upgrade --no-deps git+https://github.com/CamDavidsonPilon/lifelines.git

to pull master.