CamDavidsonPilon / lifelines

Survival analysis in Python
lifelines.readthedocs.org
MIT License

Plotting KM Survival Functions not working #1593

Closed jcursons closed 5 months ago

jcursons commented 5 months ago

Hi,

I've recently updated lifelines and it has broken all of my previous plotting functions. Unfortunately I can't even seem to get your test code working, as per https://lifelines.readthedocs.io/en/latest/fitters/univariate/KaplanMeierFitter.html

from lifelines import KaplanMeierFitter
from lifelines.datasets import load_waltons

waltons = load_waltons()
kmf = KaplanMeierFitter(label="waltons_data")
kmf.fit(waltons['T'], waltons['E'])
kmf.plot()

This gives the error:

Traceback (most recent call last):
  File "C:\python\venv\Lib\site-packages\IPython\core\interactiveshell.py", line 3526, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-3-b0199a766265>", line 3, in <module>
    kmf.plot()
  File "C:\python\venv\Lib\site-packages\lifelines\fitters\kaplan_meier_fitter.py", line 448, in plot
    return self.plot_survival_function(**kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\python\venv\Lib\site-packages\lifelines\fitters\kaplan_meier_fitter.py", line 453, in plot_survival_function
    return _plot_estimate(self, estimate="survival_function_", **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\python\venv\Lib\site-packages\lifelines\plotting.py", line 919, in _plot_estimate
    dataframe_slicer(plot_estimate_config.estimate_).rename(columns=lambda _: plot_estimate_config.kwargs.pop("label")).plot(
  File "C:\python\venv\Lib\site-packages\pandas\plotting\_core.py", line 975, in __call__
    return plot_backend.plot(data, kind=kind, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\python\venv\Lib\site-packages\pandas\plotting\_matplotlib\__init__.py", line 71, in plot
    plot_obj.generate()
  File "C:\python\venv\Lib\site-packages\pandas\plotting\_matplotlib\core.py", line 451, in generate
    self._adorn_subplots()
  File "C:\python\venv\Lib\site-packages\pandas\plotting\_matplotlib\core.py", line 676, in _adorn_subplots
    handle_shared_axes(
  File "C:\python\venv\Lib\site-packages\pandas\plotting\_matplotlib\tools.py", line 404, in handle_shared_axes
    layout[row_num(ax), col_num(ax)] = ax.get_visible()
           ^^^^^^^^^^^
  File "C:\python\venv\Lib\site-packages\pandas\plotting\_matplotlib\tools.py", line 393, in <lambda>
    row_num = lambda x: x.get_subplotspec().rowspan.start
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'rowspan'

I get a similar error when I try to run my (previously working) functions on my own data.

The actual point of failure appears to be line 919 in plotting.py:

dataframe_slicer(plot_estimate_config.estimate_).rename(columns=lambda _: plot_estimate_config.kwargs.pop("label")).plot(
        logx=plot_estimate_config.logx, **plot_estimate_config.kwargs)

If I run up to this point and check some of the objects in memory everything appears to be in order:

dataframe_slicer(plot_estimate_config.estimate_)
Out[2]: 
          KM_estimate
timeline             
0.0          1.000000
6.0          1.000000
81.0         1.000000
111.0        1.000000
122.0        0.993103
...               ...
6699.0       0.216096
7514.0       0.162072
7563.0       0.162072
10346.0      0.081036
11252.0      0.081036
[145 rows x 1 columns]

plot_estimate_config.logx
Out[3]: False

plot_estimate_config.kwargs
Out[4]: 
{'ax': <Axes: >,
 'color': '#1f77b4',
 'drawstyle': 'steps-post',
 'label': 'KM_estimate'}

But it still falls over:

Traceback (most recent call last):
  File "C:\python\venv\Lib\site-packages\pandas\plotting\_matplotlib\tools.py", line 404, in handle_shared_axes
    layout[row_num(ax), col_num(ax)] = ax.get_visible()
           ^^^^^^^^^^^
  File "C:\python\venv\Lib\site-packages\pandas\plotting\_matplotlib\tools.py", line 393, in <lambda>
    row_num = lambda x: x.get_subplotspec().rowspan.start
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'rowspan'

Despite using Python for a number of years, I struggle to 'read' lambda functions in my head, so unfortunately I don't know where to start with fixing this, short of replacing large chunks of code.

jcursons commented 5 months ago

I've done a bit more debugging/testing and it seems that the issue arises due to my habit of plotting large multipanel figures.

If I run this, it works:

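        import os
        import matplotlib.pyplot as plt
        from lifelines import KaplanMeierFitter
        from lifelines.datasets import load_waltons

        waltons = load_waltons()

        handFig = plt.figure(figsize=(5,5))

        # a single axis placed by hand at [left, bottom, width, height]
        handAx = handFig.add_axes([0.1, 0.7, 0.8, 0.2])

        kmf = KaplanMeierFitter(label='waltons_data')
        kmf.fit(waltons['T'], waltons['E'])
        kmf.plot(ax=handAx)

        # PathDir.pathOut is my own output directory
        handFig.savefig(os.path.join(PathDir.pathOut, 'test.png'), dpi=300)
        plt.close(handFig)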

But if I try to add a second axis to the figure, it crashes with the error listed above:


        waltons = load_waltons()

        handFig = plt.figure(figsize=(5,5))

        handAx= handFig.add_axes([0.1, 0.7, 0.8, 0.2])

        kmf = KaplanMeierFitter(label='waltons_data')
        kmf.fit(waltons['T'], waltons['E'])
        kmf.plot(ax=handAx)

        handAx2 = handFig.add_axes([0.1, 0.2, 0.8, 0.2])

        kmf2 = KaplanMeierFitter(label='waltons_data')
        kmf2.fit(waltons['T'], waltons['E'])
        kmf2.plot(ax=handAx2)

        handFig.savefig(os.path.join(PathDir.pathOut, 'test.png'), dpi=300)
        plt.close(handFig)

I suspect that the built-in subplot spec handling is a bit too smart for my approach of just dropping in axes at specified positions using handFig.add_axes(), even when I pass the axis handle through the ax input parameter.
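A minimal sketch of what this suspicion amounts to, using matplotlib only (no lifelines or pandas). It assumes that an axis created with add_axes() carries no subplot spec, while one created on a grid does, which would explain why x.get_subplotspec().rowspan fails in the traceback above. The helper name subplotspec_of is made up for illustration and is not part of any library.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def subplotspec_of(ax):
    """Return the axis' SubplotSpec, or None if it was not placed on a grid."""
    # Hypothetical helper: older matplotlib versions do not define
    # get_subplotspec on hand-placed axes at all, so look it up defensively.
    getter = getattr(ax, "get_subplotspec", None)
    return getter() if getter is not None else None


fig = plt.figure(figsize=(5, 5))
free_ax = fig.add_axes([0.1, 0.7, 0.8, 0.2])  # positioned by hand
grid_ax = fig.add_subplot(2, 1, 2)            # positioned by a 2x1 grid

print("add_axes axis has a subplot spec:   ", subplotspec_of(free_ax) is not None)
print("add_subplot axis has a subplot spec:", subplotspec_of(grid_ax) is not None)

plt.close(fig)
```

If the first line prints False, any code that calls .rowspan on the result without checking for None will raise the AttributeError shown above as soon as the figure holds more than one axis.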

CamDavidsonPilon commented 5 months ago

Hm, is it a pandas thing? We bumped min pandas from 1.0 to 1.2. What version of pandas have you been using?
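One quick way to answer that question is to print the installed versions of the packages involved. This is a sketch using only the standard library; the helper name installed_version is illustrative and not part of lifelines or pandas.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Packages that appear in the traceback; matplotlib is included because
# pandas delegates the actual drawing to it.
for name in ("lifelines", "pandas", "matplotlib"):
    print(f"{name}: {installed_version(name)}")
```

Pasting the output into the issue makes it easier to tell whether a particular combination of versions is responsible.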

MaxDanesi commented 5 months ago

Yeah, I'm having the same problem. This example shows no plots:

from lifelines.statistics import survival_difference_at_fixed_point_in_time_test
from lifelines import KaplanMeierFitter
from lifelines.datasets import load_waltons

df = load_waltons()
ix = df['group'] == 'miR-137'
T_exp, E_exp = df.loc[ix, 'T'], df.loc[ix, 'E']
T_con, E_con = df.loc[~ix, 'T'], df.loc[~ix, 'E']

kmf_exp = KaplanMeierFitter(label="exp").fit(T_exp, E_exp)
kmf_con = KaplanMeierFitter(label="con").fit(T_con, E_con)

point_in_time = 10.
results = survival_difference_at_fixed_point_in_time_test(point_in_time, kmf_exp, kmf_con)
results.print_summary()

kmf_exp.plot_survival_function(point_in_time=point_in_time)
kmf_con.plot_survival_function(point_in_time=point_in_time)

CamDavidsonPilon commented 5 months ago

I can copy-paste that snippet into IPython and it runs, but I also need to add plt.show() for the plot to appear.

jcursons commented 5 months ago

Apologies for the delayed response, but great suggestion, thanks @CamDavidsonPilon - updating pandas from 2.0.1 to 2.2.0 has fixed the issue that I was having!

MaxDanesi commented 5 months ago

yup, all working here too. Thanks for the quick response.