CamDavidsonPilon / lifelines

Survival analysis in Python
lifelines.readthedocs.org
MIT License

add_at_risk_counts breaks in subplots #562

Open vkav opened 5 years ago

vkav commented 5 years ago

Hi,

Had some issues with the add_at_risk_counts option when plotting KM curves.

  1. Is there a way to have the numbers align with the xtick labels?

  2. When I try using it in subplots, it behaves strangely: a horizontal line appears between it and the x-axis (which is not there when plotting a single axes), and the xlabel is buried underneath it. I tried playing with subplots_adjust to move the at_risk box away from the actual plot, but I have to use extreme values and the plot then looks bad. Also, the ytick labels get messed up; for example, they are not visible for some of the subplots. Attached are an example figure and the code.

Any help would be greatly appreciated.

Thanks!

Code to make the figure

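# `df`, `vols`, and `p` (one p-value per threshold) are defined elsewhere (not shown).
import matplotlib.pyplot as plt
import seaborn as sns
from lifelines import KaplanMeierFitter
from lifelines.plotting import add_at_risk_counts
from matplotlib.offsetbox import AnchoredText

fig = plt.figure()  # created earlier in the original script; size not given

# Durations and event indicators are the same for every subplot.
T = df["OS time1"]
E = df["OS status"]

# Use a separate loop name for the p-value so the list `p` is not shadowed.
for i, (vol, p_val) in enumerate(zip(vols, p)):

    # Split the cohort at the current volume threshold.
    df["Var split"] = (df["Var"] > vol).astype(int)

    ax = fig.add_subplot(5, 3, i + 1)

    # One Kaplan-Meier curve per group, drawn on the same axes.
    level = (df["Var split"] == 1)
    f1 = KaplanMeierFitter()
    f1.fit(T[level], event_observed=E[level], label=">%s cm3" % vol)
    f1.plot(ax=ax, ci_show=False, show_censors=True, censor_styles={'ms': 5})
    f2 = KaplanMeierFitter()
    f2.fit(T[~level], event_observed=E[~level], label="<=%s cm3" % vol)
    f2.plot(ax=ax, ci_show=False, show_censors=True, censor_styles={'ms': 5})

    # Title matches the split above (> vol versus <= vol).
    plt.title("<=%s cm$^3$ vs >%s cm$^3$" % (vol, vol), y=1.03)
    plt.xlabel("OS in years")
    plt.ylabel("Survival %")
    plt.legend(loc="lower left")
    plt.xlim(left=-0.5)
    plt.ylim(0, 1.02)

    sns.despine()

    # The at-risk table is what misbehaves inside the subplot grid.
    add_at_risk_counts(f1, f2, ax=ax, fig=fig)
    ax.add_artist(AnchoredText("p = %.3f" % p_val, loc=4, frameon=False))

plt.subplots_adjust(top=1.5, hspace=0.4)
plt.show()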

[Attached figure: example]

CamDavidsonPilon commented 5 years ago
  1. (Untested) You could modify plt.xticks in each plot, but it's not obvious how to do that.
  2. Hm, add_at_risk_counts is a bit fragile and underdeveloped, so I'm not surprised it fails with many subplots. This line could be the problem, though: https://github.com/CamDavidsonPilon/lifelines/blob/master/lifelines/plotting.py#L167

One suggestion, if you're comfortable with it, is to copy-paste the function into your local script (along with any other necessary functions, like is_latex_enabled) and play around with it locally.

If you happen to get something that works, I'd love to see the solution!
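
For question 1, one possible starting point when experimenting with a local copy is to compute the counts directly at the xtick positions you want, so the numbers and the tick labels share the same locations. The sketch below is only an illustration: `at_risk_counts` is a hypothetical helper (not part of lifelines), the durations are made-up values, and it assumes the common convention that a subject is at risk at time t if their observed duration is at least t.

```python
from bisect import bisect_left


def at_risk_counts(durations, ticks):
    """Count subjects still under observation at each tick.

    Assumed convention: a subject is at risk at time t if duration >= t.
    """
    ordered = sorted(durations)
    n = len(ordered)
    # bisect_left gives the number of durations strictly below the tick,
    # so the remainder is the number with duration >= tick.
    return [n - bisect_left(ordered, t) for t in ticks]


# Made-up durations (in years) and the tick positions to label.
durations = [0.5, 1.2, 2.0, 2.0, 3.7, 5.1]
ticks = [0, 1, 2, 3, 4, 5]
print(at_risk_counts(durations, ticks))  # [6, 5, 4, 2, 1, 1]
```

The resulting list has one entry per tick, so it can be used as the label text at exactly those x positions.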

vkav commented 5 years ago

Thanks for the feedback.

Playing with the function itself was my next step, but I thought I'd check first whether there was an obvious solution. Agreed, tight_layout may be contributing; the sns.despine call I use might also be at fault.

Will update if I get it to work, thanks again!

vkav commented 5 years ago

An update:

By removing plt.tight_layout() and adjusting the values in ax2_ypos = -0.15 * 6.0 / fig.get_figheight(), I could get it to work. Some of the ytick labels in the inner subplots are still missing, but I think I've exhausted my matplotlib skills. One could use sharey=True to get around that, though.

Cheers
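
A rough sketch of why the constants in the ax2_ypos expression may need changing in a grid. It assumes the value is interpreted as a fraction of the axes height (an assumption about how the function positions the at-risk row, not something confirmed in this thread). The helper names and the axes-height fractions (0.77 for a single default axes, about 0.12 for one row of a 5-row grid) are illustrative only.

```python
def at_risk_offset(fig_height_in, scale=0.15, ref_height_in=6.0):
    """Same arithmetic as: -0.15 * 6.0 / fig.get_figheight()."""
    return -scale * ref_height_in / fig_height_in


def offset_in_inches(fig_height_in, axes_height_frac, scale=0.15):
    """Physical offset, assuming the value is a fraction of the axes height."""
    axes_height_in = axes_height_frac * fig_height_in
    return at_risk_offset(fig_height_in, scale) * axes_height_in


# Single axes filling most of a 6-inch-tall figure.
print(round(offset_in_inches(6.0, 0.77), 3))  # -0.693
# One row of a 5-row grid in the same figure: far less room below the axis.
print(round(offset_in_inches(6.0, 0.12), 3))  # -0.108
```

Under that assumption the expression compensates for figure height but not for how short each axes is, which would be consistent with the at-risk row landing on top of the xlabel in a 5x3 grid. Increasing the 0.15 factor in a local copy is one way to push it further down.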