trevismd / statannotations

add statistical significance annotations on seaborn plots. Further development of statannot, with bugfixes, new features, and a different API.

Annotating boxplot with broken y axis #47

Open ericabello opened 2 years ago

ericabello commented 2 years ago

Hello, thanks for the package, it's great! I am trying to annotate a boxplot whose y axis is broken into two parts, but I cannot see the annotations for the data in the bottom part (the ax2 subplot). Here is the code:

from statannotations.Annotator import Annotator

df_samples_media = df_samples.loc[(df_samples["Sample"].str.contains("LPS"))]
assays_list = (df_samples_media["Assay"].unique())
pairs2 = [[("CCL8", "WT"), ("CCL8", "HOM")], [("CCL13", "WT"), ("CCL13", "HOM")]]
pairs = [[("CCL2", "WT"), ("CCL2", "HOM")]]
hue_parameters = {
    'data': df_samples_media,
    'x': 'Assay',
    'y': 'Conc',
    'hue': 'PTK2B_genotype',
}

fig, (ax, ax2) = plt.subplots(ncols=1, nrows=2, sharex=True)

ax = sns.boxplot(**hue_parameters, ax=ax)
annotator = Annotator(ax, pairs, **hue_parameters)
annotator.configure(test='t-test_ind', text_format='star', loc="inside")
annotator.apply_and_annotate()

ax2 = sns.boxplot(**hue_parameters, ax=ax2)
annotator2 = Annotator(ax2, pairs2, **hue_parameters)
annotator2.configure(test='t-test_ind', text_format='star', loc="inside")
annotator2.apply_and_annotate()

ax.set_ylim(500000, 2000000)  # outliers only
ax2.set_ylim(0, 90000)

ax2.get_legend().remove()
ax.spines['bottom'].set_visible(False)
ax2.spines['top'].set_visible(False)

ax.xaxis.tick_top()
ax.tick_params(labeltop=False)  # don't put tick labels at the top
ax2.xaxis.tick_bottom()

ax.set_ylabel("")
ax.set_xlabel("")
ax2.set_ylabel("")

# adjust distance between subplots
fig.subplots_adjust(hspace=0.5)
fig.text(0.07, 0.5, 'Concentration (pg/ml)', va='center', rotation='vertical')

Do you know how I could do that? Is there a way to control the distance of the annotations to the boxplots?

Thanks a lot,
Erica

[Image: plate3_LPS_stats]

trevismd commented 2 years ago

Hello Erica, thanks for giving it a shot and going the extra mile to make this work ;) Thank you also because your question reveals an improvement to make to statannotations.

What you want is almost entirely possible now. Here are the steps involved.

So you'd get

import matplotlib.pyplot as plt
import pandas as pd
from statannotations.Annotator import Annotator
import seaborn as sns

df = pd.DataFrame([
    {'PTK2B_genotype': 'WT', 'Assay': 'CCL8', 'Conc': 75_000},
    {'PTK2B_genotype': 'HOM', 'Assay': 'CCL8', 'Conc': 59_000},
    {'PTK2B_genotype': 'WT', 'Assay': 'CCL2', 'Conc': 1_450_000},
    {'PTK2B_genotype': 'HOM', 'Assay': 'CCL2', 'Conc': 900_000},
    {'PTK2B_genotype': 'WT', 'Assay': 'CCL13', 'Conc': 10_000},
    {'PTK2B_genotype': 'HOM', 'Assay': 'CCL13', 'Conc': 11_000},
    {'PTK2B_genotype': 'WT', 'Assay': 'CCL8', 'Conc': 81_000},
    {'PTK2B_genotype': 'HOM', 'Assay': 'CCL8', 'Conc': 41_000},
    {'PTK2B_genotype': 'WT', 'Assay': 'CCL2', 'Conc': 1_560_000},
    {'PTK2B_genotype': 'HOM', 'Assay': 'CCL2', 'Conc': 800_000},
    {'PTK2B_genotype': 'WT', 'Assay': 'CCL13', 'Conc': 12_000},
    {'PTK2B_genotype': 'HOM', 'Assay': 'CCL13', 'Conc': 11_000},
    {'PTK2B_genotype': 'WT', 'Assay': 'CCL8', 'Conc': 66_000},
    {'PTK2B_genotype': 'HOM', 'Assay': 'CCL8', 'Conc': 63_000},
    {'PTK2B_genotype': 'WT', 'Assay': 'CCL2', 'Conc': 1_300_000},
    {'PTK2B_genotype': 'HOM', 'Assay': 'CCL2', 'Conc': 920_000},
    {'PTK2B_genotype': 'WT', 'Assay': 'CCL13', 'Conc': 12_000},
    {'PTK2B_genotype': 'HOM', 'Assay': 'CCL13', 'Conc': 12_500},
    {'PTK2B_genotype': 'WT', 'Assay': 'CCL8', 'Conc': 64_000},
    {'PTK2B_genotype': 'HOM', 'Assay': 'CCL8', 'Conc': 57_000},
    {'PTK2B_genotype': 'WT', 'Assay': 'CCL2', 'Conc': 1_400_000},
    {'PTK2B_genotype': 'HOM', 'Assay': 'CCL2', 'Conc': 750_000},
    {'PTK2B_genotype': 'WT', 'Assay': 'CCL13', 'Conc': 13_000},
    {'PTK2B_genotype': 'HOM', 'Assay': 'CCL13', 'Conc': 13_000},
])

assays = pd.CategoricalDtype(categories=['CCL8', 'CCL2', 'CCL13'],
                             ordered=True)
df.Assay = df.Assay.astype(assays)

df1 = df.loc[df.Assay == 'CCL2', :]
df2 = df.loc[df.Assay.isin(['CCL8', 'CCL13']), :]

pairs1 = [[("CCL2", "WT"), ("CCL2", "HOM")]]
pairs2 = [[("CCL8", "WT"), ("CCL8", "HOM")], [("CCL13", "WT"), ("CCL13", "HOM")]]

hue_parameters = {
    'x': 'Assay',
    'y': 'Conc',
    'hue': 'PTK2B_genotype',
}

hue_parameters1 = {**hue_parameters, 'data': df1}
hue_parameters2 = {**hue_parameters, 'data': df2}

fig, (ax1, ax2) = plt.subplots(ncols=1, nrows=2, sharex=True)

ax1 = sns.boxplot(ax=ax1, **hue_parameters1)
ax2 = sns.boxplot(ax=ax2, **hue_parameters2)

ax1.set_ylim(500000, 2000000)  # outliers only
ax2.set_ylim(0, 90000)

ax2.get_legend().remove()
ax1.spines['bottom'].set_visible(False)
ax2.spines['top'].set_visible(False)

ax1.xaxis.tick_top()

ax1.tick_params(labeltop=False)  # don't put tick labels at the top
ax2.xaxis.tick_bottom()

ax1.set_ylabel("")
ax1.set_xlabel("")
ax2.set_ylabel("")

# adjust distance between subplots
fig.subplots_adjust(hspace=0.5)
fig.text(0.01, 0.5, 'Concentration (pg/ml)', va='center', rotation='vertical')

annotator = Annotator(ax=ax1, pairs=pairs1, **hue_parameters1)
annotator.configure(test='t-test_ind', text_format='star', loc="inside")
annotator.apply_and_annotate()

annotator2 = Annotator(ax=ax2, pairs=pairs2, **hue_parameters2)
annotator2.configure(test='t-test_ind', text_format='star', loc="inside")
annotator2.apply_and_annotate()

fig.show()

for a graph like this image

There are parameters to change the annotations y offset, but I don't think it was the first thing to try here. However, you can look into them if you'd like more/less spacing now.
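As a rough sketch of what adjusting those offsets might look like: the keyword names below (`line_offset_to_group`, `line_offset`, `line_height`, `text_offset`) are assumed to be accepted by `Annotator.configure` and should be checked against the installed statannotations version. The values are arbitrary illustrations, not recommended settings.

```python
# Assumed keyword names for Annotator.configure(); verify them against
# your installed statannotations version before relying on them.
offset_options = {
    "line_offset_to_group": 0.2,  # assumed: gap between a group and its first bracket
    "line_offset": 0.1,           # assumed: gap between stacked brackets
    "line_height": 0.03,          # assumed: height of the bracket ends
    "text_offset": 2,             # assumed: gap between the bracket and its label
}

# Options already used in the example above.
base_options = {"test": "t-test_ind", "text_format": "star", "loc": "inside"}

# Merge both dicts into one set of keyword arguments, intended for
# annotator.configure(**configure_kwargs)
configure_kwargs = {**base_options, **offset_options}
print(configure_kwargs)
```

Keeping the offsets in their own dict makes it easy to reuse the same spacing for both subplots, or to drop it again.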

Hope that helps!

trevismd commented 2 years ago

I'll close this then, @ericabello, but don't hesitate to come back if something is still unclear.

Also, we haven't got examples of such plot design in the gallery, so please reach out if you'd like to contribute one of your graphs to the package documentation ☺

ericabello commented 2 years ago

Hello Florian, I am so sorry, but I haven't had time to have another look at this. I am struggling to make your example work with my df, but I need to have another look to make sure. Can I be in touch about it later today or tomorrow, please? Thanks for implementing the changes, and I'd be happy to provide my graph for you to include in your docs (when I work it out).

All the best,
Erica


ericabello commented 2 years ago

Hello Florian, I finally got around to looking at this. I've implemented your approach of splitting the df into 2 and it works well with the annotations, thanks! I only get this warning, which I am not sure I should worry about:

/Users/eb19/.pyenv/versions/ptk2b/lib/python3.8/site-packages/statannotations/_Plotter.py:270: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
  hue_mask = self.plotter.plot_hues[index] == hue_level

here is my code:

top_assays = df_samples_media.loc[df_samples_media["Conc"] > 200000]
bottom_assays = df_samples_media.loc[df_samples_media["Conc"] < 20000]

assays_list_top = (top_assays["Assay"].unique())
assays_list_bottom = (bottom_assays["Assay"].unique())

pairs_top = [[(ccl, "WT"), (ccl, "HOM")] for ccl in assays_list_top]
pairs_bottom = [[(ccl, "WT"), (ccl, "HOM")] for ccl in assays_list_bottom]

hue_parameters = {
    'x': 'Assay',
    'y': 'Conc',
    'hue': 'PTK2B_genotype',
}
hue_parameters_top = {**hue_parameters, 'data': top_assays}
hue_parameters_bottom = {**hue_parameters, 'data': bottom_assays}

fig, (ax1, ax2) = plt.subplots(ncols=1, nrows=2, sharex=True)

ax1 = sns.boxplot(ax=ax1, **hue_parameters_top)
ax2 = sns.boxplot(ax=ax2, **hue_parameters_bottom)

ax1.set_ylim(200000, 1000000)  # outliers only
ax2.set_ylim(0, 20000)

ax2.get_legend().remove()
ax1.spines['bottom'].set_visible(False)
ax2.spines['top'].set_visible(False)

ax1.xaxis.tick_top()

ax1.tick_params(labeltop=False)  # don't put tick labels at the top
ax2.xaxis.tick_bottom()

ax1.set_ylabel("")
ax1.set_xlabel("")
ax2.set_ylabel("")

from matplotlib.ticker import FuncFormatter

def scientific(x, pos):
    # x: tick value
    # pos: tick position
    return '%.2e' % x

scientific_formatter = FuncFormatter(scientific)
ax2.yaxis.set_major_formatter(scientific_formatter)
ax1.yaxis.set_major_formatter(scientific_formatter)

# adjust distance between subplots
fig.subplots_adjust(hspace=0.2)
fig.text(0.05, 0.5, 'Concentration (pg/ml)', va='center', rotation='vertical')

annotator = Annotator(ax=ax1, pairs=pairs_top, **hue_parameters_top)
annotator.configure(test='t-test_ind', text_format='star', loc="inside")
annotator.apply_and_annotate()

annotator2 = Annotator(ax=ax2, pairs=pairs_bottom, **hue_parameters_bottom)
annotator2.configure(test='t-test_ind', text_format='star', loc="inside")
annotator2.apply_and_annotate()
plt.show()

Attached is my graph. Please feel free to include it in your tutorial if you think it's useful. Thanks again for the help, the package is great!

Erica

[Image: plate3_IFNG_stats_broken_axis]
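The pair lists in this thread all follow the same pattern: one WT-versus-HOM comparison per assay. A minimal sketch of a helper that builds them is shown here; `build_pairs` is a hypothetical name used only for illustration and is not part of statannotations.

```python
def build_pairs(categories, hue_levels=("WT", "HOM")):
    """Return one [(category, hue_a), (category, hue_b)] pair per category.

    Hypothetical helper for illustration; not part of statannotations.
    """
    first, second = hue_levels
    return [[(category, first), (category, second)] for category in categories]


# Assay names taken from the examples in this thread.
pairs_top = build_pairs(["CCL2"])
pairs_bottom = build_pairs(["CCL8", "CCL13"])
print(pairs_top)
print(pairs_bottom)
```

Each list can then be passed as `pairs=` to the `Annotator` of the matching subplot, as in the code above.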

Reopened #47.

trevismd commented 2 years ago

Hello Erica @ericabello,

Thanks for following up. I'm glad you made it successfully! You can safely ignore this warning (see here for details on it and how to silence it if necessary); I may also adapt the code to remove it at some point.
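If the warning gets noisy, one possible way to silence only that message is Python's standard `warnings` module. The sketch below is an assumption about how one might do it, not the package's recommended approach; `run_quietly` and `fake_annotate` are hypothetical names, with `fake_annotate` standing in for a call such as `annotator.apply_and_annotate()`.

```python
import warnings

# Message prefix copied from the FutureWarning quoted earlier in this thread.
NOISY_MESSAGE = "elementwise comparison failed"


def run_quietly(func, *args, **kwargs):
    """Call func while ignoring only the FutureWarning quoted above."""
    with warnings.catch_warnings():
        # The filter applies inside this block only; previous filters are
        # restored when the block exits.
        warnings.filterwarnings(
            "ignore", message=NOISY_MESSAGE, category=FutureWarning
        )
        return func(*args, **kwargs)


def fake_annotate():
    # Hypothetical stand-in for annotator.apply_and_annotate()
    warnings.warn(
        "elementwise comparison failed; returning scalar instead",
        FutureWarning,
    )
    return "annotated"


print(run_quietly(fake_annotate))
```

Matching on the message and the category keeps other warnings visible, which is usually safer than ignoring every `FutureWarning`.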

As for the final result, the image didn't come through via email. If you're still up for sharing it, could you please upload it to the GitHub issue directly?

ericabello commented 2 years ago

Done! Erica