trevismd / statannotations

add statistical significance annotations on seaborn plots. Further development of statannot, with bugfixes, new features, and a different API.

Stat layout is not aligned with x ticks for boxplot plotted with pandas DataFrame #41

Open mariesoret opened 2 years ago

mariesoret commented 2 years ago

Hi, I'm using your very helpful package to annotate datasets that I store and plot using pandas DataFrame methods. When I add stat annotations to a DataFrame boxplot, the annotation lines are offset along x. To illustrate the problem, I plotted the data first using the DataFrame method and then using the seaborn library.

import matplotlib.pyplot as plt
import seaborn as sns
from statannotations.Annotator import Annotator

fig, axs = plt.subplots(1,2)

axs[0] = average_annual_adult_go.boxplot(ax=axs[0], grid=False)
annotator0 = Annotator(ax=axs[0], data=average_annual_adult_go, pairs=[("Initial state","Working phase")], order=["Initial state", "Working phase"])
annotator0.configure(test='t-test_welch', text_format='star', loc='outside')
annotator0.apply_and_annotate()

axs[1] = sns.boxplot(data=average_annual_adult_go, ax=axs[1])
annotator1 = Annotator(ax=axs[1], data=average_annual_adult_go, pairs=[("Initial state","Working phase")], order=["Initial state", "Working phase"])
annotator1.configure(test='t-test_welch', text_format='star', loc='outside')
annotator1.apply_and_annotate()

The result:

p-value annotation legend:
ns: p <= 1.00e+00
*: 1.00e-02 < p <= 5.00e-02
**: 1.00e-03 < p <= 1.00e-02
***: 1.00e-04 < p <= 1.00e-03
****: p <= 1.00e-04

Initial state vs. Working phase: Welch's t-test independent samples, P_val:5.684e-272 t=3.485e+02

p-value annotation legend:
ns: p <= 1.00e+00
*: 1.00e-02 < p <= 5.00e-02
**: 1.00e-03 < p <= 1.00e-02
***: 1.00e-04 < p <= 1.00e-03
****: p <= 1.00e-04

Initial state vs. Working phase: Welch's t-test independent samples, P_val:5.684e-272 t=3.485e+02

[Image: statannotations_openissues]

trevismd commented 2 years ago

Hi, thanks for stopping by! statannotations is built to work with seaborn, so I didn't expect it to "almost" work with pandas out of the box. The offset could also simply be because you chose only the first two groups; I don't know. This is an interesting idea: it might be possible to create a Plotter instance to cover pd.DataFrame.boxplot. It's worth investigating if people are interested.

mariesoret commented 2 years ago

Hi, thanks for your answer!

Just for info, in case you follow up: I tried all possible combinations of datasets, ("initial state", "working phase"), ("initial state", "offset phase") and ("working phase", "offset phase"), and observed the same x offset of one tick to the left each time.
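
A possible explanation for the one-tick shift, offered as a sketch rather than a confirmed diagnosis: matplotlib's boxplot, which pandas uses under the hood, places boxes at x = 1..N by default, whereas seaborn places categories at x = 0..N-1. If the annotations are computed for seaborn-style positions (an assumption, based on the package being built for seaborn), they would land one tick to the left on a pandas axis. The snippet below uses made-up data in place of average_annual_adult_go and only checks the tick positions; passing positions to DataFrame.boxplot is an untested workaround idea as far as statannotations itself is concerned.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up data standing in for average_annual_adult_go (assumption).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Initial state": rng.normal(10.0, 1.0, 50),
    "Working phase": rng.normal(12.0, 1.0, 50),
    "Offset phase": rng.normal(11.0, 1.0, 50),
})

fig, axs = plt.subplots(1, 2)

# Default pandas boxplot: matplotlib puts the boxes at x = 1, 2, 3.
df.boxplot(ax=axs[0], grid=False)
default_ticks = [float(t) for t in axs[0].get_xticks()]

# Possible workaround: request 0-based positions, the layout seaborn uses.
# The extra keyword is forwarded by pandas to matplotlib's Axes.boxplot.
df.boxplot(ax=axs[1], grid=False, positions=list(range(len(df.columns))))
shifted_ticks = [float(t) for t in axs[1].get_xticks()]

print(default_ticks)  # [1.0, 2.0, 3.0]
print(shifted_ticks)  # [0.0, 1.0, 2.0]
```

With the boxes moved to 0-based positions, an Annotator attached to the second axis may line up with the ticks, but that part has not been verified here.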