ACCLAB / DABEST-python

Data Analysis with Bootstrapped ESTimation
https://acclab.github.io/DABEST-python/
Apache License 2.0

Is there a way to change the panel size of one of the group? #97

Closed yngmiin closed 4 years ago

yngmiin commented 4 years ago

Hi,

Thank you for the excellent package. I am analysing survey data with many data points loaded onto one response value. Is there a way to enlarge the panel of group "A" in the attached swarm plot so that all the data points are shown accurately?

Here is some example data. I am new to dabest and Python, so I apologize if this question has been asked before.

import numpy as np
import pandas as pd
import dabest

# Create an example dataframe
x = np.arange(1, 6)
xd = np.repeat(x, [70, 80, 100, 300, 150])
f = np.array(["A", "B", "C"], dtype=str)  # np.str was removed in newer NumPy; plain str is equivalent
fd = np.repeat(f, [510, 80, 110])

# Seed NumPy's generator; random.seed() does not affect np.random.shuffle
np.random.seed(123)
np.random.shuffle(fd)

xf = np.vstack((xd, fd))
eg = pd.DataFrame(data=xf, index=["Resp", "Type"])
eg = eg.T
eg['Resp'] = eg['Resp'].astype('float')

egdf = dabest.load(eg, idx=("A", "B", "C"), x="Type", y="Resp")
egdfclf = egdf.cliffs_delta
egplt = egdf.cliffs_delta.plot(raw_marker_size=2, fig_size=[12, 6])
egplt

josesho commented 4 years ago

Hi @yngmiin ,

The central problem arises for two reasons:

  1. Your data is ordinal. (You are correct in using Cliff's delta as the effect size because of this.)
  2. You have a fairly large N in group A.
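For readers unfamiliar with Cliff's delta, here is a minimal pure-Python sketch of the usual definition: the proportion of pairs in which a value from one group exceeds a value from the other, minus the proportion in which it is smaller. The function name and the sample responses are made up for illustration, and dabest's own implementation and sign convention (which group counts as the control) may differ.

```python
from itertools import product


def cliffs_delta(x, y):
    """Return P(x > y) - P(x < y) over all pairs (one value from x, one from y)."""
    if not x or not y:
        raise ValueError("both groups must be non-empty")
    greater = sum(1 for a, b in product(x, y) if a > b)
    less = sum(1 for a, b in product(x, y) if a < b)
    return (greater - less) / (len(x) * len(y))


# Hypothetical 1-5 survey responses, for illustration only.
group_a = [4, 4, 5, 3, 4, 2]
group_b = [2, 3, 1, 4, 2]
print(cliffs_delta(group_a, group_b))
```

Because only the ordering of values matters, the statistic suits ordinal responses such as survey scales. It ranges from -1 to 1, with 0 meaning neither group tends to have larger values.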

It is difficult to optimize a suitable swarmplot by playing around with raw_marker_size and fig_size, as your example implies.
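To see why the swarmplot struggles, it can help to count how many points land on each response value within each group. The sketch below rebuilds data with the same group sizes and response counts as the example, using only the standard library. The seed and the variable names are arbitrary choices, so the exact counts will not match the original shuffle.

```python
import random
from collections import Counter

# Same response counts and group sizes as the example data.
responses = [r for r, n in zip(range(1, 6), [70, 80, 100, 300, 150]) for _ in range(n)]
groups = [g for g, n in zip("ABC", [510, 80, 110]) for _ in range(n)]

rng = random.Random(123)  # arbitrary seed, for reproducibility only
rng.shuffle(groups)

# Count how many points fall on each (group, response) cell.
cell_counts = Counter(zip(groups, responses))

for (group, resp), n in sorted(cell_counts.items()):
    print(f"group {group}, response {resp}: {n} points")

largest_cell = max(cell_counts.items(), key=lambda kv: kv[1])
print("largest cell:", largest_cell)
```

A swarmplot has to lay out every point in a cell side by side at the same height, so the largest cell determines how wide that group's panel needs to be.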

I would suggest replacing the top swarmplot panel with either a boxenplot or a violinplot.

import seaborn as sns 
import matplotlib.pyplot as plt

# Creates a plot with 3 axes.
# The last axes is 4X the height of each of the others (height ratios 0.5 : 0.5 : 2), as the full estimation plot goes there.
# See more here:
# https://matplotlib.org/3.2.1/api/_as_gen/matplotlib.gridspec.GridSpec.html#matplotlib.gridspec.GridSpec
f, ax = plt.subplots(nrows=3, figsize=[6, 16],
                    gridspec_kw = {'height_ratios': [0.5, 0.5, 2]})

sns.boxenplot(data=eg, x="Type", y="Resp", ax=ax[0])

sns.violinplot(data=eg, x="Type", y="Resp", ax=ax[1])

egdfclf.plot(raw_marker_size=1, ax=ax[2]);

[Screenshot: the resulting figure, with the boxenplot, violinplot, and estimation plot stacked vertically]

The final step of combining either the boxenplot or the violinplot with the effect size axes, sadly, has to be done in a vector graphics editing tool. Hope this helped!

yngmiin commented 4 years ago

Hi Joses,

Thank you for your valuable suggestion. Have a nice day.

Regards, Yng Miin