ACCLAB / DABEST-python

Data Analysis with Bootstrapped ESTimation
https://acclab.github.io/DABEST-python/
Apache License 2.0

Warning: Not all points displayed... #122

Closed brobr closed 3 years ago

brobr commented 3 years ago

Hi, thanks for dabest, looks very interesting, especially with the aim to show all data points. Running the example on a linux system with pandas-1.2.4, numpy-1.20.3, seaborn-0.11.1 and matplotlib-3.4.2 displayed the expected figure but with a couple of warnings:

Jupyter QtConsole 5.1.0
Python 3.9.6 (default, Jun 28 2021, 11:30:47)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.22.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import pandas as pd

In [2]: import dabest

In [3]: iris = pd.read_csv("https://github.com/mwaskom/seaborn-data/raw/master/iris.csv")

In [4]: iris_dabest = dabest.load(data=iris, x="species", y="petal_width", idx=("setosa", "versicolor", "virginica"))

In [5]: iris_dabest.mean_diff.plot();
/usr/lib64/python3.9/site-packages/seaborn/categorical.py:1296: UserWarning: 38.0% of the points cannot be placed; you may want to decrease the size of the markers or use stripplot.
  warnings.warn(msg, UserWarning)
/usr/lib64/python3.9/site-packages/seaborn/categorical.py:1296: UserWarning: 6.0% of the points cannot be placed; you may want to decrease the size of the markers or use stripplot.
  warnings.warn(msg, UserWarning)
/usr/lib64/python3.9/site-packages/IPython/core/pylabtools.py:132: UserWarning: This figure includes Axes that are not compatible with tight_layout, so results might be incorrect.
  fig.canvas.print_figure(bytes_io, **kw)

If all points are supposed to be displayed, these warnings seem quite unwanted.

Is there something in my setup that influences this? If not, how can one correct for it?

Also, when running the example commands outwith a notebook-like context, say in an (i)python3 console, no figure is shown unless you run it as iris_dabest.mean_diff.plot().show(). Maybe something to add/mention in the README?

hth

Rob
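Editorial note: the swarm messages in the session above are ordinary Python UserWarnings, so a script that must show every point can escalate them to exceptions instead of letting them scroll past. The sketch below uses only the standard library. `fake_swarm_plot` is a hypothetical stand-in that reuses the message text from the log, and its size threshold is arbitrary; neither is part of seaborn or dabest.

```python
import warnings


def fake_swarm_plot(marker_size):
    """Hypothetical stand-in for a plotting call.

    Emits the same kind of UserWarning as in the log above when the
    markers are "too large" (the threshold here is arbitrary).
    """
    if marker_size > 2:
        warnings.warn(
            "38.0% of the points cannot be placed; you may want to decrease "
            "the size of the markers or use stripplot.",
            UserWarning,
        )
    return "figure"


def plot_strict(plot_func, *args, **kwargs):
    """Call plot_func, raising if a 'cannot be placed' UserWarning is issued."""
    with warnings.catch_warnings():
        # The message pattern is a regex matched from the start of the text.
        warnings.filterwarnings(
            "error", message=r".*cannot be placed", category=UserWarning
        )
        return plot_func(*args, **kwargs)
```

With a real plot, the same wrapper could presumably be called as `plot_strict(iris_dabest.mean_diff.plot, raw_marker_size=3)`, which would raise rather than silently drop points.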

josesho commented 3 years ago

Hi @brobr,

This is a seaborn issue with how the swarm plot is generated; as the warning indicates, if your Ns are large, you should reduce the size of the points with:

your_dabest_object.mean_diff.plot(raw_marker_size=3);

Re: non-notebook contexts, we will consider adding a short note in the documentation.

Best, Joses

brobr commented 3 years ago

Fab, thanks for the pointer, Joses; I had no clue that plot() would take arguments (I need to update my seaborn knowledge and should have read the dabest tutorial to the end). How serious is the tight_layout axes warning? It can be bypassed by:

iris_dabest.mean_diff.plot(raw_marker_size=2).set_tight_layout(False);

Note that raw_marker_size had to be reduced further (i.e. set to 2) to actually fit the whole of the second, long row of setosa data points. Is it seaborn, matplotlib or dabest that keeps these points on one line rather than dividing them over, say, two rows? Can one fit this better by making the figure wider (but where/how)?

Cheers,

Rob
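Editorial note on the last question: one possible way to widen the figure is the fig_size keyword of plot(). This is a hedged sketch, assuming the installed dabest version accepts fig_size alongside raw_marker_size (worth checking against the dabest plotting documentation). The function name `plot_iris_mean_diff` and the default size (8, 5) are illustrative assumptions, not values from the thread.

```python
def plot_iris_mean_diff(raw_marker_size=2, fig_size=(8, 5), show=False):
    """Re-run the iris example from this thread with smaller markers and a wider figure.

    raw_marker_size=2 is the value reported to work above; fig_size=(8, 5)
    is an arbitrary assumption, to be tuned by eye.
    """
    # Imported here so that defining the function does not require the
    # plotting stack; calling it needs pandas, dabest and network access.
    import pandas as pd
    import dabest

    iris = pd.read_csv(
        "https://github.com/mwaskom/seaborn-data/raw/master/iris.csv"
    )
    iris_dabest = dabest.load(
        data=iris,
        x="species",
        y="petal_width",
        idx=("setosa", "versicolor", "virginica"),
    )
    # Assumption: plot() accepts fig_size as a (width, height) tuple in inches.
    fig = iris_dabest.mean_diff.plot(
        raw_marker_size=raw_marker_size, fig_size=fig_size
    )
    if show:
        # Outside a notebook, the thread reports that .show() is needed.
        fig.show()
    return fig
```

A wider figure gives each group more horizontal room, so it may allow a larger marker size before seaborn starts warning; whether it does for a given dataset has to be checked by running the plot.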