ACCLAB / DABEST-python

Data Analysis with Bootstrapped ESTimation
https://acclab.github.io/DABEST-python/
Apache License 2.0

v2024.03.29 not working with listed dependencies and README basic usage example #180

Closed: mlotinga closed this issue 1 month ago

mlotinga commented 1 month ago

Describe the bug

The current release, version 2024.03.29, raises multiple warnings and errors when running the basic usage example from the README in an environment installed with the dependencies listed in https://github.com/ACCLAB/DABEST-python/blob/master/nbs/01-getting_started.ipynb

Perhaps that dependencies list is out of date?

To Reproduce

  1. Create a venv environment with: python==3.10.0 numpy==1.23.5 scipy==1.9.3 matplotlib==3.6.3 pandas==1.5.0 seaborn==0.12.2 lqrt==0.3.3 dabest jupyter jupyterlab

  2. Run

import pandas as pd
import dabest

# Load the iris dataset. This step requires internet access.
iris = pd.read_csv("https://github.com/mwaskom/seaborn-data/raw/master/iris.csv")

# Load the above data into `dabest`.
iris_dabest = dabest.load(data=iris, x="species", y="petal_width",
                          idx=("setosa", "versicolor", "virginica"))

# Produce a Cumming estimation plot.
iris_dabest.mean_diff.plot()

raises the following warnings:

\lib\site-packages\dabest\_dabest_object.py:668: FutureWarning: In a future version, `df.iloc[:, i] = newvals` will attempt to set the values inplace instead of always setting a new array. To retain the old behavior, use either `df[df.columns[i]] = newvals` or, if columns are non-unique, `df.isetitem(i, newvals)`
  plot_data.loc[:, self.__xvar] = pd.Categorical(
\lib\site-packages\dabest\plot_tools.py:1232: UserWarning: 68.0% of the points cannot be placed. You might want to decrease the size of the markers.
  warnings.warn(err)
\lib\site-packages\dabest\plot_tools.py:1232: UserWarning: 50.0% of the points cannot be placed. You might want to decrease the size of the markers.
  warnings.warn(err)
\lib\site-packages\dabest\plot_tools.py:1232: UserWarning: 36.0% of the points cannot be placed. You might want to decrease the size of the markers.
  warnings.warn(err)

and produces a truncated data plot.

Expected behavior

Output visualisation shown in https://github.com/ACCLAB/DABEST-python/blob/master/iris.png

[Image: expected output plot]

Screenshots

[Image: screenshot of the truncated plot]

Your package version (please complete the following information):
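
A minimal sketch for filling in this section: it compares the pins quoted in step 1 against what is actually installed in the active environment, using only the standard library's `importlib.metadata`. The helper names (`parse_pins`, `installed_version`, `report`) are illustrative and not part of dabest.

```python
from __future__ import annotations

from importlib.metadata import PackageNotFoundError, version

# Pins copied from step 1 of this report.
PINS = (
    "numpy==1.23.5 scipy==1.9.3 matplotlib==3.6.3 "
    "pandas==1.5.0 seaborn==0.12.2 lqrt==0.3.3"
)


def parse_pins(spec: str) -> dict[str, str]:
    """Split 'name==version' tokens into a {name: version} mapping."""
    pins = {}
    for token in spec.split():
        name, sep, pinned = token.partition("==")
        if sep:  # skip unpinned names such as 'dabest' or 'jupyter'
            pins[name] = pinned
    return pins


def installed_version(name: str) -> str | None:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def report(pins: dict[str, str]) -> list[str]:
    """Build one line per package comparing the pin with what is installed."""
    lines = []
    for name, pinned in pins.items():
        found = installed_version(name)
        status = "ok" if found == pinned else "MISMATCH"
        lines.append(f"{name}: pinned {pinned}, installed {found} [{status}]")
    return lines


if __name__ == "__main__":
    print("\n".join(report(parse_pins(PINS))))
```

Running this inside both the venv and the conda environment gives two short reports that can be pasted into the issue side by side.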

Additional context

pip list

Package                           Version
--------------------------------- --------------
anyio                             4.4.0
argon2-cffi                       23.1.0
argon2-cffi-bindings              21.2.0
arrow                             1.3.0
asttokens                         2.4.1
async-lru                         2.0.4
attrs                             23.2.0
Babel                             2.15.0
beautifulsoup4                    4.12.3
bleach                            6.1.0
certifi                           2024.6.2
cffi                              1.16.0
charset-normalizer                3.3.2
colorama                          0.4.6
comm                              0.2.2
contourpy                         1.2.1
cycler                            0.12.1
dabest                            2024.3.29
DateTime                          5.5
debugpy                           1.8.1
decorator                         5.1.1
defusedxml                        0.7.1
exceptiongroup                    1.2.1
executing                         2.0.1
fastcore                          1.5.44
fastjsonschema                    2.19.1
fonttools                         4.53.0
fqdn                              1.5.1
h11                               0.14.0
httpcore                          1.0.5
httpx                             0.27.0
idna                              3.7
ipykernel                         6.29.4
ipython                           8.25.0
ipywidgets                        8.1.3
isoduration                       20.11.0
jedi                              0.19.1
Jinja2                            3.1.4
json5                             0.9.25
jsonpointer                       2.4
jsonschema                        4.22.0
jsonschema-specifications         2023.12.1
jupyter                           1.0.0
jupyter_client                    8.6.2
jupyter-console                   6.6.3
jupyter_core                      5.7.2
jupyter-events                    0.10.0
jupyter-lsp                       2.2.5
jupyter_server                    2.14.1
jupyter_server_terminals          0.5.3
jupyterlab                        4.2.1
jupyterlab_pygments               0.3.0
jupyterlab_server                 2.27.2
jupyterlab_widgets                3.0.11
kiwisolver                        1.4.5
lckr_jupyterlab_variableinspector 3.2.1
lqrt                              0.3.3
MarkupSafe                        2.1.5
matplotlib                        3.6.3
matplotlib-inline                 0.1.7
mistune                           3.0.2
nbclient                          0.10.0
nbconvert                         7.16.4
nbformat                          5.10.4
nest-asyncio                      1.6.0
notebook                          7.2.1
notebook_shim                     0.2.4
numpy                             1.23.5
overrides                         7.7.0
packaging                         24.0
pandas                            1.5.0
pandocfilters                     1.5.1
parso                             0.8.4
patsy                             0.5.6
pillow                            10.3.0
pip                               21.2.3
platformdirs                      4.2.2
prometheus_client                 0.20.0
prompt_toolkit                    3.0.46
psutil                            5.9.8
pure-eval                         0.2.2
pycparser                         2.22
Pygments                          2.18.0
pyparsing                         3.1.2
python-dateutil                   2.9.0.post0
python-json-logger                2.0.7
pytz                              2024.1
pywin32                           306
pywinpty                          2.0.13
PyYAML                            6.0.1
pyzmq                             26.0.3
qtconsole                         5.5.2
QtPy                              2.4.1
referencing                       0.35.1
requests                          2.32.3
rfc3339-validator                 0.1.4
rfc3986-validator                 0.1.1
rpds-py                           0.18.1
scipy                             1.9.3
seaborn                           0.12.2
Send2Trash                        1.8.3
setuptools                        57.4.0
six                               1.16.0
sniffio                           1.3.1
soupsieve                         2.5
stack-data                        0.6.3
statsmodels                       0.14.2
terminado                         0.18.1
tinycss2                          1.3.0
tomli                             2.0.1
tornado                           6.4.1
traitlets                         5.14.3
types-python-dateutil             2.9.0.20240316
typing_extensions                 4.12.1
uri-template                      1.3.0
urllib3                           2.2.1
wcwidth                           0.2.13
webcolors                         24.6.0
webencodings                      0.5.1
websocket-client                  1.8.0
widgetsnbextension                4.0.11
zope.interface                    6.4.post2

mlotinga commented 1 month ago

Upgrading to pandas==1.5.3 makes no difference: the same warnings are raised.

mlotinga commented 1 month ago

Upgrading to pandas==2.0 changes the behaviour, but raises a new error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[1], line 12
      8 iris_dabest = dabest.load(data=iris, x="species", y="petal_width",
      9                           idx=("setosa", "versicolor", "virginica"))
     11 # Produce a Cumming estimation plot.
---> 12 iris_dabest.mean_diff.plot()

File ~\lib\site-packages\dabest\_effsize_objects.py:1195, in EffectSizeDataFrame.plot(self, color_col, raw_marker_size, es_marker_size, swarm_label, contrast_label, delta2_label, swarm_ylim, contrast_ylim, delta2_ylim, swarm_side, custom_palette, swarm_desat, halfviolin_desat, halfviolin_alpha, face_color, bar_label, bar_desat, bar_width, bar_ylim, ci, ci_type, err_color, float_contrast, show_pairs, show_delta2, show_mini_meta, group_summaries, group_summaries_offset, fig_size, dpi, ax, contrast_show_es, es_sf, es_fontsize, contrast_show_deltas, gridkey_rows, gridkey_merge_pairs, gridkey_show_Ns, gridkey_show_es, swarmplot_kwargs, barplot_kwargs, violinplot_kwargs, slopegraph_kwargs, sankey_kwargs, reflines_kwargs, group_summary_kwargs, legend_kwargs, title, fontsize_title, fontsize_rawxlabel, fontsize_rawylabel, fontsize_contrastxlabel, fontsize_contrastylabel, fontsize_delta2label)
   1192 all_kwargs = locals()
   1193 del all_kwargs["self"]
-> 1195 out = effectsize_df_plotter(self, **all_kwargs)
   1197 return out

File ~\lib\site-packages\dabest\plotter.py:302, in effectsize_df_plotter(effectsize_df, **plot_kwargs)
    300 if custom_pal is None and color_col is None:
    301     swarm_colors = [sns.desaturate(c, swarm_desat) for c in unsat_colors]
--> 302     plot_palette_raw = dict(zip(names.categories, swarm_colors))
    304     bar_color = [sns.desaturate(c, bar_desat) for c in unsat_colors]
    305     plot_palette_bar = dict(zip(names.categories, bar_color))

AttributeError: 'numpy.ndarray' object has no attribute 'categories'

mlotinga commented 1 month ago

Upgrading to numpy==1.26.4, seaborn==0.13.0 and scipy==1.10.0 (required for mutual compatibility) does nothing to address the above numpy.ndarray AttributeError.
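
One possible reading of the AttributeError, sketched with pandas alone: `Series.unique()` returns a plain NumPy array for an object-dtype column but a `Categorical` (which has `.categories`) for a categorical column. This assumes that `names` in `plotter.py` is derived from the x column in a similar way, which the traceback does not confirm. The toy `species` data is a stand-in for the iris column, and the whole-column assignment at the end is the form suggested by the FutureWarning quoted earlier in this report.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the iris "species" column (object dtype, as read from CSV).
species = pd.Series(["setosa", "versicolor", "virginica", "setosa"], dtype=object)
order = ["setosa", "versicolor", "virginica"]

# On an object-dtype column, unique() returns a plain NumPy array,
# which has no `.categories` attribute.
plain = species.unique()

# On a categorical column, unique() returns a pandas Categorical,
# which does expose `.categories`.
categorical = species.astype(pd.CategoricalDtype(order, ordered=True))
cats = categorical.unique()

# Whole-column assignment (the form suggested by the FutureWarning)
# replaces the column, so the categorical dtype is kept.
df = pd.DataFrame({"species": species})
df["species"] = pd.Categorical(df["species"], categories=order, ordered=True)

print(type(plain).__name__, hasattr(plain, "categories"))
print(type(cats).__name__, hasattr(cats, "categories"))
print(df["species"].dtype)
```

If the column silently stays object dtype after the in-place `.loc[:, col] = ...` assignment under pandas 2.0, that would explain why `names.categories` fails there and not under pandas 1.5, but this is an inference rather than something verified against the dabest source.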

mlotinga commented 1 month ago

This seems to be specific to environments set up with venv and pip: setting up the environment with conda, using the same quoted dependency versions, works fine.
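
To narrow down what differs between the two environments, a small standard-library sketch can diff two `pip list` outputs (run `pip list` inside each environment and save the text). The function names are hypothetical, and the parser assumes the two-column table layout shown under "Additional context".

```python
from __future__ import annotations

import re


def normalise(name: str) -> str:
    """Normalise a distribution name (case, '-', '_' and '.' are equivalent)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def parse_pip_list(text: str) -> dict[str, str]:
    """Parse the two-column table printed by `pip list` into {name: version}."""
    versions = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, the 'Package Version' header and the dashed rule.
        if len(parts) != 2 or parts[0] == "Package" or set(parts[0]) == {"-"}:
            continue
        versions[normalise(parts[0])] = parts[1]
    return versions


def diff_environments(
    a: dict[str, str], b: dict[str, str]
) -> dict[str, tuple[str | None, str | None]]:
    """Return {name: (version_in_a, version_in_b)} for every package that differs."""
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}
```

Calling `diff_environments(parse_pip_list(venv_text), parse_pip_list(conda_text))` lists only the packages whose versions differ, or that exist in just one environment (shown as `None` on the other side).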

Jacobluke- commented 4 weeks ago

Hi @mlotinga, thanks for raising this issue. The "truncated" plot you generated is the asymmetric swarm plot, a new feature introduced in the latest version of dabest.

We are glad that the problem got resolved.