scverse / pertpy

Perturbation Analysis in the scverse ecosystem.
https://pertpy.readthedocs.io/en/latest/
MIT License

Milo in Pertpy: annotate_nhoods issue #323

Closed MaximilianNuber closed 1 year ago

MaximilianNuber commented 1 year ago

Report

Hi everyone,

I encountered an issue with the function "annotate_nhoods" in pertpy.tl.Milo(). It occurred in my own workflow (which I unfortunately can't post), but I was able to reproduce it with the example from https://www.sc-best-practices.org/conditions/compositional.html#without-labeled-clusters (the section "without labeled clusters"). I copied the workflow from there, and calling "milo.annotate_nhoods(mdata, anno_col="cell_label")" produced the following error.

ValueError                                Traceback (most recent call last)
Cell In[27], line 1
----> 1 milo.annotate_nhoods(mdata, anno_col="cell_label")
      2 # Define as mixed if fraction of cells in nhood with same label is lower than 0.75
      4 mdata["milo"].var.loc[
      5     mdata["milo"].var["nhood_annotation_frac"] < 0.75, "nhood_annotation"
      6 ] = "Mixed"

File ~/.local/lib/python3.10/site-packages/pertpy/tools/_milo.py:400, in Milo.annotate_nhoods(self, mdata, anno_col, feature_key)
    397 anno_count = adata.obsm["nhoods"].T.dot(csr_matrix(anno_dummies.values))
    398 anno_frac = np.array(anno_count / anno_count.sum(1))
--> 400 anno_frac_dataframe = pd.DataFrame(anno_frac, columns=anno_dummies.columns, index=sample_adata.var_names)
    401 sample_adata.varm["frac_annotation"] = anno_frac_dataframe.values
    402 sample_adata.uns["annotation_labels"] = anno_frac_dataframe.columns

File ~/.local/lib/python3.10/site-packages/pandas/core/frame.py:758, in DataFrame.__init__(self, data, index, columns, dtype, copy)
    747     mgr = dict_to_mgr(
    748         # error: Item "ndarray" of "Union[ndarray, Series, Index]" has no
    749         # attribute "name"
   (...)
    755         copy=_copy,
    756     )
    757 else:
--> 758     mgr = ndarray_to_mgr(
    759         data,
    760         index,
    761         columns,
    762         dtype=dtype,
    763         copy=copy,
    764         typ=manager,
    765     )
    767 # For data is list-like, or Iterable (will consume into list)
    768 elif is_list_like(data):

File ~/.local/lib/python3.10/site-packages/pandas/core/internals/construction.py:315, in ndarray_to_mgr(values, index, columns, dtype, copy, typ)
    309     _copy = (
    310         copy_on_sanitize
    311         if (dtype is None or astype_is_view(values.dtype, dtype))
    312         else False
    313     )
    314     values = np.array(values, copy=_copy)
--> 315     values = _ensure_2d(values)
    317 else:
    318     # by definition an array here
    319     # the dtypes will be coerced to a single dtype
    320     values = _prep_ndarraylike(values, copy=copy_on_sanitize)

File ~/.local/lib/python3.10/site-packages/pandas/core/internals/construction.py:570, in _ensure_2d(values)
    568     values = values.reshape((values.shape[0], 1))
    569 elif values.ndim != 2:
--> 570     raise ValueError(f"Must pass 2-d input. shape={values.shape}")
    571 return values

ValueError: Must pass 2-d input. shape=()

From what I can see, this is a dimensionality error related to the cell type column?
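For context, the "shape=()" in the message means that the object handed to pd.DataFrame was a 0-d array. One known way to end up with such an array is to wrap a SciPy sparse matrix in np.array(), which stores the matrix object itself instead of densifying it. The sketch below only illustrates that general NumPy/SciPy behaviour on toy data; whether this is what actually happens inside annotate_nhoods is an assumption and is not confirmed in this thread. The names `counts`, `wrapped`, `dense`, and `frac` are made up for the illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Toy stand-in for per-neighbourhood label counts (3 nhoods x 2 labels).
# This is NOT pertpy data, only an illustration.
counts = csr_matrix(np.array([[2, 0], [1, 1], [0, 4]], dtype=float))

# np.array() does not densify a SciPy sparse matrix; NumPy wraps the
# object in a 0-d object array instead.
wrapped = np.array(counts)
print(wrapped.shape)  # ()

# Densifying explicitly gives a regular 2-d array, from which row-wise
# fractions can be computed.
dense = counts.toarray()
frac = dense / dense.sum(axis=1, keepdims=True)
print(frac.shape)  # (3, 2)
```

A 0-d array like `wrapped` is exactly the kind of input that fails a 2-d check, while `frac` has one row per neighbourhood and one column per label.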

Thank you for any help.

Version information


anndata 0.9.1 matplotlib 3.7.2 mudata 0.2.3 numpy 1.24.3 pandas 2.0.3 pertpy 0.4.0 scanpy 1.9.3 scipy 1.11.1 scvi 1.0.2 seaborn 0.12.2 session_info 1.0.0 tensorflow 2.13.0

PIL 10.0.0 absl NA adjustText 0.8 aiohttp 3.8.4 aiosignal 1.3.1 anyio NA arrow 1.2.3 arviz 0.16.1 asttokens NA astunparse 1.6.3 async_timeout 4.0.2 attr 23.1.0 attrs 23.1.0 babel 2.12.1 backcall 0.2.0 backoff 2.2.1 brotli NA bs4 4.12.2 certifi 2023.05.07 cffi 1.15.1 chardet 5.1.0 charset_normalizer 3.2.0 chex 0.1.7 click 8.1.6 cloudpickle 2.2.1 colorama 0.4.6 comm 0.1.3 contextlib2 NA croniter NA cycler 0.10.0 cython_runtime NA dab0eaeee8bfae79490a0d4f23f5ad820bb199d8 NA dateutil 2.8.2 debugpy 1.6.7 decorator 5.1.1 decoupler 1.4.0 deepdiff 6.3.1 defusedxml 0.7.1 docrep 0.3.2 etils 1.3.0 executing 1.2.0 fastapi 0.100.0 fastjsonschema NA flatbuffers 23.5.26 flax 0.7.0 fqdn NA frozenlist 1.4.0 fsspec 2023.6.0 gast NA google NA h5py 3.9.0 idna 3.4 igraph 0.10.6 importlib_resources NA ipykernel 6.24.0 ipython_genutils 0.2.0 ipywidgets 8.0.7 isoduration NA jax 0.4.13 jaxlib 0.4.13 jaxopt NA jedi 0.18.2 jinja2 3.1.2 joblib 1.3.1 json5 NA jsonpointer 2.4 jsonschema 4.18.4 jsonschema_specifications NA jupyter_events 0.6.3 jupyter_server 2.7.0 jupyterlab_server 2.23.0 keras 2.13.1 kiwisolver 1.4.4 leidenalg 0.10.0 lightning 2.0.5 lightning_cloud NA lightning_utilities 0.9.0 llvmlite 0.40.1 louvain 0.8.0 markupsafe 2.1.3 matplotlib_inline 0.1.6 mizani 0.9.2 ml_collections NA ml_dtypes 0.2.0 mpl_toolkits NA mpmath 1.3.0 msgpack 1.0.5 multidict 6.0.4 multipart 0.0.6 multipledispatch 0.6.0 muon 0.1.5 natsort 8.4.0 nbformat 5.9.1 numba 0.57.1 numpyro 0.12.1 nvfuser NA opt_einsum v3.3.0 optax 0.1.5 ordered_set 4.1.0 ott 0.4.2 overrides NA packaging 23.1 parso 0.8.3 patsy 0.5.3 pexpect 4.8.0 pickleshare 0.7.5 pkg_resources NA platformdirs 3.9.1 plotly 5.15.0 plotnine 0.12.1 ply 3.11 prometheus_client NA prompt_toolkit 3.0.39 psutil 5.9.5 ptyprocess 0.7.0 pure_eval 0.2.2 pycparser 2.21 pydantic 1.10.11 pydev_ipython NA pydevconsole NA pydevd 2.9.5 pydevd_file_utils NA pydevd_plugins NA pydevd_tracing NA pygments 2.15.1 pynndescent 0.5.10 pyomo 6.6.1 pyparsing 3.1.0 pypi_latest 0.1.2 
pyro 1.8.5 pythonjsonlogger NA pytz 2023.3 questionary 1.10.0 referencing NA requests 2.31.0 rfc3339_validator 0.1.4 rfc3986_validator 0.1.1 rich NA rpds NA rpy2 3.5.13 ruamel NA send2trash NA setuptools 68.0.0 six 1.16.0 sklearn 1.3.0 skmisc 0.3.0 sniffio 1.3.0 socks 1.7.1 soupsieve 2.4.1 sparse 0.14.0 sparsecca 0.3.0 stack_data 0.6.2 starlette 0.27.0 statsmodels 0.14.0 switchlang 0.1.0 sympy 1.12 tensorboard 2.13.0 tensorflow_probability 0.20.1 termcolor NA texttable 1.6.7 threadpoolctl 3.2.0 tomli 2.0.1 toolz 0.12.0 torch 2.0.1+cu117 torchmetrics 1.0.1 tornado 6.3.2 tqdm 4.65.0 traitlets 5.9.0 tree 0.1.8 typing_extensions NA tzdata 2023.3 tzlocal NA umap 0.5.3 uri_template NA urllib3 2.0.3 uvicorn 0.23.1 wcwidth 0.2.6 webcolors 1.13 websocket 1.6.1 websockets 11.0.3 wrapt 1.15.0 xarray 2023.7.0 xarray_einstats 0.6.0 yaml 6.0.1 yarl 1.9.2 zmq 25.1.0 zstandard 0.21.0

IPython 8.14.0 jupyter_client 8.3.0 jupyter_core 5.3.1 jupyterlab 4.0.3 notebook 6.5.4

Python 3.10.9 (main, Jan 11 2023, 15:21:40) [GCC 11.2.0]
Linux-5.15.0-67-generic-x86_64-with-glibc2.35

Session information updated at 2023-07-26 10:47

Zethson commented 1 year ago

I think this is already fixed on the development branch. It broke due to a pandas update. Could you try the development branch? We'll make a release soon.

MaximilianNuber commented 1 year ago

It worked on both datasets. Thank you!