theislab / scarches

Reference mapping for single-cell genomics
https://docs.scarches.org/en/latest/
BSD 3-Clause "New" or "Revised" License

AssertionError: the error occurs when preparing query data whose feature naming (gene symbols) does not match the reference model feature naming (Ensembl IDs) #200

Open Niubile001 opened 1 year ago

Niubile001 commented 1 year ago

Thank you for your great contribution to the community!

Recently, I followed the example code in https://github.com/theislab/scarches/blob/master/notebooks/hlca_map_classify.ipynb with my own query AnnData object. The code works well with the query data you provide but fails with mine. It throws an error when I run the sum_by function:

Sum any columns with identical gene IDs that have resulted from the mapping. Here we define a short function to do that easily.

def sum_by(adata: ad.AnnData, col: str) -> ad.AnnData:
    adata.strings_to_categoricals()
    assert pd.api.types.is_categorical_dtype(adata.obs[col])

    cat = adata.obs[col].values
    indicator = sparse.coo_matrix(
        (np.broadcast_to(True, adata.n_obs), (cat.codes, np.arange(adata.n_obs))),
        shape=(len(cat.categories), adata.n_obs),
    )

    return ad.AnnData(
        indicator @ adata.X, var=adata.var, obs=pd.DataFrame(index=cat.categories)
    )

adata_query_unprep = sum_by(adata_query_unprep.transpose(), col="gene_ids").transpose()

AssertionError                            Traceback (most recent call last)
/tmp/ipykernel_375460/4109603730.py in <module>
----> 1 adata_query_unprep = sum_by(adata_query_unprep.transpose(), col="gene_ids").transpose()

/tmp/ipykernel_375460/1296360838.py in sum_by(adata, col)
      1 def sum_by(adata: ad.AnnData, col: str) -> ad.AnnData:
      2     adata.strings_to_categoricals()
----> 3     assert pd.api.types.is_categorical_dtype(adata.obs[col])
      4
      5     cat = adata.obs[col].values

AssertionError:


My query AnnData object (adata_query_unprep) looks like this:

AnnData object with n_obs × n_vars = 902735 × 1915
    obs: 'dataset'
    var: 'gene_names', 'gene_ids'

adata_query_unprep.var.head(5)

                gene_names         gene_ids
ENSG00000188290       HES4  ENSG00000188290
ENSG00000187608      ISG15  ENSG00000187608
ENSG00000162571     TTLL10  ENSG00000162571
ENSG00000186891   TNFRSF18  ENSG00000186891
ENSG00000186827    TNFRSF4  ENSG00000186827
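A possible explanation, offered as an assumption rather than a confirmed diagnosis: as far as I understand, `AnnData.strings_to_categoricals()` leaves a string column untouched when every value in it is unique. If none of the 1915 `gene_ids` is duplicated, the column would stay a plain string column and the `assert` would fail. The sketch below uses only NumPy, pandas, and SciPy to show the same indicator-matrix summation with an explicit cast to categorical, which does not depend on duplicates being present. The helper `sum_rows_by_label` and the toy gene IDs are hypothetical and not part of scArches.

```python
import numpy as np
import pandas as pd
from scipy import sparse


def sum_rows_by_label(X, labels):
    """Sum the rows of X that share a label (hypothetical helper)."""
    # Cast explicitly, so the labels are categorical even if all are unique.
    cat = pd.Categorical(labels)
    n_rows = len(cat)
    # indicator[i, j] is 1 when row j of X belongs to category i.
    indicator = sparse.coo_matrix(
        (np.ones(n_rows), (cat.codes, np.arange(n_rows))),
        shape=(len(cat.categories), n_rows),
    ).tocsr()
    return indicator @ X, cat.categories


# Toy data: three genes (rows) by two cells (columns); the IDs are made up.
X = sparse.csr_matrix(np.array([[1.0, 0.0], [2.0, 3.0], [0.0, 4.0]]))
gene_ids = pd.Series(["ENSG_A", "ENSG_B", "ENSG_C"])

# All IDs unique: the case in which, by assumption, the assert would fail.
print(gene_ids.is_unique)
summed, ids = sum_rows_by_label(X, gene_ids)

# With a duplicated ID, the two matching rows are added together.
dup_ids = pd.Series(["ENSG_A", "ENSG_B", "ENSG_A"])
summed_dup, ids_dup = sum_rows_by_label(X, dup_ids)
print(summed_dup.toarray(), list(ids_dup))
```

If this assumption holds, checking `adata_query_unprep.var["gene_ids"].is_unique` would show whether the summation step is needed at all, and replacing the `strings_to_categoricals()` call inside `sum_by` with `adata.obs[col] = adata.obs[col].astype("category")` may be enough to get past the assertion. Missing labels are not handled in this sketch.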

The output of pip list is:

Package Version
absl-py 1.4.0
aiohttp 3.8.4
aiosignal 1.3.1
anndata 0.9.1
anyio 3.7.1
appdirs 1.4.4
argon2-cffi 21.3.0
argon2-cffi-bindings 21.2.0
arrow 1.2.3
asttokens 2.2.1
async-timeout 4.0.2
attrs 23.1.0
backcall 0.2.0
backoff 2.2.1
beautifulsoup4 4.12.2
biopython 1.81
biothings-client 0.3.0
bleach 6.0.0
blessed 1.20.0
certifi 2023.5.7
cffi 1.15.1
charset-normalizer 3.2.0
chex 0.1.7
click 8.1.5
cmake 3.26.4
colorama 0.4.6
comm 0.1.3
contextlib2 21.6.0
contourpy 1.1.0
croniter 1.4.1
cycler 0.11.0
dateutils 0.6.12
debugpy 1.6.7
decorator 5.1.1
deepdiff 6.3.1
defusedxml 0.7.1
diskcache 5.6.1
dm-tree 0.1.8
docrep 0.3.2
etils 1.3.0
exceptiongroup 1.1.2
executing 1.2.0
fastapi 0.100.0
fastjsonschema 2.17.1
filelock 3.12.2
flax 0.7.0
fonttools 4.41.0
fqdn 1.5.1
frozenlist 1.4.0
fsspec 2023.6.0
genomepy 0.16.1
h11 0.14.0
h5py 3.9.0
huggingface-hub 0.16.4
idna 3.4
igraph 0.10.5
importlib-resources 6.0.0
inquirer 3.1.3
ipykernel 6.24.0
ipython 8.14.0
ipython-genutils 0.2.0
ipywidgets 8.0.7
isoduration 20.11.0
itsdangerous 2.1.2
jax 0.4.13
jaxlib 0.4.13
jedi 0.18.2
Jinja2 3.1.2
joblib 1.3.1
jsonpointer 2.4
jsonschema 4.18.3
jsonschema-specifications 2023.6.1
jupyter 1.0.0
jupyter_client 8.3.0
jupyter-console 6.6.3
jupyter_core 5.3.1
jupyter-events 0.6.3
jupyter_server 2.7.0
jupyter_server_terminals 0.4.4
jupyterlab-pygments 0.2.2
jupyterlab-widgets 3.0.8
kiwisolver 1.4.4
leidenalg 0.10.0
lightning 2.0.5
lightning-cloud 0.5.37
lightning-utilities 0.9.0
lit 16.0.6
llvmlite 0.40.1
loguru 0.7.0
loompy 3.0.7
markdown-it-py 3.0.0
MarkupSafe 2.1.3
matplotlib 3.7.2
matplotlib-inline 0.1.6
mdurl 0.1.2
mistune 3.0.1
ml-collections 0.1.1
ml-dtypes 0.2.0
mpmath 1.3.0
msgpack 1.0.5
mudata 0.2.3
multidict 6.0.4
multipledispatch 1.0.0
mygene 3.2.2
mysql-connector-python 8.0.33
natsort 8.4.0
nbclassic 1.0.0
nbclient 0.8.0
nbconvert 7.6.0
nbformat 5.9.1
nest-asyncio 1.5.6
networkx 3.1
norns 0.1.6
nose 1.3.7
notebook 6.5.4
notebook_shim 0.2.3
numba 0.57.1
numpy 1.24.4
numpy-groupies 0.9.22
numpyro 0.12.1
nvidia-cublas-cu11 11.10.3.66
nvidia-cuda-cupti-cu11 11.7.101
nvidia-cuda-nvrtc-cu11 11.7.99
nvidia-cuda-runtime-cu11 11.7.99
nvidia-cudnn-cu11 8.5.0.96
nvidia-cufft-cu11 10.9.0.58
nvidia-curand-cu11 10.2.10.91
nvidia-cusolver-cu11 11.4.0.1
nvidia-cusparse-cu11 11.7.4.91
nvidia-nccl-cu11 2.14.3
nvidia-nvtx-cu11 11.7.91
opt-einsum 3.3.0
optax 0.1.5
orbax-checkpoint 0.2.7
ordered-set 4.1.0
overrides 7.3.1
packaging 23.1
pandas 2.0.3
pandocfilters 1.5.0
parso 0.8.3
patsy 0.5.3
pexpect 4.8.0
pickleshare 0.7.5
Pillow 10.0.0
pip 23.1.2
platformdirs 3.8.1
prometheus-client 0.17.1
prompt-toolkit 3.0.39
protobuf 3.20.3
psutil 5.9.5
ptyprocess 0.7.0
pure-eval 0.2.2
pycparser 2.21
pydantic 1.10.11
pyfaidx 0.7.2.1
Pygments 2.15.1
PyJWT 2.7.0
pymde 0.1.18
pynndescent 0.5.10
pyparsing 3.0.9
pyro-api 0.1.2
pyro-ppl 1.8.5
python-dateutil 2.8.2
python-editor 1.0.4
python-igraph 0.10.5
python-json-logger 2.0.7
python-multipart 0.0.6
pytorch-lightning 2.0.5
pytz 2023.3
PyYAML 6.0
pyzmq 25.1.0
qtconsole 5.4.3
QtPy 2.3.1
readchar 4.0.5
referencing 0.29.1
requests 2.31.0
rfc3339-validator 0.1.4
rfc3986-validator 0.1.1
rich 13.4.2
rpds-py 0.8.10
scanpy 1.9.3
scikit-learn 1.3.0
scikit-misc 0.3.0
scipy 1.11.1
scvi-colab 0.12.0
scvi-tools 1.0.2
seaborn 0.12.2
Send2Trash 1.8.2
session-info 1.0.0
setuptools 67.8.0
six 1.16.0
sniffio 1.3.0
soupsieve 2.4.1
sparse 0.14.0
stack-data 0.6.2
starlette 0.27.0
starsessions 1.3.0
statsmodels 0.14.0
stdlib-list 0.9.0
sympy 1.12
tensorstore 0.1.40
terminado 0.17.1
texttable 1.6.7
threadpoolctl 3.2.0
tinycss2 1.2.1
toolz 0.12.0
torch 2.0.1
torchmetrics 1.0.1
torchvision 0.15.2
tornado 6.3.2
tqdm 4.65.0
traitlets 5.9.0
triton 2.0.0
typing_extensions 4.7.1
tzdata 2023.3
umap-learn 0.5.3
uri-template 1.3.0
urllib3 2.0.3
uvicorn 0.23.0
wcwidth 0.2.6
webcolors 1.13
webencodings 0.5.1
websocket-client 1.6.1
websockets 11.0.3
wheel 0.38.4
widgetsnbextension 4.0.8
xarray 2023.6.0
yarl 1.9.2
zipp 3.16.2