Teichlab / MultiMAP

MultiMAP for integration of single cell multi-omics
MIT License

ATAC-ADT integration error #10

Open szalata opened 1 year ago

szalata commented 1 year ago

When I attempt trimodal rna-atac-adt or bimodal atac-adt integration, `MultiMAP.Integration` fails with the error:

```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
 in
      1 #adata = MultiMAP.Integration([atac_mapped[:, ~atac_mapped.var.index.duplicated()], rna], ['X_lsi', 'X_pca'])
      2 # adata_atac_adt = MultiMAP.Integration([adt, rna], ['X_pca', 'X_pca'])
----> 3 adata_atac_adt = MultiMAP.Integration([atac_no_dup, adt], ['X_lsi', 'X_pca'])

~/miniconda3/envs/multimil/lib/python3.7/site-packages/MultiMAP/__init__.py in Integration(adatas, use_reps, scale, embedding, seed, **kwargs)
    194     #make one happy collapsed object and shove the stuff in correct places
    195     #outer join to capture as much gene information as possible for annotation
--> 196     adata = anndata.concat(adatas, join='outer')
    197     if embedding:
    198         adata.obsm['X_multimap'] = mmp[2]

~/miniconda3/envs/multimil/lib/python3.7/site-packages/anndata/_core/merge.py in concat(adatas, axis, join, merge, uns_merge, label, keys, index_unique, fill_value, pairwise)
    892         [getattr(a, f"{dim}m") for a in adatas],
    893         index=concat_indices,
--> 894         fill_value=fill_value,
    895     )
    896     if pairwise:

~/miniconda3/envs/multimil/lib/python3.7/site-packages/anndata/_core/merge.py in outer_concat_aligned_mapping(mappings, reindexers, index, fill_value, axis)
    532             axis=axis,
    533             index=index,
--> 534             fill_value=fill_value,
    535         )
    536     return result

~/miniconda3/envs/multimil/lib/python3.7/site-packages/anndata/_core/merge.py in concat_arrays(arrays, reindexers, axis, index, fill_value)
    433             [f(x) for f, x in zip(reindexers, arrays)], ignore_index=True, axis=axis
    434         )
--> 435         df.index = index
    436         return df
    437     elif any(isinstance(a, sparse.spmatrix) for a in arrays):

~/miniconda3/envs/multimil/lib/python3.7/site-packages/pandas/core/generic.py in __setattr__(self, name, value)
   5498         try:
   5499             object.__getattribute__(self, name)
-> 5500             return object.__setattr__(self, name, value)
   5501         except AttributeError:
   5502             pass

~/miniconda3/envs/multimil/lib/python3.7/site-packages/pandas/_libs/properties.pyx in pandas._libs.properties.AxisProperty.__set__()

~/miniconda3/envs/multimil/lib/python3.7/site-packages/pandas/core/generic.py in _set_axis(self, axis, labels)
    764     def _set_axis(self, axis: int, labels: Index) -> None:
    765         labels = ensure_index(labels)
--> 766         self._mgr.set_axis(axis, labels)
    767         self._clear_item_cache()
    768

~/miniconda3/envs/multimil/lib/python3.7/site-packages/pandas/core/internals/managers.py in set_axis(self, axis, new_labels)
    214     def set_axis(self, axis: int, new_labels: Index) -> None:
    215         # Caller is responsible for ensuring we have an Index object.
--> 216         self._validate_set_axis(axis, new_labels)
    217         self.axes[axis] = new_labels
    218

~/miniconda3/envs/multimil/lib/python3.7/site-packages/pandas/core/internals/base.py in _validate_set_axis(self, axis, new_labels)
     56         elif new_len != old_len:
     57             raise ValueError(
---> 58                 f"Length mismatch: Expected axis has {old_len} elements, new "
     59                 f"values have {new_len} elements"
     60             )

ValueError: Length mismatch: Expected axis has 180522 elements, new values have 159510 elements
```

The failing call is `MultiMAP.Integration([atac_no_dup, adt], ['X_lsi', 'X_pca'])`. Here are links to the `atac_no_dup` and `adt` files: [atac](https://easyupload.io/qi3r0p), [adt](https://easyupload.io/jviw29). The data comes from the [2021 NeurIPS challenge](https://openproblems.bio/neurips_2021/). The rna-atac and rna-adt integrations run without issues.

Python 3.7.16

```
$ pip freeze
anndata==0.8.0
annoy==1.17.1
anyio @ file:///home/conda/feedstock_root/build_artifacts/anyio_1666191106763/work/dist
argon2-cffi @ file:///home/conda/feedstock_root/build_artifacts/argon2-cffi_1640817743617/work
argon2-cffi-bindings @ file:///home/conda/feedstock_root/build_artifacts/argon2-cffi-bindings_1649500320262/work
attrs @ file:///home/conda/feedstock_root/build_artifacts/attrs_1671632566681/work
Babel @ file:///home/conda/feedstock_root/build_artifacts/babel_1667688356751/work
backcall @ file:///home/conda/feedstock_root/build_artifacts/backcall_1592338393461/work
backports.functools-lru-cache @ file:///home/conda/feedstock_root/build_artifacts/backports.functools_lru_cache_1618230623929/work
bamnostic==1.1.8
beautifulsoup4 @ file:///home/conda/feedstock_root/build_artifacts/beautifulsoup4_1649463573192/work
bleach @ file:///home/conda/feedstock_root/build_artifacts/bleach_1674535352125/work
brotlipy @ file:///home/conda/feedstock_root/build_artifacts/brotlipy_1648854164153/work
certifi==2022.12.7
cffi @ file:///home/conda/feedstock_root/build_artifacts/cffi_1636046052501/work
charset-normalizer @ file:///home/conda/feedstock_root/build_artifacts/charset-normalizer_1661170624537/work
colorama==0.4.6
cryptography @ file:///home/conda/feedstock_root/build_artifacts/cryptography_1637687023717/work
cycler==0.11.0
debugpy==1.6.6
decorator @ file:///home/conda/feedstock_root/build_artifacts/decorator_1641555617451/work
defusedxml @ file:///home/conda/feedstock_root/build_artifacts/defusedxml_1615232257335/work
dnspython==2.3.0
docopt==0.6.2
dunamai==1.15.0
entrypoints @ file:///home/conda/feedstock_root/build_artifacts/entrypoints_1643888246732/work
episcanpy==0.4.0
fastjsonschema @ file:///home/conda/feedstock_root/build_artifacts/python-fastjsonschema_1663619548554/work/dist
flit_core @ file:///home/conda/feedstock_root/build_artifacts/flit-core_1667734568827/work/source/flit_core
fonttools==4.38.0
get_version==3.5.4
gitdb==4.0.10
GitPython==3.1.30
h5py==3.8.0
idna @ file:///home/conda/feedstock_root/build_artifacts/idna_1663625384323/work
importlib-metadata @ file:///home/conda/feedstock_root/build_artifacts/importlib-metadata_1653252814274/work
importlib-resources @ file:///home/conda/feedstock_root/build_artifacts/importlib_resources_1672681417544/work
intervaltree==3.1.0
ipykernel @ file:///home/conda/feedstock_root/build_artifacts/ipykernel_1620912939357/work/dist/ipykernel-5.5.5-py3-none-any.whl
ipython @ file:///home/conda/feedstock_root/build_artifacts/ipython_1651240553635/work
ipython-genutils==0.2.0
jedi @ file:///home/conda/feedstock_root/build_artifacts/jedi_1669134318875/work
Jinja2 @ file:///home/conda/feedstock_root/build_artifacts/jinja2_1654302431367/work
joblib==1.2.0
json5 @ file:///home/conda/feedstock_root/build_artifacts/json5_1600692310011/work
jsonpickle==1.5.2
jsonschema @ file:///home/conda/feedstock_root/build_artifacts/jsonschema-meta_1669810440410/work
jupyter-client @ file:///home/conda/feedstock_root/build_artifacts/jupyter_client_1633454794268/work
jupyter-server @ file:///croot/jupyter_server_1671707632269/work
jupyter_core @ file:///opt/conda/conda-bld/jupyter_core_1664917302524/work
jupyterlab @ file:///home/conda/feedstock_root/build_artifacts/jupyterlab_1674494302491/work
jupyterlab-pygments @ file:///home/conda/feedstock_root/build_artifacts/jupyterlab_pygments_1649936611996/work
jupyterlab_server @ file:///home/conda/feedstock_root/build_artifacts/jupyterlab_server_1673884094295/work
kiwisolver==1.4.4
kneed==0.8.2
legacy-api-wrap==1.2
llvmlite==0.35.0
MarkupSafe @ file:///opt/conda/conda-bld/markupsafe_1654597864307/work
matplotlib==3.5.3
matplotlib-inline @ file:///home/conda/feedstock_root/build_artifacts/matplotlib-inline_1660814786464/work
mistune @ file:///home/conda/feedstock_root/build_artifacts/mistune_1657892024508/work
MultiMAP @ file:///lustre/groups/ml01/code/artur.szalata/MultiMAP
munch==2.5.0
natsort==8.2.0
nbclassic @ file:///home/conda/feedstock_root/build_artifacts/nbclassic_1667492839781/work
nbclient @ file:///home/conda/feedstock_root/build_artifacts/nbclient_1662750566673/work
nbconvert @ file:///home/conda/feedstock_root/build_artifacts/nbconvert-meta_1673893169158/work
nbformat @ file:///home/conda/feedstock_root/build_artifacts/nbformat_1673560067442/work
nest-asyncio @ file:///home/conda/feedstock_root/build_artifacts/nest-asyncio_1664684991461/work
networkx==2.6.3
notebook @ file:///home/conda/feedstock_root/build_artifacts/notebook_1667565639349/work
notebook_shim @ file:///home/conda/feedstock_root/build_artifacts/notebook-shim_1667478401171/work
numba==0.52.0
numexpr==2.8.4
numpy==1.21.6
packaging @ file:///home/conda/feedstock_root/build_artifacts/packaging_1673482170163/work
pandas==1.3.5
pandocfilters @ file:///home/conda/feedstock_root/build_artifacts/pandocfilters_1631603243851/work
parso @ file:///home/conda/feedstock_root/build_artifacts/parso_1638334955874/work
patsy==0.5.3
pexpect @ file:///home/conda/feedstock_root/build_artifacts/pexpect_1667297516076/work
pickleshare @ file:///home/conda/feedstock_root/build_artifacts/pickleshare_1602536217715/work
Pillow==9.4.0
pkgutil_resolve_name @ file:///home/conda/feedstock_root/build_artifacts/pkgutil-resolve-name_1633981968097/work
prometheus-client @ file:///home/conda/feedstock_root/build_artifacts/prometheus_client_1674535637125/work
prompt-toolkit @ file:///home/conda/feedstock_root/build_artifacts/prompt-toolkit_1670414775770/work
ptyprocess @ file:///home/conda/feedstock_root/build_artifacts/ptyprocess_1609419310487/work/dist/ptyprocess-0.7.0-py2.py3-none-any.whl
py-cpuinfo==9.0.0
pycparser @ file:///home/conda/feedstock_root/build_artifacts/pycparser_1636257122734/work
Pygments @ file:///home/conda/feedstock_root/build_artifacts/pygments_1672682006896/work
pymongo==4.3.3
pynndescent==0.5.8
pyOpenSSL @ file:///home/conda/feedstock_root/build_artifacts/pyopenssl_1608055815057/work
pyparsing==3.0.9
pyrsistent @ file:///tmp/build/80754af9/pyrsistent_1636098896055/work
pysam==0.20.0
PySocks @ file:///home/conda/feedstock_root/build_artifacts/pysocks_1648857264451/work
python-dateutil @ file:///home/conda/feedstock_root/build_artifacts/python-dateutil_1626286286081/work
pytz @ file:///home/conda/feedstock_root/build_artifacts/pytz_1673864280276/work
PyYAML==6.0
pyzmq==19.0.2
requests @ file:///home/conda/feedstock_root/build_artifacts/requests_1673863902341/work
sacred==0.8.2
scanpy==1.9.1
scikit-learn==1.0.2
scipy==1.7.3
seaborn==0.12.2
seml==0.3.7
Send2Trash @ file:///home/conda/feedstock_root/build_artifacts/send2trash_1628511208346/work
session-info==1.0.0
setuptools-scm==7.1.0
sinfo==0.3.4
six @ file:///home/conda/feedstock_root/build_artifacts/six_1620240208055/work
smmap==5.0.0
sniffio @ file:///home/conda/feedstock_root/build_artifacts/sniffio_1662051266223/work
sortedcontainers==2.4.0
soupsieve @ file:///home/conda/feedstock_root/build_artifacts/soupsieve_1658207591808/work
statsmodels==0.13.5
stdlib-list==0.8.0
tables==3.7.0
tbb==2021.8.0
terminado @ file:///home/conda/feedstock_root/build_artifacts/terminado_1670253674810/work
threadpoolctl==3.1.0
tinycss2 @ file:///home/conda/feedstock_root/build_artifacts/tinycss2_1666100256010/work
tomli @ file:///home/conda/feedstock_root/build_artifacts/tomli_1644342247877/work
tornado @ file:///home/conda/feedstock_root/build_artifacts/tornado_1648827244717/work
tqdm==4.64.1
traitlets @ file:///home/conda/feedstock_root/build_artifacts/traitlets_1673359992537/work
typing_extensions @ file:///home/conda/feedstock_root/build_artifacts/typing_extensions_1665144421445/work
umap-learn==0.5.3
urllib3 @ file:///home/conda/feedstock_root/build_artifacts/urllib3_1673452138552/work
wcwidth @ file:///home/conda/feedstock_root/build_artifacts/wcwidth_1673864653149/work
webencodings==0.5.1
websocket-client @ file:///home/conda/feedstock_root/build_artifacts/websocket-client_1667568040382/work
wrapt==1.14.1
zipp @ file:///home/conda/feedstock_root/build_artifacts/zipp_1669453021653/work
```

ktpolanski commented 1 year ago

Can replicate. Not explicitly a MultiMAP issue - this is directly calling anndata.concat() on the list of input adatas. Bizarrely, [adt, atac_no_dup] works just fine, but [atac_no_dup, adt] crashes out.
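
The traceback places the failure inside `anndata.concat`, in the step that merges the per-observation mappings (the `getattr(a, f"{dim}m")` frame, presumably `obsm` here) and then assigns `df.index = index`. As a rough diagnostic, not a confirmed explanation, it may help to compare what each input stores in `obsm`. The helper below is a hypothetical sketch: `summarize_obsm` is not part of MultiMAP or anndata, and it only assumes each object exposes a mapping-like `.obsm`.

```python
from collections import defaultdict


def summarize_obsm(named_adatas):
    """Map each obsm key to {dataset name: (container type name, shape)}.

    `named_adatas` is a dict such as {"atac": atac_no_dup, "adt": adt}.
    Only the `.obsm` mapping of each object is read (assumption: it
    supports `.items()`, as a dict does).
    """
    summary = defaultdict(dict)
    for name, adata in named_adatas.items():
        for key, value in adata.obsm.items():
            # Record the container type so DataFrame entries stand out,
            # plus the shape when the value has one.
            shape = getattr(value, "shape", None)
            summary[key][name] = (type(value).__name__, shape)
    return dict(summary)
```

Calling `summarize_obsm({"atac": atac_no_dup, "adt": adt})` would show which keys exist in only one object and what container type each entry uses, which could be useful detail for an upstream report.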

szalata commented 1 year ago

Sadly, this issue also occurs in the full-dataset trimodal integration, and changing the order doesn't help.

ktpolanski commented 1 year ago

Give the anndata folks a nudge, as they're the ones that would need to help.
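
Since the thread reports that the outcome depends on input order in the bimodal case but not in the trimodal one, an upstream report might benefit from a table of which orderings fail. The following is a minimal sketch under stated assumptions: `try_orders` is a hypothetical helper, and the concatenation function is passed in by the caller rather than hard-coded.

```python
from itertools import permutations


def try_orders(concat_fn, named_inputs):
    """Call concat_fn on every ordering of the inputs and record the outcome.

    `named_inputs` maps a label to an object, e.g. {"atac": ..., "adt": ...}.
    Returns {ordering tuple: "ok" or "<ErrorType>: <message>"}.
    """
    results = {}
    for order in permutations(named_inputs):
        try:
            concat_fn([named_inputs[name] for name in order])
            results[order] = "ok"
        except Exception as err:  # keep going so every ordering is reported
            results[order] = f"{type(err).__name__}: {err}"
    return results
```

With the objects from this issue, the call would look like `try_orders(lambda xs: anndata.concat(xs, join="outer"), {"atac": atac_no_dup, "adt": adt})`, which mirrors the `anndata.concat(adatas, join='outer')` call shown in the traceback. Note that the number of orderings grows factorially, so this is only practical for a handful of inputs.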