scverse / anndata

Annotated data.
http://anndata.readthedocs.io
BSD 3-Clause "New" or "Revised" License

NotImplementedError: Concat of following not supported: ['csc', 'csc'] for concat_on_disk #1706

Closed: tangxj98 closed this issue 2 weeks ago

tangxj98 commented 2 weeks ago

Report

I was trying to concatenate two large scRNA data sets using concat_on_disk. I confirmed that X in both data sets is a csr_matrix, but I got the following error saying that ['csc', 'csc'] is not supported, which doesn't make sense to me. When I read each h5ad file into memory, X is CSR, yet concat_on_disk identifies it as CSC. Could you please give me a hint about what might be wrong here?

Code:

# confirm that X in sc1 and sc2 is a csr matrix
>>> import anndata as ad
>>> import scanpy as sc
>>> from scipy.sparse import csr_matrix
>>> sc1=sc.read("sc1.processed.h5ad")
>>> sc2=sc.read("sc2.processed.h5ad")
>>> isinstance(sc1.X, csr_matrix)
True
>>> isinstance(sc2.X, csr_matrix)
True

>>> ad.experimental.concat_on_disk(dict(sc1="sc1.processed.h5ad", sc2="sc2.processed.h5ad"), out_file="test.h5ad", axis=0, label="source")

Traceback:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "~/myfolder/conda/envs/scvi-env/lib/python3.9/site-packages/anndata/experimental/merge.py", line 652, in concat_on_disk
    _write_concat_mappings(
  File "~/myfolder/conda/envs/scvi-env/lib/python3.9/site-packages/anndata/experimental/merge.py", line 262, in _write_concat_mappings
    _write_concat_sequence(
  File "~/myfolder/conda/envs/scvi-env/lib/python3.9/site-packages/anndata/experimental/merge.py", line 358, in _write_concat_sequence
    _write_concat_arrays(
  File "~/myfolder/conda/envs/scvi-env/lib/python3.9/site-packages/anndata/experimental/merge.py", line 310, in _write_concat_arrays
    raise NotImplementedError(
NotImplementedError: Concat of following not supported: ['csc', 'csc']

Versions

import anndata, session_info; session_info.show(html=False, dependencies=True)

anndata 0.10.9
scanpy 1.10.2
session_info 1.0.0

PIL 10.4.0
array_api_compat 1.8
asciitree NA
backports NA
cffi 1.17.1
click 8.1.7
cloudpickle 3.0.0
colorama 0.4.6
cycler 0.12.1
cython_runtime NA
dask 2024.8.0
dateutil 2.9.0.post0
defusedxml 0.7.1
distributed 2024.8.0
exceptiongroup 1.2.2
h5py 3.11.0
igraph 0.11.6
importlib_metadata NA
importlib_resources NA
jaraco NA
jinja2 3.1.4
joblib 1.4.2
kiwisolver 1.4.7
legacy_api_wrap NA
leidenalg 0.10.2
llvmlite 0.43.0
locket NA
louvain 0.8.2
markupsafe 2.1.5
matplotlib 3.9.2
more_itertools 10.5.0
mpl_toolkits NA
msgpack 1.1.0
natsort 8.4.0
numba 0.60.0
numcodecs 0.12.1
numpy 1.26.4
packaging 24.1
pandas 2.2.2
pkg_resources NA
platformdirs 4.3.3
psutil 5.9.8
pyarrow 17.0.0
pyparsing 3.1.4
pytz 2024.2
scipy 1.13.1
six 1.16.0
sklearn 1.5.2
sortedcontainers 2.4.0
sparse 0.15.4
tblib 3.0.0
texttable 1.7.0
threadpoolctl 3.5.0
tlz 0.12.1
toolz 0.12.1
torch 2.4.1+cu121
torchgen NA
tornado 6.4.1
tqdm 4.66.5
typing_extensions NA
wcwidth 0.2.13
yaml 6.0.2
zarr 2.18.2
zict 3.0.0
zipp NA
zoneinfo NA

Python 3.9.19 | packaged by conda-forge | (main, Mar 20 2024, 12:50:21) [GCC 12.3.0]
Linux-4.18.0-553.16.1.el8_10.x86_64-x86_64-with-glibc2.28

Session information updated at 2024-10-07 23:23

tangxj98 commented 2 weeks ago

Problem solved. X is CSR in both files, but layers['counts'] is stored in CSC format, and that layer is what triggered the error.
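
For anyone hitting the same error, here is a minimal sketch of how one might check the sparse format of X and of every layer, and convert the layers to CSR. The helper names `sparse_formats` and `layers_as_csr` are made up for this example and are not part of anndata; the toy matrices stand in for `adata.X` and `adata.layers`.

```python
from scipy import sparse


def sparse_formats(x, layers):
    """Return {name: format} for X and every sparse layer, e.g. 'csr' or 'csc'."""
    arrays = {"X": x, **{f"layers[{k!r}]": v for k, v in layers.items()}}
    return {name: arr.format for name, arr in arrays.items() if sparse.issparse(arr)}


def layers_as_csr(layers):
    """Return a new dict where every sparse layer is CSR; dense layers are kept as-is."""
    return {k: (v.tocsr() if sparse.issparse(v) else v) for k, v in layers.items()}


# Toy stand-ins for adata.X and adata.layers (hypothetical data, not the files above).
x = sparse.random(5, 4, density=0.5, format="csr", random_state=0)
layers = {"counts": x.tocsc()}  # same values as X, but stored column-wise

print(sparse_formats(x, layers))  # X reports 'csr', layers['counts'] reports 'csc'
fixed = layers_as_csr(layers)
print(sparse_formats(x, fixed))   # both report 'csr' after conversion
```

With a real AnnData object, the same check would presumably be `sparse_formats(adata.X, adata.layers)`. Assigning the converted layers back (for example `adata.layers["counts"] = adata.layers["counts"].tocsr()`) and rewriting the file with `adata.write_h5ad(...)` before calling concat_on_disk should then give it matching formats, though I have not tested this against the files in this report.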