scverse / scanpy

Single-cell analysis in Python. Scales to >1M cells.
https://scanpy.readthedocs.io
BSD 3-Clause "New" or "Revised" License

rank_genes_group error #1467

Closed ywen1407 closed 4 years ago

ywen1407 commented 4 years ago

Hi all, I am wondering whether anyone has run into a similar situation. After normalizing my data, applying batch correction with ComBat, and working through the rest of the pipeline on my own data, I ran into problems with rank_genes_groups. The error is shown below. I understand that there are issues with using highly_variable_genes after ComBat, and that these can be resolved by converting the data back to a sparse matrix using "adata.X = scipy.sparse.csr_matrix(adata.X)", but this does not fix my error.

Looking forward to your response, thanks a lot!

Minimal code sample (that we can copy&paste without having any data)


sc.tl.rank_genes_groups(all_case,groupby='louvain',method='wilcoxon')
ranking genes
---------------------------------------------------------------------------
LinAlgError                               Traceback (most recent call last)
<ipython-input-16-961d52bd7e16> in <module>()
----> 1 sc.tl.rank_genes_groups(all_case,groupby='louvain',method='wilcoxon')

7 frames
<__array_function__ internals> in matrix_power(*args, **kwargs)

/usr/local/lib/python3.6/dist-packages/numpy/linalg/linalg.py in _assert_stacked_square(*arrays)
    211         m, n = a.shape[-2:]
    212         if m != n:
--> 213             raise LinAlgError('Last 2 dimensions of the array must be square')
    214 
    215 def _assert_finite(*arrays):

LinAlgError: Last 2 dimensions of the array must be square

Versions

scanpy==1.6.0 anndata==0.7.4 umap==0.4.6 numpy==1.18.5 scipy==1.4.1 pandas==1.1.2 scikit-learn==0.22.2.post1 statsmodels==0.10.2 python-igraph==0.8.3 louvain==0.7.0 leidenalg==0.8.2

LuckyMD commented 4 years ago

Please check this issue: #456

The data in adata.raw is probably an np.matrix. You can convert it to either an np.ndarray or a scipy.sparse.csr_matrix to solve this. Note that rank_genes_groups() uses adata.raw.X by default, not adata.X, so your proposed line of code will not resolve the error. Instead, use for example:

adata.raw.X = scipy.sparse.csr_matrix(adata.raw.X)
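
For context, the traceback above ends in matrix_power, which is what numpy calls when the ** operator is applied to an np.matrix. The sketch below reproduces that behaviour on a small made-up array (the name raw_like and its values are purely illustrative, not taken from the reporter's data) and shows that the same operation works element-wise once the data is an np.ndarray. Whether this is the exact operation that fails inside scanpy is an assumption based on the traceback.

```python
import numpy as np

# Toy stand-in for an expression matrix stored as np.matrix (3 cells x 4 genes).
# The shape and values are made up for illustration.
raw_like = np.asmatrix(np.arange(12, dtype=float).reshape(3, 4))

# On np.matrix, ** means matrix power, which requires a square matrix.
try:
    raw_like ** 2
    matrix_power_failed = False
except np.linalg.LinAlgError:
    matrix_power_failed = True

# On np.ndarray, ** is element-wise and works for any shape.
as_array = np.asarray(raw_like)
squared = as_array ** 2

print(type(raw_like).__name__, matrix_power_failed)
print(type(as_array).__name__, squared.shape)
```

Checking type(adata.raw.X) before calling rank_genes_groups() is a quick way to see whether the data is an np.matrix in the first place.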