scverse / scanpy

Single-cell analysis in Python. Scales to >1M cells.
https://scanpy.readthedocs.io
BSD 3-Clause "New" or "Revised" License

TypeError: expected dtype object, got 'numpy.dtype[float64]' when running scanpy on scvelo objects #1983

Open leahrosen opened 3 years ago

leahrosen commented 3 years ago

Apologies for all the edits; I'm stuck on this and have been experimenting. I'm getting errors when running scanpy on scvelo objects. I first hit this when running sc.pp.pca(adata_panc, n_comps=50), and worked around it by first setting adata_panc.X = np.array(adata_panc.X.todense()). However, I now get the exact same error when running sc.pp.neighbors(adata_panc), and I'm not sure which matrix to check. Any advice would be very much appreciated!
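One way to narrow down "which matrix to check" is to list the dtype of every array stored on the object. The sketch below is only an illustration: `collect_dtypes` is a hypothetical helper, not part of scanpy or scvelo, and it assumes nothing beyond the standard AnnData attributes `X`, `layers`, `obsm` and `obsp`. Note that the traceback below shows the error being raised inside `smooth_knn_dist` right after `knn_dists` is cast to float32, so the input dtypes may well turn out not to be the cause.

```python
def collect_dtypes(adata):
    """Return {slot name: dtype string} for X and every entry in layers/obsm/obsp.

    Hypothetical helper for illustration; works on any object exposing
    AnnData-style attributes. Entries without a .dtype (for example a
    DataFrame stored in obsm) are reported as "n/a".
    """
    report = {}
    x = getattr(adata, "X", None)
    if x is not None:
        report["X"] = str(getattr(x, "dtype", "n/a"))
    for slot in ("layers", "obsm", "obsp"):
        mapping = getattr(adata, slot, None)
        if mapping is None:
            continue
        for key in mapping.keys():
            report[f"{slot}[{key!r}]"] = str(getattr(mapping[key], "dtype", "n/a"))
    return report


# Example usage (adata_panc is the object from the code sample in this post):
# for name, dtype in collect_dtypes(adata_panc).items():
#     print(name, dtype)
```

If every entry reports float32 or float64, the data itself is probably fine and the installed package versions become the more likely suspect.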

Minimal code sample (that we can copy&paste without having any data)

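import numpy as np
import scanpy as sc
import scvelo as scv

adata_panc = scv.datasets.pancreas()
scv.pp.filter_and_normalize(adata_panc, n_top_genes=3000, min_shared_counts=20)
# Remove the precomputed embeddings and neighbor graph shipped with the dataset
del adata_panc.obsm['X_pca']
del adata_panc.obsm['X_umap']
del adata_panc.obsp['distances']
del adata_panc.obsp['connectivities']
# Densify X (the workaround that made sc.pp.pca run)
adata_panc.X = np.array(adata_panc.X.todense())
sc.pp.pca(adata_panc, n_comps=50)
# Fails with the TypeError / SystemError shown below
sc.pp.neighbors(adata_panc)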
Filtered out 20801 genes that are detected 20 counts (shared).
Normalized count data: X, spliced, unspliced.
Extracted 3000 highly variable genes.
Logarithmized X.
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
TypeError: expected dtype object, got 'numpy.dtype[float32]'

The above exception was the direct cause of the following exception:

SystemError                               Traceback (most recent call last)
/hps/scratch/lsf_tmpdir/hl-codon-10-04/ipykernel_2322052/531027197.py in <module>
      7 adata_panc.X = np.array(adata_panc.X.todense())
      8 sc.pp.pca(adata_panc, n_comps=50)
----> 9 sc.pp.neighbors(adata_panc)

/hps/software/users/marioni/Leah/miniconda3/envs/scvelo/lib/python3.8/site-packages/scanpy/neighbors/__init__.py in neighbors(adata, n_neighbors, n_pcs, use_rep, knn, random_state, method, metric, metric_kwds, key_added, copy)
    137         adata._init_as_actual(adata.copy())
    138     neighbors = Neighbors(adata)
--> 139     neighbors.compute_neighbors(
    140         n_neighbors=n_neighbors,
    141         knn=knn,

/hps/software/users/marioni/Leah/miniconda3/envs/scvelo/lib/python3.8/site-packages/scanpy/neighbors/__init__.py in compute_neighbors(self, n_neighbors, knn, n_pcs, use_rep, method, random_state, write_knn_indices, metric, metric_kwds)
    806             # we need self._distances also for method == 'gauss' if we didn't
    807             # use dense distances
--> 808             self._distances, self._connectivities = _compute_connectivities_umap(
    809                 knn_indices,
    810                 knn_distances,

/hps/software/users/marioni/Leah/miniconda3/envs/scvelo/lib/python3.8/site-packages/scanpy/neighbors/__init__.py in _compute_connectivities_umap(knn_indices, knn_dists, n_obs, n_neighbors, set_op_mix_ratio, local_connectivity)
    388 
    389     X = coo_matrix(([], ([], [])), shape=(n_obs, 1))
--> 390     connectivities = fuzzy_simplicial_set(
    391         X,
    392         n_neighbors,

/hps/software/users/marioni/Leah/miniconda3/envs/scvelo/lib/python3.8/site-packages/umap/umap_.py in fuzzy_simplicial_set(X, n_neighbors, random_state, metric, metric_kwds, knn_indices, knn_dists, angular, set_op_mix_ratio, local_connectivity, apply_set_operations, verbose)
    600     knn_dists = knn_dists.astype(np.float32)
    601 
--> 602     sigmas, rhos = smooth_knn_dist(
    603         knn_dists, float(n_neighbors), local_connectivity=float(local_connectivity),
    604     )

SystemError: CPUDispatcher(<function smooth_knn_dist at 0x14a113bac160>) returned a result with an error set

time: 4.73 s (started: 2021-08-18 11:47:40 +01:00)

Versions

```pytb
WARNING: If you miss a compact list, please try `print_header`!
The `sinfo` package has changed name and is now called `session_info` to become more discoverable and self-explanatory. The `sinfo` PyPI package will be kept around to avoid breaking old installs and you can downgrade to 0.3.2 if you want to use it without seeing this message. For the latest features and bug fixes, please install `session_info` instead. The usage and defaults also changed slightly, so please review the latest README at https://gitlab.com/joelostblom/session_info.
-----
anndata             0.7.6
scanpy              1.8.1
sinfo               0.3.4
-----
PIL                 8.3.1
autotime            0.3.1
backcall            0.2.0
bottleneck          1.3.2
cffi                1.14.6
cycler              0.10.0
cython_runtime      NA
dateutil            2.8.2
decorator           5.0.9
defusedxml          0.7.1
h5py                2.10.0
igraph              0.9.6
ipykernel           6.0.3
ipython_genutils    0.2.0
jedi                0.18.0
joblib              1.0.1
kiwisolver          1.3.1
leidenalg           0.8.7
llvmlite            0.33.0
loompy              3.0.6
louvain             0.7.0
matplotlib          3.4.2
matplotlib_inline   NA
mkl                 2.4.0
mpl_toolkits        NA
natsort             7.1.1
numba               0.50.1
numexpr             2.7.3
numpy               1.20.3
numpy_groupies      0.9.13
packaging           21.0
pandas              1.3.0
parso               0.8.2
pexpect             4.8.0
pickleshare         0.7.5
pkg_resources       NA
prompt_toolkit      3.0.19
ptyprocess          0.7.0
pycparser           2.20
pygments            2.9.0
pyparsing           2.4.7
pytz                2021.1
scipy               1.6.2
scvelo              0.2.3
six                 1.16.0
sklearn             0.24.2
storemagic          NA
tables              3.6.1
texttable           1.6.4
tornado             6.1
traitlets           5.0.5
wcwidth             0.2.5
zipp                NA
zmq                 22.1.0
-----
IPython             7.26.0
jupyter_client      6.1.12
jupyter_core        4.7.1
notebook            6.4.0
-----
Python 3.8.10 | packaged by conda-forge | (default, May 11 2021, 07:01:05) [GCC 9.3.0]
Linux-4.18.0-240.22.1.el8_3.x86_64-x86_64-with-glibc2.10
96 logical CPU cores, x86_64
-----
Session information updated at 2021-08-17 16:58
time: 402 ms (started: 2021-08-17 16:58:46 +01:00)
```

leahrosen commented 3 years ago

An update: this issue does not depend on scvelo at all; I can reproduce it using scanpy alone. I suspect it is an issue with running umap, of which I'm using version '0.4.6'. Any help would be much appreciated:

Minimal code sample (that we can copy&paste without having any data)

import os
import scanpy as sc
import numpy as np
import pandas as pd
import copy
import anndata
import matplotlib.pyplot as plt
adata_pbmc3k = sc.datasets.pbmc3k_processed()
#del adata_pbmc3k.obsm['X_pca']
#del adata_pbmc3k.obsm['X_umap']
del adata_pbmc3k.obsp['distances']
del adata_pbmc3k.obsp['connectivities']
#sc.pp.pca(adata_pbmc3k, n_comps=50)
sc.pp.neighbors(adata_pbmc3k)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
TypeError: expected dtype object, got 'numpy.dtype[float32]'

The above exception was the direct cause of the following exception:

SystemError                               Traceback (most recent call last)
/hps/scratch/lsf_tmpdir/hl-codon-13-02/ipykernel_2124423/1009160698.py in <module>
----> 1 sc.pp.neighbors(adata_pbmc3k)

/hps/software/users/marioni/Leah/miniconda3/envs/scvelo/lib/python3.8/site-packages/scanpy/neighbors/__init__.py in neighbors(adata, n_neighbors, n_pcs, use_rep, knn, random_state, method, metric, metric_kwds, key_added, copy)
    137         adata._init_as_actual(adata.copy())
    138     neighbors = Neighbors(adata)
--> 139     neighbors.compute_neighbors(
    140         n_neighbors=n_neighbors,
    141         knn=knn,

/hps/software/users/marioni/Leah/miniconda3/envs/scvelo/lib/python3.8/site-packages/scanpy/neighbors/__init__.py in compute_neighbors(self, n_neighbors, knn, n_pcs, use_rep, method, random_state, write_knn_indices, metric, metric_kwds)
    806             # we need self._distances also for method == 'gauss' if we didn't
    807             # use dense distances
--> 808             self._distances, self._connectivities = _compute_connectivities_umap(
    809                 knn_indices,
    810                 knn_distances,

/hps/software/users/marioni/Leah/miniconda3/envs/scvelo/lib/python3.8/site-packages/scanpy/neighbors/__init__.py in _compute_connectivities_umap(knn_indices, knn_dists, n_obs, n_neighbors, set_op_mix_ratio, local_connectivity)
    388 
    389     X = coo_matrix(([], ([], [])), shape=(n_obs, 1))
--> 390     connectivities = fuzzy_simplicial_set(
    391         X,
    392         n_neighbors,

/hps/software/users/marioni/Leah/miniconda3/envs/scvelo/lib/python3.8/site-packages/umap/umap_.py in fuzzy_simplicial_set(X, n_neighbors, random_state, metric, metric_kwds, knn_indices, knn_dists, angular, set_op_mix_ratio, local_connectivity, apply_set_operations, verbose)
    600     knn_dists = knn_dists.astype(np.float32)
    601 
--> 602     sigmas, rhos = smooth_knn_dist(
    603         knn_dists, float(n_neighbors), local_connectivity=float(local_connectivity),
    604     )

SystemError: CPUDispatcher(<function smooth_knn_dist at 0x150524c6cdc0>) returned a result with an error set

time: 3 s (started: 2021-08-23 11:59:12 +01:00)
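
Both tracebacks end inside `smooth_knn_dist`, a numba-compiled function (`CPUDispatcher`), and the version list in the first comment shows numba 0.50.1 alongside numpy 1.20.3. A possible explanation, stated here as an assumption rather than a confirmed diagnosis, is that this numba release predates NumPy 1.20 support (to my knowledge added in numba 0.53). The sketch below prints the installed versions and flags that combination; `parse_version`, `installed_version` and `looks_mismatched` are hypothetical helpers written for this illustration, and the version thresholds are part of the assumption.

```python
from importlib import metadata


def parse_version(text):
    """Turn a version string such as '0.50.1' into a tuple of ints, (0, 50, 1)."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as 'rc1'
            digits += ch
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def installed_version(package):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def looks_mismatched(numba_version, numpy_version):
    """Flag numba < 0.53 combined with numpy >= 1.20.

    Assumption: numba gained NumPy 1.20 support in release 0.53.
    """
    return (
        parse_version(numba_version) < (0, 53)
        and parse_version(numpy_version) >= (1, 20)
    )


if __name__ == "__main__":
    for name in ("numba", "numpy", "umap-learn", "scanpy"):
        print(name, installed_version(name))
    nb, npy = installed_version("numba"), installed_version("numpy")
    if nb and npy and looks_mismatched(nb, npy):
        print("numba", nb, "may be too old for numpy", npy)
```

With the versions reported above, `looks_mismatched("0.50.1", "1.20.3")` returns `True`. If the assumption holds, either upgrading numba or pinning numpy below 1.20 in the environment would be worth trying.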