scverse / scanpy

Single-cell analysis in Python. Scales to >1M cells.
https://scanpy.readthedocs.io
BSD 3-Clause "New" or "Revised" License

sc.tl.dpt(adata, n_branchings=2) sometimes works sometimes doesn't #749

Open zonglunli7515 opened 5 years ago

zonglunli7515 commented 5 years ago

A very strange issue:

I want to visualize a large dataset (10,000+ to 20,000+ cells). However, sc.tl.dpt(adata, n_branchings=2) fails on it. If I select a smaller subset of the cells, it sometimes works, but not every time.

Thanks in advance for your help.

Allen

gokceneraslan commented 5 years ago

Do you get an exception message or something else? If you can also copy paste the error message here, we can debug it more easily.
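
When the output is long, one option is to capture it into a single string and attach that as a text file. The sketch below uses only the Python standard library; `run_and_capture` and `failing_step` are hypothetical names, and `failing_step` is a stand-in for the real call, for example `sc.tl.dpt(adata, n_branchings=2)`.

```python
import traceback
import warnings


def run_and_capture(func, *args, **kwargs):
    """Run func and return (result, report).

    report is a string with every warning issued through the `warnings`
    module during the call, followed by the full traceback if the call raised.
    """
    result = None
    error_text = []
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record repeated warnings as well
        try:
            result = func(*args, **kwargs)
        except Exception:
            error_text.append(traceback.format_exc())
    # Warnings go first, in the order in which they were issued.
    warning_text = [
        warnings.formatwarning(w.message, w.category, w.filename, w.lineno)
        for w in caught
    ]
    return result, "".join(warning_text + error_text)


def failing_step():
    # Hypothetical stand-in for the real analysis call.
    warnings.warn("example warning")
    raise ValueError("attempt to get argmax of an empty sequence")


result, report = run_and_capture(failing_step)
print(report)
```

This only records messages that go through the `warnings` module, such as the Numba ones. scanpy's own `WARNING:` lines appear to come from its logging system instead, so they may need to be redirected separately.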

zonglunli7515 commented 5 years ago

> Do you get an exception message or something else? If you can also copy paste the error message here, we can debug it more easily.

Many thanks for your quick reply! Unfortunately, no visible exception...

My code is as follows:

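import velocyto as vcy
import numpy as np
import scanpy as sc
import anndata

# Load the spliced count matrix from the loom file (genes x cells in velocyto).
vlm = vcy.VelocytoLoom("path of DentateGyrus.loom")
S = vlm.S
# AnnData expects cells x genes, so transpose.
S = S.transpose()
adata = anndata.AnnData(S)
print(adata.X)
print(adata.obs)
print(adata.var)

# Build the neighbor graph, set the root cell, then run DPT with two branchings.
sc.pp.neighbors(adata, n_neighbors=100)
adata.uns['iroot'] = 0
print(adata.uns)
sc.tl.dpt(adata, n_branchings=2)
sc.pl.diffmap(adata, color='dpt_pseudotime', projection='2d')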

Error message (there are also a number of warnings, which take up many lines; I'm not sure how best to include them all here):

numba warnings

```pytb
WARNING: You’re trying to run this on 27998 dimensions of `.X`, if you really want this, set `use_rep='X'`.
         Falling back to preprocessing with `sc.pp.pca` and default params.
/home/liz3/env/lib/python3.6/site-packages/umap/rp_tree.py:450: NumbaWarning:
Compilation is falling back to object mode WITH looplifting enabled because Function "make_euclidean_tree" failed type inference due to: Cannot unify RandomProjectionTreeNode(array(int64, 1d, C), bool, none, none, none, none) and RandomProjectionTreeNode(none, bool, array(float32, 1d, C), float64, RandomProjectionTreeNode(array(int64, 1d, C), bool, none, none, none, none), RandomProjectionTreeNode(array(int64, 1d, C), bool, none, none, none, none)) for '$14.16', defined at /home/liz3/env/lib/python3.6/site-packages/umap/rp_tree.py (457)

File "env/lib/python3.6/site-packages/umap/rp_tree.py", line 457:
def make_euclidean_tree(data, indices, rng_state, leaf_size=30):
        left_node = make_euclidean_tree(data, left_indices, rng_state, leaf_size)
        ^

[1] During: resolving callee type: recursive(type(CPUDispatcher()))
[2] During: typing of call at /home/liz3/env/lib/python3.6/site-packages/umap/rp_tree.py (457)

File "env/lib/python3.6/site-packages/umap/rp_tree.py", line 457:
def make_euclidean_tree(data, indices, rng_state, leaf_size=30):
        left_node = make_euclidean_tree(data, left_indices, rng_state, leaf_size)
        ^

  @numba.jit()
/home/liz3/env/lib/python3.6/site-packages/numba/compiler.py:725: NumbaWarning: Function "make_euclidean_tree" was compiled in object mode without forceobj=True.

File "env/lib/python3.6/site-packages/umap/rp_tree.py", line 451:
@numba.jit()
def make_euclidean_tree(data, indices, rng_state, leaf_size=30):
^

  self.func_ir.loc))
/home/liz3/env/lib/python3.6/site-packages/numba/compiler.py:734: NumbaDeprecationWarning:
Fall-back from the nopython compilation path to the object mode compilation path has been detected, this is deprecated behaviour.

For more information visit http://numba.pydata.org/numba-doc/latest/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit

File "env/lib/python3.6/site-packages/umap/rp_tree.py", line 451:
@numba.jit()
def make_euclidean_tree(data, indices, rng_state, leaf_size=30):
^

  warnings.warn(errors.NumbaDeprecationWarning(msg, self.func_ir.loc))
/home/liz3/env/lib/python3.6/site-packages/umap/nndescent.py:92: NumbaPerformanceWarning:
The keyword argument 'parallel=True' was specified but no transformation for parallel execution was possible.

To find out why, try turning on parallel diagnostics, see http://numba.pydata.org/numba-doc/latest/user/parallel.html#diagnostics for help.

File "env/lib/python3.6/site-packages/umap/utils.py", line 409:
@numba.njit(parallel=True)
def build_candidates(current_graph, n_vertices, n_neighbors, max_candidates, rng_state):
^

  current_graph, n_vertices, n_neighbors, max_candidates, rng_state
/home/liz3/env/lib/python3.6/site-packages/numba/compiler.py:588: NumbaPerformanceWarning:
The keyword argument 'parallel=True' was specified but no transformation for parallel execution was possible.

To find out why, try turning on parallel diagnostics, see http://numba.pydata.org/numba-doc/latest/user/parallel.html#diagnostics for help.

File "env/lib/python3.6/site-packages/umap/nndescent.py", line 47:
@numba.njit(parallel=True)
def nn_descent(
^

  self.func_ir.loc))
/home/liz3/env/lib/python3.6/site-packages/umap/umap_.py:349: NumbaWarning:
Compilation is falling back to object mode WITH looplifting enabled because Function "fuzzy_simplicial_set" failed type inference due to: Untyped global name 'nearest_neighbors': cannot determine Numba type of

File "env/lib/python3.6/site-packages/umap/umap_.py", line 467:
def fuzzy_simplicial_set(
    if knn_indices is None or knn_dists is None:
        knn_indices, knn_dists, _ = nearest_neighbors(
        ^

  @numba.jit()
/home/liz3/env/lib/python3.6/site-packages/numba/compiler.py:725: NumbaWarning: Function "fuzzy_simplicial_set" was compiled in object mode without forceobj=True.

File "env/lib/python3.6/site-packages/umap/umap_.py", line 350:
@numba.jit()
def fuzzy_simplicial_set(
^

  self.func_ir.loc))
/home/liz3/env/lib/python3.6/site-packages/numba/compiler.py:734: NumbaDeprecationWarning:
Fall-back from the nopython compilation path to the object mode compilation path has been detected, this is deprecated behaviour.

For more information visit http://numba.pydata.org/numba-doc/latest/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit

File "env/lib/python3.6/site-packages/umap/umap_.py", line 350:
@numba.jit()
def fuzzy_simplicial_set(
^

  warnings.warn(errors.NumbaDeprecationWarning(msg, self.func_ir.loc))
OrderedDict([('neighbors', {'params': {'n_neighbors': 100, 'method': 'umap', 'metric': 'euclidean'}, 'distances': <18213x18213 sparse matrix of type '' with 1803087 stored elements in Compressed Sparse Row format>, 'connectivities': <18213x18213 sparse matrix of type '' with 2667882 stored elements in Compressed Sparse Row format>}), ('iroot', 0)])
WARNING: Trying to run `tl.dpt` without prior call of `tl.diffmap`. Falling back to `tl.diffmap` with default parameters.
WARNING: shifting branching point away from maximal kendall-tau correlation (suppress this with `allow_kendall_tau_shift=False`)
WARNING: shifting branching point away from maximal kendall-tau correlation (suppress this with `allow_kendall_tau_shift=False`)
WARNING: detected group with only [] cells
```
Traceback

```pytb
ValueError                                Traceback (most recent call last)
~/diffusion_map.py in
     57 adata.uns['iroot'] = 0
     58 print(adata.uns)
---> 59 sc.tl.dpt(adata, n_branchings=2)
     60 sc.pl.diffmap(adata, color='dpt_pseudotime', projection='2d')

~/env/lib/python3.6/site-packages/scanpy/tools/_dpt.py in dpt(adata, n_dcs, n_branchings, min_group_size, allow_kendall_tau_shift, copy)
    128     # detect branchings and partition the data into segments
    129     if n_branchings > 0:
--> 130         dpt.branchings_segments()
    131         adata.obs['dpt_groups'] = pd.Categorical(
    132             values=dpt.segs_names.astype('U'),

~/env/lib/python3.6/site-packages/scanpy/tools/_dpt.py in branchings_segments(self)
    187         for each segment.
    188         """
--> 189         self.detect_branchings()
    190         self.postprocess_segments()
    191         self.set_segs_names()

~/env/lib/python3.6/site-packages/scanpy/tools/_dpt.py in detect_branchings(self)
    262                 segs_connects,
    263                 segs_undecided,
--> 264                 segs_adjacency, iseg, tips3)
    265         # store as class members
    266         self.segs = segs

~/env/lib/python3.6/site-packages/scanpy/tools/_dpt.py in detect_branching(self, segs, segs_tips, segs_connects, segs_undecided, segs_adjacency, iseg, tips3)
    476         # branching on the segment, return the list ssegs of segments that
    477         # are defined by splitting this segment
--> 478         result = self._detect_branching(Dseg, tips3, seg)
    479         ssegs, ssegs_tips, ssegs_adjacency, ssegs_connects, trunk = result
    480         # map back to global indices

~/env/lib/python3.6/site-packages/scanpy/tools/_dpt.py in _detect_branching(self, Dseg, tips, seg_reference)
    646             if len(np.flatnonzero(newseg)) <= 1:
    647                 logg.warning(f'detected group with only {np.flatnonzero(newseg)} cells')
--> 648             secondtip = newseg[np.argmax(Dseg[tips[inewseg]][newseg])]
    649             ssegs_tips.append([tips[inewseg], secondtip])
    650         undecided_cells = np.arange(Dseg.shape[0], dtype=int)[nonunique]

~/env/lib/python3.6/site-packages/numpy/core/fromnumeric.py in argmax(a, axis, out)
   1101
   1102     """
-> 1103     return _wrapfunc(a, 'argmax', axis=axis, out=out)
   1104
   1105

~/env/lib/python3.6/site-packages/numpy/core/fromnumeric.py in _wrapfunc(obj, method, *args, **kwds)
     54 def _wrapfunc(obj, method, *args, **kwds):
     55     try:
---> 56         return getattr(obj, method)(*args, **kwds)
     57
     58     # An AttributeError occurs if the object does not have

ValueError: attempt to get argmax of an empty sequence
```
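
The final frames of the traceback can be reproduced in isolation: `np.argmax` raises this `ValueError` whenever it is given an empty array, which matches the preceding warning that a group with `[]` cells was detected. The sketch below uses made-up toy data and is not scanpy's code; `farthest_cell_or_none` is a hypothetical helper showing one possible guard, and treating the segment as an integer index array is an assumption.

```python
import numpy as np

# Toy distances from one tip cell to five cells (made-up values).
dists_from_tip = np.array([0.0, 0.4, 0.9, 0.3, 0.7])


def farthest_cell_or_none(dists, seg):
    """Return the cell index in `seg` farthest from the tip, or None if `seg` is empty."""
    seg = np.asarray(seg, dtype=int)
    if seg.size == 0:
        return None  # nothing to choose from; the caller decides what to do
    return int(seg[np.argmax(dists[seg])])


# An empty segment triggers the same error as in the traceback.
empty_seg = np.array([], dtype=int)
try:
    np.argmax(dists_from_tip[empty_seg])
except ValueError as err:
    print(f"ValueError: {err}")

# With the guard, a non-empty segment yields an index and an empty one yields None.
print(farthest_cell_or_none(dists_from_tip, [1, 3, 4]))
print(farthest_cell_or_none(dists_from_tip, empty_seg))
```

Whether an empty segment should be skipped, merged, or reported as an error inside `_detect_branching` is a design decision this sketch does not settle.
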
biubiu-1 commented 5 years ago

I ran into the Numba `parallel=True` warning as well.

scottgigante commented 4 years ago

I also got this same ValueError (and the numba warnings too).

dburkhardt commented 4 years ago

This issue still persists. I've created a Colab notebook that reproduces it on a dataset we subsample to 6000 cells:

https://colab.research.google.com/drive/1QrnDFZ7nDNOLx9gr92eknhKShd2aTIdN

@gokceneraslan can you please throw a "bug" tag on this issue so it gets put in the queue?

ivirshup commented 4 years ago

Here's an AnnData object that reproduces the error when you call sc.tl.dpt(adata, n_branchings=N) with N > 3.
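
To pin down which values of `n_branchings` fail on a given object, the call can be swept over a range while recording any exception. This is only a sketch: `sweep_parameter` is a hypothetical helper, and `fake_dpt` merely imitates the behaviour reported here (failure for N > 3) so that the example runs without scanpy or the data.

```python
def sweep_parameter(run, values):
    """Call run(value) for each value.

    Returns a dict mapping value -> None on success, or an
    'ExceptionType: message' string if the call raised.
    """
    outcomes = {}
    for value in values:
        try:
            run(value)
        except Exception as err:
            outcomes[value] = f"{type(err).__name__}: {err}"
        else:
            outcomes[value] = None
    return outcomes


def fake_dpt(n_branchings):
    # Stand-in that imitates the reported behaviour (an assumption, not scanpy).
    if n_branchings > 3:
        raise ValueError("attempt to get argmax of an empty sequence")


outcomes = sweep_parameter(fake_dpt, range(6))
for n, outcome in outcomes.items():
    print(n, "ok" if outcome is None else outcome)
```

With scanpy and the object loaded, `run` could be something like `lambda n: sc.tl.dpt(adata.copy(), n_branchings=n)`, copying so that every run starts from the same state.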

@falexwolf, maybe you could help with diagnosis here?

cdedonno commented 4 years ago

Hi everyone, I am having the same issue, with the exact same traceback. Was this ever solved or addressed?