frankligy / scTriangulate

scTriangulate is a Python package to mix-and-match conflicting clustering results in single cell analysis and generate reconciled clustering solutions
MIT License

Bug in Tutorial #24

Open Shunya15 opened 9 months ago

Shunya15 commented 9 months ago

Hello,

I'm going through the tutorial (https://sctriangulate.readthedocs.io/en/latest/tutorial.html) with the provided h5 file (http://altanalyze.org/scTriangulate/scRNASeq/pbmc_10k_v3.h5). I'm using a Jupyter notebook on a server with 8 cores and 128 GB of memory.

Python version: 3.9.18 | packaged by conda-forge | (main, Dec 23 2023, 16:33:10) [GCC 12.3.0]

Here is the error I ran into when running this line:

adata = scanpy_recipe(adata,is_log=False,resolutions=[1,2,3],pca_n_comps=50,n_top_genes=3000)


/usr/local/anaconda3-2020/envs/sctriangulate_env/lib/python3.9/site-packages/scanpy-1.9.6-py3.9.egg/scanpy/preprocessing/_highly_variable_genes.py:220: FutureWarning: The default of observed=False is deprecated and will be changed to True in a future version of pandas. Pass observed=False to retain current behavior or observed=True to adopt the future default and silence this warning.
  disp_grouped = df.groupby('mean_bin')['dispersions']

---------------------------------------------------------------------------
TypingError                               Traceback (most recent call last)
Cell In[12], line 1
----> 1 adata = scanpy_recipe(adata,is_log=False,resolutions=[1,2,3],pca_n_comps=50,n_top_genes=3000)

File /usr/local/anaconda3-2020/envs/sctriangulate_env/lib/python3.9/site-packages/sctriangulate-0.13.0-py3.9.egg/sctriangulate/preprocessing.py:466, in scanpy_recipe(adata, species, is_log, resolutions, modality, umap, save, pca_n_comps, n_top_genes)
    464 sc.pp.scale(adata,max_value=10)
    465 sc.tl.pca(adata,n_comps=pca_n_comps)
--> 466 sc.pp.neighbors(adata)
    467 for resolution in resolutions:
    468     sc.tl.leiden(adata,resolution=resolution,key_added='sctri_{}_leiden_{}'.format(modality,resolution))

File /usr/local/anaconda3-2020/envs/sctriangulate_env/lib/python3.9/site-packages/scanpy-1.9.6-py3.9.egg/scanpy/neighbors/__init__.py:148, in neighbors(adata, n_neighbors, n_pcs, use_rep, knn, random_state, method, metric, metric_kwds, key_added, copy)
    146     adata._init_as_actual(adata.copy())
    147 neighbors = Neighbors(adata)
--> 148 neighbors.compute_neighbors(
    149     n_neighbors=n_neighbors,
    150     knn=knn,
    151     n_pcs=n_pcs,
    152     use_rep=use_rep,
    153     method=method,
    154     metric=metric,
    155     metric_kwds=metric_kwds,
    156     random_state=random_state,
    157 )
    159 if key_added is None:
    160     key_added = 'neighbors'

File /usr/local/anaconda3-2020/envs/sctriangulate_env/lib/python3.9/site-packages/scanpy-1.9.6-py3.9.egg/scanpy/neighbors/__init__.py:803, in Neighbors.compute_neighbors(self, n_neighbors, knn, n_pcs, use_rep, method, random_state, write_knn_indices, metric, metric_kwds)
    801     X = pairwise_distances(X, metric=metric, **metric_kwds)
    802     metric = 'precomputed'
--> 803 knn_indices, knn_distances, forest = compute_neighbors_umap(
    804     X, n_neighbors, random_state, metric=metric, metric_kwds=metric_kwds
    805 )
    806 # very cautious here
    807 try:

File /usr/local/anaconda3-2020/envs/sctriangulate_env/lib/python3.9/site-packages/scanpy-1.9.6-py3.9.egg/scanpy/neighbors/__init__.py:314, in compute_neighbors_umap(X, n_neighbors, random_state, metric, metric_kwds, angular, verbose)
    310     from umap.umap_ import nearest_neighbors
    312 random_state = check_random_state(random_state)
--> 314 knn_indices, knn_dists, forest = nearest_neighbors(
    315     X,
    316     n_neighbors,
    317     random_state=random_state,
    318     metric=metric,
    319     metric_kwds=metric_kwds,
    320     angular=angular,
    321     verbose=verbose,
    322 )
    324 return knn_indices, knn_dists, forest

File /usr/local/anaconda3-2020/envs/sctriangulate_env/lib/python3.9/site-packages/umap_learn-0.5.5-py3.9.egg/umap/umap_.py:329, in nearest_neighbors(X, n_neighbors, metric, metric_kwds, angular, random_state, low_memory, use_pynndescent, n_jobs, verbose)
    326     n_trees = min(64, 5 + int(round((X.shape[0]) ** 0.5 / 20.0)))
    327     n_iters = max(5, int(round(np.log2(X.shape[0]))))
--> 329     knn_search_index = NNDescent(
    330         X,
    331         n_neighbors=n_neighbors,
    332         metric=metric,
    333         metric_kwds=metric_kwds,
    334         random_state=random_state,
    335         n_trees=n_trees,
    336         n_iters=n_iters,
    337         max_candidates=60,
    338         low_memory=low_memory,
    339         n_jobs=n_jobs,
    340         verbose=verbose,
    341         compressed=False,
    342     )
    343     knn_indices, knn_dists = knn_search_index.neighbor_graph
    345 if verbose:

File /usr/local/anaconda3-2020/envs/sctriangulate_env/lib/python3.9/site-packages/pynndescent-0.5.11-py3.9.egg/pynndescent/pynndescent_.py:931, in NNDescent.__init__(self, data, metric, metric_kwds, n_neighbors, n_trees, leaf_size, pruning_degree_multiplier, diversify_prob, n_search_trees, tree_init, init_graph, init_dist, random_state, low_memory, max_candidates, max_rptree_depth, n_iters, delta, n_jobs, compressed, parallel_batch_queries, verbose)
    928     if verbose:
    929         print(ts(), "NN descent for", str(n_iters), "iterations")
--> 931     self._neighbor_graph = nn_descent(
    932         self._raw_data,
    933         self.n_neighbors,
    934         self.rng_state,
    935         effective_max_candidates,
    936         self._distance_func,
    937         self.n_iters,
    938         self.delta,
    939         low_memory=self.low_memory,
    940         rp_tree_init=True,
    941         init_graph=_init_graph,
    942         leaf_array=leaf_array,
    943         verbose=verbose,
    944     )
    946 if np.any(self._neighbor_graph[0] < 0):
    947     warn(
    948         "Failed to correctly find n_neighbors for some samples."
    949         " Results may be less than ideal. Try re-running with"
    950         " different parameters."
    951     )

File /usr/local/anaconda3-2020/envs/sctriangulate_env/lib/python3.9/site-packages/numba-0.59.0rc1-py3.9-linux-x86_64.egg/numba/core/dispatcher.py:468, in _DispatcherBase._compile_for_args(self, *args, **kws)
    464         msg = (f"{str(e).rstrip()} \n\nThis error may have been caused "
    465                f"by the following argument(s):\n{args_str}\n")
    466         e.patch_message(msg)
--> 468     error_rewrite(e, 'typing')
    469 except errors.UnsupportedError as e:
    470     # Something unsupported is present in the user code, add help info
    471     error_rewrite(e, 'unsupported_error')

File /usr/local/anaconda3-2020/envs/sctriangulate_env/lib/python3.9/site-packages/numba-0.59.0rc1-py3.9-linux-x86_64.egg/numba/core/dispatcher.py:409, in _DispatcherBase._compile_for_args.<locals>.error_rewrite(e, issue_type)
    407     raise e
    408 else:
--> 409     raise e.with_traceback(None)

TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Failed in nopython mode pipeline (step: nopython frontend)
Failed in nopython mode pipeline (step: nopython frontend)
Internal error at <numba.core.typeinfer.CallConstraint object at 0x2b37c07c41c0>.
Failed in nopython mode pipeline (step: parfor prelowering)
'NoneType' object has no attribute 'name'
During: resolving callee type: type(CPUDispatcher(<function apply_graph_updates_low_memory at 0x2b37af11d310>))
During: typing of call at /usr/local/anaconda3-2020/envs/sctriangulate_env/lib/python3.9/site-packages/pynndescent-0.5.11-py3.9.egg/pynndescent/pynndescent_.py (228)

Enable logging at debug level for details.

File "../../../../usr/local/anaconda3-2020/envs/sctriangulate_env/lib/python3.9/site-packages/pynndescent-0.5.11-py3.9.egg/pynndescent/pynndescent_.py", line 228:
def process_candidates(
    <source elided>

        c += apply_graph_updates_low_memory(current_graph, updates, n_threads)
        ^

During: resolving callee type: type(CPUDispatcher(<function process_candidates at 0x2b37afc1f8b0>))
During: typing of call at /usr/local/anaconda3-2020/envs/sctriangulate_env/lib/python3.9/site-packages/pynndescent-0.5.11-py3.9.egg/pynndescent/pynndescent_.py (258)

During: resolving callee type: type(CPUDispatcher(<function process_candidates at 0x2b37afc1f8b0>))
During: typing of call at /usr/local/anaconda3-2020/envs/sctriangulate_env/lib/python3.9/site-packages/pynndescent-0.5.11-py3.9.egg/pynndescent/pynndescent_.py (258)

File "../../../../usr/local/anaconda3-2020/envs/sctriangulate_env/lib/python3.9/site-packages/pynndescent-0.5.11-py3.9.egg/pynndescent/pynndescent_.py", line 258:
def nn_descent_internal_low_memory_parallel(
    <source elided>

        c = process_candidates(
        ^

During: resolving callee type: type(CPUDispatcher(<function nn_descent_internal_low_memory_parallel at 0x2b37afc1fa60>))
During: typing of call at /usr/local/anaconda3-2020/envs/sctriangulate_env/lib/python3.9/site-packages/pynndescent-0.5.11-py3.9.egg/pynndescent/pynndescent_.py (358)

During: resolving callee type: type(CPUDispatcher(<function nn_descent_internal_low_memory_parallel at 0x2b37afc1fa60>))
During: typing of call at /usr/local/anaconda3-2020/envs/sctriangulate_env/lib/python3.9/site-packages/pynndescent-0.5.11-py3.9.egg/pynndescent/pynndescent_.py (358)

File "../../../../usr/local/anaconda3-2020/envs/sctriangulate_env/lib/python3.9/site-packages/pynndescent-0.5.11-py3.9.egg/pynndescent/pynndescent_.py", line 358:
def nn_descent(
    <source elided>
    if low_memory:
        nn_descent_internal_low_memory_parallel(
        ^

I also tried with 300 GB of memory, but the same error message appeared.

frankligy commented 9 months ago

Hi @Shunya15,

I did a quick Google search and found what looks like a similar problem (https://github.com/scverse/scanpy/issues/1652). The suggestion there was to update numba to v0.53, which may solve the problem.
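
To compare an environment against that issue, it can help to list the installed versions of the packages that appear in the traceback (which shows numba 0.59.0rc1, a release candidate). The following is a minimal sketch, not part of scTriangulate: the distribution names are taken from the egg paths in the traceback, `installed_versions` and `looks_like_prerelease` are hypothetical helper names, and the pre-release check is only a rough heuristic. It needs Python 3.8 or later for `importlib.metadata`.

```python
import re
from importlib import metadata

# Distribution names as they appear in the egg paths of the traceback.
PACKAGES = ["sctriangulate", "scanpy", "umap-learn", "pynndescent", "numba"]


def installed_versions(packages=PACKAGES):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def looks_like_prerelease(version):
    """Rough heuristic: flag tags such as '0.59.0rc1', '1.0b2' or '2.0.dev3'."""
    return bool(re.search(r"\d(a|b|rc)\d+|dev\d*", version))


if __name__ == "__main__":
    for name, version in installed_versions().items():
        if version is None:
            print(f"{name}: not installed")
        elif looks_like_prerelease(version):
            print(f"{name}: {version} (looks like a pre-release)")
        else:
            print(f"{name}: {version}")
```

Pasting this output into the issue makes it easier to see which package versions differ from a working setup.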

Just some personal thoughts: I find it's becoming harder and harder to maintain one set of Python dependencies that is guaranteed to work for every Python version, and manually tweaking and debugging them is a big headache.

For myself, I always use a fresh conda environment with Python 3.7 for scTriangulate, and so far it works smoothly about 95% of the time. Other Python versions should work as well, but they may require some tweaking for your environment.
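
After changing an environment, one way to check the failing step without rerunning the whole tutorial is to call pynndescent directly, since the traceback ends inside `NNDescent.__init__`. This is a minimal sketch under stated assumptions: numpy and pynndescent are importable, the random matrix only stands in for the PCA output, and `neighbors_smoke_test` is a hypothetical helper name, not part of scTriangulate or scanpy.

```python
def neighbors_smoke_test(n_cells=1000, n_dims=50, n_neighbors=15, seed=0):
    """Build a small NNDescent index on random data.

    Returns a (status, detail) tuple where status is "ok", "failed",
    or "skipped" (when numpy or pynndescent is not installed).
    """
    try:
        import numpy as np
        from pynndescent import NNDescent
    except ImportError as exc:
        return "skipped", str(exc)

    # Random data standing in for the PCA matrix used by sc.pp.neighbors.
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_cells, n_dims)).astype(np.float32)

    try:
        # Constructing the index triggers the numba compilation that
        # fails in the traceback above.
        index = NNDescent(X, n_neighbors=n_neighbors, random_state=seed)
        indices, _distances = index.neighbor_graph
    except Exception as exc:  # numba's TypingError is an Exception subclass
        return "failed", f"{type(exc).__name__}: {exc}"

    return "ok", f"neighbor index array of shape {indices.shape}"


if __name__ == "__main__":
    status, detail = neighbors_smoke_test()
    print(status, "-", detail)
```

If this reports "failed" with a TypingError, the problem lies in the numba/pynndescent combination rather than in scTriangulate or the tutorial data.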

Let me know if I can help with anything else!

best, Frank