aertslab / scenicplus

SCENIC+ is a Python package to build gene regulatory networks (GRNs) using combined or separate single-cell gene expression (scRNA-seq) and single-cell chromatin accessibility (scATAC-seq) data.

ValueError: Sample larger than population or is negative #246

Closed Citugulia40 closed 10 months ago

Citugulia40 commented 11 months ago

Hi,

Thanks for developing scenicplus.

I am at the last part of the tutorial, which is:

from scenicplus.wrappers.run_scenicplus import run_scenicplus
try:
    run_scenicplus(
        scplus_obj = scplus_obj,
        variable = ['GEX_cell_type_new'],
        species = 'hsapiens',
        assembly = 'hg38',
        tf_file = 'utoronto_human_tfs_v_1.01.txt',
        save_path = os.path.join(work_dir, 'scenicplus'),
        biomart_host = biomart_host,
        upstream = [1000, 150000],
        downstream = [1000, 150000],
        calculate_TF_eGRN_correlation = True,
        calculate_DEGs_DARs = True,
        export_to_loom_file = False,
        export_to_UCSC_file = True,
        path_bedToBigBed = 'data',
        n_cpu = 12,
        _temp_dir = os.path.join(tmpDir, 'ray_spill'))
except Exception as e:
    #in case of failure, still save the object
    dill.dump(scplus_obj, open(os.path.join(work_dir, 'scplus_obj.pkl'), 'wb'), protocol=-1)
    raise(e)

and I am getting the following error:

Traceback (most recent call last):
  File "/home/ccitu/miniconda3/envs/scenicplus/lib/python3.8/site-packages/zmq/eventloop/zmqstream.py", line 556, in _run_callback
    callback(*args, **kwargs)
  File "/home/ccitu/miniconda3/envs/scenicplus/lib/python3.8/site-packages/zmq/eventloop/zmqstream.py", line 278, in stream_callback
    return callback(self, msg)
  File "/home/ccitu/miniconda3/envs/scenicplus/lib/python3.8/site-packages/notebook/services/kernels/handlers.py", line 467, in _on_zmq_reply
    msg = self.session.deserialize(fed_msg_list)
  File "/home/ccitu/miniconda3/envs/scenicplus/lib/python3.8/site-packages/jupyter_client/session.py", line 1057, in deserialize
    self._add_digest(signature)
  File "/home/ccitu/miniconda3/envs/scenicplus/lib/python3.8/site-packages/jupyter_client/session.py", line 995, in _add_digest
    self._cull_digest_history()
  File "/home/ccitu/miniconda3/envs/scenicplus/lib/python3.8/site-packages/jupyter_client/session.py", line 1007, in _cull_digest_history
    to_cull = random.sample(tuple(sorted(self.digest_history)), n_to_cull)
  File "/home/ccitu/miniconda3/envs/scenicplus/lib/python3.8/random.py", line 364, in sample
    raise ValueError("Sample larger than population or is negative")
ValueError: Sample larger than population or is negative

Is there any problem with my object? How can I solve this?

Please help.

Thanks

SeppeDeWinter commented 11 months ago

Hi @Citugulia40

This error occurs because some of your clusters contain very few cells (fewer than 5). You can circumvent this step by setting calculate_TF_eGRN_correlation to False in run_scenicplus.

This is fixed in the development branch.

Best,

Seppe
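
A minimal sketch for spotting clusters below that threshold before rerunning the pipeline. It assumes the per-cell annotations are available as a pandas DataFrame with one row per cell; the toy table and the helper name `small_clusters` are hypothetical and only stand in for the real annotation table, which is not shown in this thread. The column name `GEX_cell_type_new` is taken from the call above.

```python
import pandas as pd


def small_clusters(cell_metadata, column, min_cells=5):
    """Return the clusters in `column` that have fewer than `min_cells` cells."""
    counts = cell_metadata[column].value_counts()
    return counts[counts < min_cells]


# Toy stand-in for the per-cell annotation table (hypothetical data).
toy_metadata = pd.DataFrame(
    {"GEX_cell_type_new": ["T_cell"] * 8 + ["B_cell"] * 6 + ["rare_type"] * 2}
)

too_small = small_clusters(toy_metadata, "GEX_cell_type_new")
print(too_small)
```

If this reports any clusters, one option is to merge or drop them; the other is to skip the correlation step as suggested above.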

Citugulia40 commented 11 months ago

Thank you so much for your reply.

If I set calculate_TF_eGRN_correlation = False, will I still be able to do all the downstream steps in the tutorial?

SeppeDeWinter commented 11 months ago

Hi @Citugulia40

Yes, this step is not required. Also, you have already run almost the full pipeline, so most of the results should already be stored in your scenicplus object.

All the best,

Seppe

Citugulia40 commented 11 months ago

Thank you so much.

Citugulia40 commented 11 months ago

Hi, I have now run into another error:

2023-10-26 08:42:06,630 SCENIC+_wrapper INFO     /data2/ccitu/andi_multiome_filtered/scenicplus folder already exists.
2023-10-26 08:42:06,631 SCENIC+_wrapper INFO     Binarizing eGRNs AUC
2023-10-26 08:51:19,434 SCENIC+_wrapper INFO     Making eGRNs AUC UMAP
---------------------------------------------------------------------------
TypingError                               Traceback (most recent call last)
Cell In[12], line 23
     20 except Exception as e:
     21     #in case of failure, still save the object
     22     dill.dump(scplus_obj, open(os.path.join(work_dir, 'scplus_obj.pkl'), 'wb'), protocol=-1)
---> 23     raise(e)

Cell In[12], line 3
      1 from scenicplus.wrappers.run_scenicplus import run_scenicplus
      2 try:
----> 3     run_scenicplus(
      4         scplus_obj = scplus_obj,
      5         variable = ['GEX_cell_type_new'],
      6         species = 'hsapiens',
      7         assembly = 'hg38',
      8         tf_file = '/data2/ccitu/andi_multiome_filtered/utoronto_human_tfs_v_1.01.txt',
      9         save_path = os.path.join(work_dir, 'scenicplus'),
     10         biomart_host = biomart_host,
     11         upstream = [1000, 150000],
     12         downstream = [1000, 150000],
     13         calculate_TF_eGRN_correlation = False,
     14         calculate_DEGs_DARs = True,
     15         export_to_loom_file = False,
     16         export_to_UCSC_file = True,
     17         path_bedToBigBed = 'data',
     18         n_cpu = 12,
     19         _temp_dir = os.path.join(tmpDir, 'ray_spill'))
     20 except Exception as e:
     21     #in case of failure, still save the object
     22     dill.dump(scplus_obj, open(os.path.join(work_dir, 'scplus_obj.pkl'), 'wb'), protocol=-1)

File /data2/ccitu/software/scenicplus/src/scenicplus/wrappers/run_scenicplus.py:290, in run_scenicplus(scplus_obj, variable, species, assembly, tf_file, save_path, biomart_host, upstream, downstream, region_ranking, gene_ranking, simplified_eGRN, calculate_TF_eGRN_correlation, calculate_DEGs_DARs, export_to_loom_file, export_to_UCSC_file, tree_structure, path_bedToBigBed, n_cpu, _temp_dir, save_partial, **kwargs)
    288 if 'eRegulons_UMAP' not in scplus_obj.dr_cell.keys():
    289     log.info('Making eGRNs AUC UMAP')
--> 290     run_eRegulons_umap(scplus_obj,
    291                scale=True, signature_keys=['Gene_based', 'Region_based'])
    292 if 'eRegulons_tSNE' not in scplus_obj.dr_cell.keys():
    293     log.info('Making eGRNs AUC tSNE')

File /data2/ccitu/software/scenicplus/src/scenicplus/dimensionality_reduction.py:294, in run_eRegulons_umap(scplus_obj, scale, auc_key, signature_keys, reduction_name, random_state, selected_regulons, selected_cells, **kwargs)
    291 data_mat = data_mat.T.fillna(0)
    293 reducer = umap.UMAP(random_state=random_state, **kwargs)
--> 294 embedding = reducer.fit_transform(data_mat)
    295 dr = pd.DataFrame(
    296     embedding,
    297     index=data_names,
    298     columns=[
    299         'UMAP_1',
    300         'UMAP_2'])
    301 if not hasattr(scplus_obj, 'dr_cell'):

File ~/miniconda3/envs/scenicplus/lib/python3.8/site-packages/umap/umap_.py:2772, in UMAP.fit_transform(self, X, y)
   2742 def fit_transform(self, X, y=None):
   2743     """Fit X into an embedded space and return that transformed
   2744     output.
   2745 
   (...)
   2770         Local radii of data points in the embedding (log-transformed).
   2771     """
-> 2772     self.fit(X, y)
   2773     if self.transform_mode == "embedding":
   2774         if self.output_dens:

File ~/miniconda3/envs/scenicplus/lib/python3.8/site-packages/umap/umap_.py:2516, in UMAP.fit(self, X, y)
   2510     nn_metric = self._input_distance_func
   2511 if self.knn_dists is None:
   2512     (
   2513         self._knn_indices,
   2514         self._knn_dists,
   2515         self._knn_search_index,
-> 2516     ) = nearest_neighbors(
   2517         X[index],
   2518         self._n_neighbors,
   2519         nn_metric,
   2520         self._metric_kwds,
   2521         self.angular_rp_forest,
   2522         random_state,
   2523         self.low_memory,
   2524         use_pynndescent=True,
   2525         n_jobs=self.n_jobs,
   2526         verbose=self.verbose,
   2527     )
   2528 else:
   2529     self._knn_indices = self.knn_indices

File ~/miniconda3/envs/scenicplus/lib/python3.8/site-packages/umap/umap_.py:328, in nearest_neighbors(X, n_neighbors, metric, metric_kwds, angular, random_state, low_memory, use_pynndescent, n_jobs, verbose)
    325     n_trees = min(64, 5 + int(round((X.shape[0]) ** 0.5 / 20.0)))
    326     n_iters = max(5, int(round(np.log2(X.shape[0]))))
--> 328     knn_search_index = NNDescent(
    329         X,
    330         n_neighbors=n_neighbors,
    331         metric=metric,
    332         metric_kwds=metric_kwds,
    333         random_state=random_state,
    334         n_trees=n_trees,
    335         n_iters=n_iters,
    336         max_candidates=60,
    337         low_memory=low_memory,
    338         n_jobs=n_jobs,
    339         verbose=verbose,
    340         compressed=False,
    341     )
    342     knn_indices, knn_dists = knn_search_index.neighbor_graph
    344 if verbose:

File ~/miniconda3/envs/scenicplus/lib/python3.8/site-packages/pynndescent/pynndescent_.py:921, in NNDescent.__init__(self, data, metric, metric_kwds, n_neighbors, n_trees, leaf_size, pruning_degree_multiplier, diversify_prob, n_search_trees, tree_init, init_graph, init_dist, random_state, low_memory, max_candidates, n_iters, delta, n_jobs, compressed, parallel_batch_queries, verbose)
    918     if verbose:
    919         print(ts(), "NN descent for", str(n_iters), "iterations")
--> 921     self._neighbor_graph = nn_descent(
    922         self._raw_data,
    923         self.n_neighbors,
    924         self.rng_state,
    925         effective_max_candidates,
    926         self._distance_func,
    927         self.n_iters,
    928         self.delta,
    929         low_memory=self.low_memory,
    930         rp_tree_init=True,
    931         init_graph=_init_graph,
    932         leaf_array=leaf_array,
    933         verbose=verbose,
    934     )
    936 if np.any(self._neighbor_graph[0] < 0):
    937     warn(
    938         "Failed to correctly find n_neighbors for some samples."
    939         " Results may be less than ideal. Try re-running with"
    940         " different parameters."
    941     )

File ~/miniconda3/envs/scenicplus/lib/python3.8/site-packages/numba/core/dispatcher.py:468, in _DispatcherBase._compile_for_args(self, *args, **kws)
    464         msg = (f"{str(e).rstrip()} \n\nThis error may have been caused "
    465                f"by the following argument(s):\n{args_str}\n")
    466         e.patch_message(msg)
--> 468     error_rewrite(e, 'typing')
    469 except errors.UnsupportedError as e:
    470     # Something unsupported is present in the user code, add help info
    471     error_rewrite(e, 'unsupported_error')

File ~/miniconda3/envs/scenicplus/lib/python3.8/site-packages/numba/core/dispatcher.py:409, in _DispatcherBase._compile_for_args.<locals>.error_rewrite(e, issue_type)
    407     raise e
    408 else:
--> 409     raise e.with_traceback(None)

TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Failed in nopython mode pipeline (step: nopython frontend)
Untyped global name 'print': Cannot determine Numba type of <class 'function'>

File "../../../home/ccitu/miniconda3/envs/scenicplus/lib/python3.8/site-packages/pynndescent/pynndescent_.py", line 252:
def nn_descent_internal_low_memory_parallel(
    <source elided>
        if verbose:
            print("\t", n + 1, " / ", n_iters)
            ^

During: resolving callee type: type(CPUDispatcher(<function nn_descent_internal_low_memory_parallel at 0x7fcac7180a60>))
During: typing of call at /home/ccitu/miniconda3/envs/scenicplus/lib/python3.8/site-packages/pynndescent/pynndescent_.py (358)

During: resolving callee type: type(CPUDispatcher(<function nn_descent_internal_low_memory_parallel at 0x7fcac7180a60>))
During: typing of call at /home/ccitu/miniconda3/envs/scenicplus/lib/python3.8/site-packages/pynndescent/pynndescent_.py (358)

File "../../../home/ccitu/miniconda3/envs/scenicplus/lib/python3.8/site-packages/pynndescent/pynndescent_.py", line 358:
def nn_descent(
    <source elided>
    if low_memory:
        nn_descent_internal_low_memory_parallel(

Will I need to set nopython=True in any of these files?

SeppeDeWinter commented 11 months ago

Hi @Citugulia40

What is your numba version?

All the best,

Seppe
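
One way to collect the versions asked for here is to query the installed package metadata. This is a sketch, not part of SCENIC+: the helper name `installed_versions` is hypothetical, and the package list simply covers the three libraries that appear in the traceback above (`umap-learn` is the distribution name for the `umap` module).

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


versions = installed_versions(["numba", "pynndescent", "umap-learn"])
for name, version in versions.items():
    print(f"{name}: {version or 'not installed'}")
```

Run it in the same environment (here, the `scenicplus` conda environment) as the notebook kernel, so the reported versions match the ones that produced the error.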

daccachejoe commented 10 months ago

Hi, I wanted to ask for an update on this issue, as I am encountering the same error. My numba version is 0.57.1; is that correct? Thanks!

SeppeDeWinter commented 10 months ago

Hi @daccachejoe

Please see this issue: https://github.com/aertslab/scenicplus/issues/203

All the best,

Seppe