WangPeng-Lab / scGCO

Single-cell Graph Cuts Optimization
MIT License

errors when calling identify_spatial_genes #3

Open Cai-98 opened 1 year ago

Cai-98 commented 1 year ago

While using scGCO, I encountered the following error.

The code is running on Python 3.9:

gmmDict=scGCO.gmm_model(data_norm)
result_df= identify_spatial_genes(locs, data_norm, cellGraph ,gmmDict)

Traceback:

---------------------------------------------------------------------------
RemoteTraceback                           Traceback (most recent call last)
RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/share/anaconda3/envs/SpaBench/lib/python3.9/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/share/anaconda3/envs/SpaBench/lib/python3.9/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
  File "/share/anaconda3/envs/SpaBench/lib/python3.9/site-packages/parmap/parmap.py", line 105, in _func_star_many
    return func_items_args[0](*list(func_items_args[1]) + func_items_args[2],
  File "/share/anaconda3/envs/SpaBench/lib/python3.9/site-packages/scGCO/Graph_cut.py", line 640, in compute_spatial_genomewise_optimize
    newLabels, thresholds,label_pred = cut_graph_general_otsu(cellGraph, exp, unary_scale_factor,
  File "/share/anaconda3/envs/SpaBench/lib/python3.9/site-packages/scGCO/Graph_cut.py", line 1348, in cut_graph_general_otsu
    pair_size = thresholds.shape[0] + 1
IndexError: tuple index out of range
"""

The above exception was the direct cause of the following exception:

IndexError                                Traceback (most recent call last)
Cell In [14], line 3
      1 import scGCO
      2 gmmDict=scGCO.gmm_model(data_norm)
----> 3 result_df= identify_spatial_genes(locs, data_norm, cellGraph ,gmmDict)

File /share/anaconda3/envs/SpaBench/lib/python3.9/site-packages/scGCO/Graph_cut.py:729, in identify_spatial_genes(locs, data_norm, cellGraph, gmmDict, smooth_factor, unary_scale_factor, label_cost, algorithm, ncores)
    716     ttt = np.array_split(data_norm,num_cores,axis=1)
    717     tuples = [(l, d, c, g,ww,nn, s, u, b, a) for l, d, c, g,ww,nn, s, u, b, a in zip(
    718                                     repeat(locs, num_cores), 
    719                                     ttt,
   (...)
    726                                     repeat(label_cost, num_cores),
    727                                     repeat(algorithm, num_cores))] 
--> 729     results = parmap.starmap(compute_spatial_genomewise_optimize, tuples,
    730                                 pm_processes=num_cores, pm_pbar=True)
    732 #    pool.close()
    733 # p_values, genes, diff_p_values, exp_diff, smooth_factors, pred_labels, model_results
    734     nnn = [results[i][0] for i in np.arange(len(results))]

File /share/anaconda3/envs/SpaBench/lib/python3.9/site-packages/parmap/parmap.py:317, in starmap(function, iterables, *args, **kwargs)
    299 def starmap(function, iterables, *args, **kwargs):
    300     """ Equivalent to:
    301             >>> return ([function(x1,x2,x3,..., args[0], args[1],...) for
    302             >>>         (x1,x2,x3...) in iterable])
   (...)
    315        :type pm_pbar: bool or dict
    316     """
--> 317     return _map_or_starmap(function, iterables, args, kwargs, "starmap")

File /share/anaconda3/envs/SpaBench/lib/python3.9/site-packages/parmap/parmap.py:273, in _map_or_starmap(function, iterable, args, kwargs, map_or_starmap)
    271     _do_pbar(result, num_tasks, chunksize, tqdm_options=tqdm_options)
    272 finally:
--> 273     output = result.get()
    274     if close_pool:
    275         pool.join()

File /share/anaconda3/envs/SpaBench/lib/python3.9/multiprocessing/pool.py:771, in ApplyResult.get(self, timeout)
    769     return self._value
    770 else:
--> 771     raise self._value

IndexError: tuple index out of range

Other warnings appear when running the preprocessing steps:

/share/anaconda3/envs/SpaBench/lib/python3.9/site-packages/tqdm/auto.py:22: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html
  from .autonotebook import tqdm as notebook_tqdm
/share/anaconda3/envs/SpaBench/lib/python3.9/site-packages/pysal/lib/weights/util.py:19: UserWarning: geopandas not available. Some functionality will be disabled.
  warn('geopandas not available. Some functionality will be disabled.')
/share/anaconda3/envs/SpaBench/lib/python3.9/site-packages/pysal/model/spvcm/abstracts.py:10: UserWarning: The `dill` module is required to use the sqlite backend fully.
  from .sqlite import head_to_sql, start_sql
2022-10-23 09:05:32.689209: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
/share/anaconda3/envs/SpaBench/lib/python3.9/site-packages/scGCO/Preprocessing.py:170: FutureWarning: Support for multi-dimensional indexing (e.g. `obj[:, None]`) is deprecated and will be removed in a future version.  Convert to a numpy array before indexing instead.
  data = pd.DataFrame(data.values/normalizing_factor[:,np.newaxis], columns=data.columns, index=data.index)

Another problem is that multi-threading doesn't work on my Linux server, as mentioned in #2.

fengwanwan commented 1 year ago

Hi, thank you for your interest in our study. Have you solved the problem yet? Which scGCO version did you run?

Best wishes, Wanwan

Cai-98 commented 1 year ago

I have no idea how to solve the problem. BTW, the code works well on the demo data on Windows but not on Linux.

The scGCO version is 1.1.2.

fengwanwan commented 1 year ago

I guess the problem is caused by skimage.filters.threshold_otsu. If possible, could you run it on a tiny sampled dataset, or send the tiny dataset to me? As for the problem on Linux, it is caused by running the code on too many cores.
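A minimal sketch of how that error can arise, assuming `thresholds` reaches line 1348 of Graph_cut.py as a single scalar (skimage.filters.threshold_otsu returns one float) rather than a 1-D array. The values below are made up, and the `np.atleast_1d` guard is only a hypothetical illustration, not the scGCO code:

```python
import numpy as np

# Stand-in for a single scalar threshold (assumption: this is the kind of
# value that ends up in `thresholds` when the failing line is reached).
thresholds = np.float64(0.5)

# A 0-d value has no axes, so its shape is an empty tuple.
print(thresholds.shape)

try:
    # Same expression as the failing line in the traceback.
    pair_size = thresholds.shape[0] + 1
except IndexError as err:
    message = str(err)
    print(message)

# Hypothetical guard: promote the scalar to a 1-D array before indexing.
thresholds_1d = np.atleast_1d(thresholds)
pair_size = thresholds_1d.shape[0] + 1
print(pair_size)
```

If this reading is right, the failure depends on the expression values of individual genes, which is why a tiny dataset that reproduces it would help.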

Cai-98 commented 1 year ago

I've sent you a tiny dataset.