aertslab / scenicplus

SCENIC+ is a python package to build gene regulatory networks (GRNs) using combined or separate single-cell gene expression (scRNA-seq) and single-cell chromatin accessibility (scATAC-seq) data.

Unable to complete run_scenicplus run - deprecated numpy.float in numpy ver > 1.24 #162

Open loganminhdang opened 1 year ago

loganminhdang commented 1 year ago

What type of problem are you experiencing and which function is your problem related to?
My run_scenicplus step failed to complete, specifically when the eGRN AUCs are binarized. The error appears to be caused by the removal of numpy.float in numpy versions >= 1.24 (the alias had been deprecated since 1.20). The code and error are provided below:

Code:

import os
import dill

from scenicplus.wrappers.run_scenicplus import run_scenicplus

# scplus_obj, work_dir, tmp_dir and biomart_host are defined in earlier steps
try:
    run_scenicplus(
        scplus_obj = scplus_obj,
        variable = ['GEX_celltype'],
        species = 'mmusculus',
        assembly = 'mm10',
        tf_file = 'allTFs_mm.txt',
        save_path = os.path.join(work_dir, 'scenicplus_results'),
        biomart_host = biomart_host,
        upstream = [1000, 150000],
        downstream = [1000, 150000],
        calculate_TF_eGRN_correlation = True,
        calculate_DEGs_DARs = True,
        export_to_loom_file = True,
        export_to_UCSC_file = True,
        path_bedToBigBed = '/mypath/bedToBigBed',
        n_cpu = 15,
        _temp_dir = os.path.join(tmp_dir, 'ray_spill'))
except Exception as e:
    # in case of failure, still save the object
    dill.dump(scplus_obj, open(os.path.join(work_dir, 'scenicplus/scplus_obj.pkl'), 'wb'), protocol=-1)
    raise(e)

Error:

RemoteTraceback                           Traceback (most recent call last)
RemoteTraceback:
"""
Traceback (most recent call last):
  File "/well/jagannath/users/lig797/conda/skylake/envs/scenicplus/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/well/jagannath/users/lig797/conda/skylake/envs/scenicplus/lib/python3.8/multiprocessing/pool.py", line 51, in starmapstar
    return list(itertools.starmap(args[0], args[1]))
  File "/well/jagannath/users/lig797/conda/skylake/envs/scenicplus/lib/python3.8/site-packages/pyscenic/binarization.py", line 56, in derive_threshold
    if not isbimodal(data, method):
  File "/well/jagannath/users/lig797/conda/skylake/envs/scenicplus/lib/python3.8/site-packages/pyscenic/binarization.py", line 43, in isbimodal
    _, pval, _ = diptst(np.msort(data))
  File "/well/jagannath/users/lig797/conda/skylake/envs/scenicplus/lib/python3.8/site-packages/pyscenic/diptest.py", line 64, in diptst
    else (np.less(d, unif_dips).sum() + 1) / (np.float(numt) + 1)
  File "/well/jagannath/users/lig797/conda/skylake/envs/scenicplus/lib/python3.8/site-packages/numpy/__init__.py", line 305, in __getattr__
    raise AttributeError(__former_attrs__[attr])
AttributeError: module 'numpy' has no attribute 'float'.
np.float was a deprecated alias for the builtin float. To avoid this error in existing code, use float by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.float64 here.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
    https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
"""

The above exception was the direct cause of the following exception:

AttributeError                            Traceback (most recent call last)
Cell In[14], line 23
     20 except Exception as e:
     21     #in case of failure, still save the object
     22     dill.dump(scplus_obj, open(os.path.join(work_dir, 'scenicplus/scplus_obj.pkl'), 'wb'), protocol=-1)
---> 23     raise(e)

Cell In[14], line 3
      1 from scenicplus.wrappers.run_scenicplus import run_scenicplus
      2 try:
----> 3     run_scenicplus(
      4         scplus_obj = scplus_obj,
      5         variable = ['GEX_celltype'],
      6         species = 'mmusculus',
      7         assembly = 'mm10',
      8         tf_file = '/gpfs3/well/jagannath/users/lig797/SingleCell/scenicplus/allTFs_mm.txt',
      9         save_path = os.path.join(work_dir, 'scenicplus_results'),
     10         biomart_host = biomart_host,
     11         upstream = [1000, 150000],
     12         downstream = [1000, 150000],
     13         calculate_TF_eGRN_correlation = True,
     14         calculate_DEGs_DARs = True,
     15         export_to_loom_file = True,
     16         export_to_UCSC_file = True,
     17         path_bedToBigBed = '/gpfs3/well/jagannath/users/lig797/SingleCell/scenicplus/bedToBigBed',
     18         n_cpu = 15,
     19         _temp_dir = os.path.join(tmp_dir, 'ray_spill'))
     20 except Exception as e:
     21     #in case of failure, still save the object
     22     dill.dump(scplus_obj, open(os.path.join(work_dir, 'scenicplus/scplus_obj.pkl'), 'wb'), protocol=-1)

File /gpfs3/well/jagannath/users/lig797/SingleCell/scenicplus/src/scenicplus/wrappers/run_scenicplus.py:280, in run_scenicplus(scplus_obj, variable, species, assembly, tf_file, save_path, biomart_host, upstream, downstream, region_ranking, gene_ranking, simplified_eGRN, calculate_TF_eGRN_correlation, calculate_DEGs_DARs, export_to_loom_file, export_to_UCSC_file, tree_structure, path_bedToBigBed, n_cpu, _temp_dir, save_partial, **kwargs)
    278 if 'eRegulon_AUC_thresholds' not in scplus_obj.uns.keys():
    279     log.info('Binarizing eGRNs AUC')
--> 280     binarize_AUC(scplus_obj,
    281                  auc_key='eRegulon_AUC',
    282                  out_key='eRegulon_AUC_thresholds',
    283                  signature_keys=['Gene_based', 'Region_based'],
    284                  n_cpu=n_cpu)
    286 if not hasattr(scplus_obj, 'dr_cell'):
    287     scplus_obj.dr_cell = {}

File /gpfs3/well/jagannath/users/lig797/SingleCell/scenicplus/src/scenicplus/eregulon_enrichment.py:199, in binarize_AUC(scplus_obj, auc_key, out_key, signature_keys, n_cpu)
    197 for signature in signature_keys:
    198     auc_mtx = scplus_obj.uns[auc_key][signature]
--> 199     _, auc_thresholds = binarize(auc_mtx, num_workers=n_cpu)
    200     scplus_obj.uns[out_key][signature] = auc_thresholds

File /well/jagannath/users/lig797/conda/skylake/envs/scenicplus/lib/python3.8/site-packages/pyscenic/binarization.py:94, in binarize(auc_mtx, threshold_overides, seed, num_workers)
     89         thrs = p.starmap(
     90             derive_threshold, [(auc_mtx, c, seed) for c in auc_mtx.columns]
     91         )
     92     return pd.Series(index=auc_mtx.columns, data=thrs)
---> 94 thresholds = derive_thresholds(auc_mtx)
     95 if threshold_overides is not None:
     96     thresholds[list(threshold_overides.keys())] = list(threshold_overides.values())

File /well/jagannath/users/lig797/conda/skylake/envs/scenicplus/lib/python3.8/site-packages/pyscenic/binarization.py:89, in binarize.<locals>.derive_thresholds(auc_mtx, seed)
     87 def derive_thresholds(auc_mtx, seed=seed):
     88     with Pool(processes=num_workers) as p:
---> 89         thrs = p.starmap(
     90             derive_threshold, [(auc_mtx, c, seed) for c in auc_mtx.columns]
     91         )
     92     return pd.Series(index=auc_mtx.columns, data=thrs)

File /well/jagannath/users/lig797/conda/skylake/envs/scenicplus/lib/python3.8/multiprocessing/pool.py:372, in Pool.starmap(self, func, iterable, chunksize)
    366 def starmap(self, func, iterable, chunksize=None):
    367     '''
    368     Like map() method but the elements of the iterable are expected to
    369     be iterables as well and will be unpacked as arguments. Hence
    370     func and (a, b) becomes func(a, b).
    371     '''
--> 372     return self._map_async(func, iterable, starmapstar, chunksize).get()

File /well/jagannath/users/lig797/conda/skylake/envs/scenicplus/lib/python3.8/multiprocessing/pool.py:771, in ApplyResult.get(self, timeout)
    769     return self._value
    770 else:
--> 771     raise self._value

File /well/jagannath/users/lig797/conda/skylake/envs/scenicplus/lib/python3.8/multiprocessing/pool.py:125, in worker()
    123 job, i, func, args, kwds = task
    124 try:
--> 125     result = (True, func(*args, **kwds))
    126 except Exception as e:
    127     if wrap_exception and func is not _helper_reraises_exception:

File /well/jagannath/users/lig797/conda/skylake/envs/scenicplus/lib/python3.8/multiprocessing/pool.py:51, in starmapstar()
     50 def starmapstar(args):
---> 51     return list(itertools.starmap(args[0], args[1]))

File /well/jagannath/users/lig797/conda/skylake/envs/scenicplus/lib/python3.8/site-packages/pyscenic/binarization.py:56, in derive_threshold()
     51     gmm1 = mixture.GaussianMixture(
     52         n_components=1, covariance_type="full", random_state=seed
     53     ).fit(X)
     54     return gmm2.bic(X) <= gmm1.bic(X)
---> 56 if not isbimodal(data, method):
     57     # For a unimodal distribution the threshold is set as mean plus two standard deviations.
     58     return data.mean() + 2.0 * data.std()
     59 else:
     60     # Fit a two component Gaussian Mixture model on the AUC distribution using an Expectation-Maximization algorithm
     61     # to identify the peaks in the distribution.

File /well/jagannath/users/lig797/conda/skylake/envs/scenicplus/lib/python3.8/site-packages/pyscenic/binarization.py:43, in isbimodal()
     40 def isbimodal(data, method):
     41     if method == "hdt":
     42         # Use Hartigan's dip statistic to decide if distribution deviates from unimodality.
---> 43         _, pval, _ = diptst(np.msort(data))
     44         return (pval is not None) and (pval <= 0.05)
     45     else:
     46         # Compare Bayesian Information Content of two Gaussian Mixture Models.

File /well/jagannath/users/lig797/conda/skylake/envs/scenicplus/lib/python3.8/site-packages/pyscenic/diptest.py:64, in diptst()
     58 unif_dips = np.apply_along_axis(dip_fn, 1, unifs, is_hist, True)
     60 # count dips greater or equal to d, add 1/1 to prevent a pvalue of 0
     61 pval = (
     62     None
     63     if unif_dips.sum() == 0
---> 64     else (np.less(d, unif_dips).sum() + 1) / (np.float(numt) + 1)
     65 )
     67 return (d, pval, (len(left) - 1, len(idxs) - len(right)))

File /well/jagannath/users/lig797/conda/skylake/envs/scenicplus/lib/python3.8/site-packages/numpy/__init__.py:305, in __getattr__()
    300     warnings.warn(
    301         f"In the future np.{attr} will be defined as the "
    302         "corresponding NumPy scalar.", FutureWarning, stacklevel=2)
    304 if attr in __former_attrs__:
--> 305     raise AttributeError(__former_attrs__[attr])
    307 # Importing Tester requires importing all of UnitTest which is not a
    308 # cheap import Since it is mainly used in test suits, we lazy import it
    309 # here to save on the order of 10 ms of import time for most users
    310 #
    311 # The previous way Tester was imported also had a side effect of adding
    312 # the full numpy.testing namespace
    313 if attr == 'testing':

AttributeError: module 'numpy' has no attribute 'float'. np.float was a deprecated alias for the builtin float. To avoid this error in existing code, use float by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.float64 here. The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations

Describe alternatives you've considered
Would downgrading numpy to a version < 1.24 affect the performance of scenicplus?

Version information
scenicplus: '1.0.1.dev2+g26677cb'; numpy: 1.24.3

TheRaspberryFox commented 1 year ago

I tried downgrading numpy to a version that still provides np.float (1.19.5). This caused conflicts with other packages such as pandas and scanpy, and finding a compatible set of package versions looked time-consuming and difficult.

I solved the issue by searching through the code and changing every "np.float" in the SCENIC+ code to "float". After that, SCENIC+ finished successfully.
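The manual search-and-replace described above can be sketched as a small script. This is only an illustration, not the fix that was committed upstream: the function names `replace_float_alias` and `find_float_alias` are hypothetical, and the directory you pass in (for example the `site-packages/pyscenic` folder from the traceback) is up to you. The pattern is written so that `np.float64`, `np.float32` and `np.float_` are left alone.

```python
import re
from pathlib import Path

# Matches the removed alias `np.float` / `numpy.float`, but not `np.float64`,
# `np.float32`, `np.float_` or `np.floating`: the negative lookahead rejects
# any identifier character directly after `float`.
ALIAS_RE = re.compile(r"\b(?:np|numpy)\.float(?![A-Za-z0-9_])")


def replace_float_alias(source):
    """Return `source` with the removed alias replaced by the builtin `float`."""
    return ALIAS_RE.sub("float", source)


def find_float_alias(root):
    """Yield (path, line_number, line) for each use of the alias under `root`."""
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if ALIAS_RE.search(line):
                yield path, lineno, line.strip()
```

Listing the hits with `find_float_alias` first, and only then rewriting files with `replace_float_alias`, keeps the change reviewable. Edits made inside `site-packages` are lost whenever the package is reinstalled.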

ghuls commented 1 year ago

With NumPy 1.23, the old syntax should still be supported: pip install "numpy==1.23.*"
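To check whether a given environment is affected before starting a long run, a minimal sketch along these lines may help. It relies only on the fact, stated in the error message and the NumPy release notes, that the alias was deprecated in 1.20 and is gone from 1.24 onwards. The helper names are hypothetical.

```python
import re
from importlib import metadata


def has_float_alias(numpy_version):
    """True if this NumPy release still provides `np.float` (removed in 1.24.0)."""
    match = re.match(r"(\d+)\.(\d+)", numpy_version)
    if match is None:
        raise ValueError(f"unrecognised version string: {numpy_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) < (1, 24)


def installed_numpy_version():
    """Version string of the installed NumPy, or None if it is not installed."""
    try:
        return metadata.version("numpy")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_numpy_version()
    if version is None:
        print("numpy is not installed in this environment")
    else:
        print(version, "provides np.float:", has_float_alias(version))
```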

ghuls commented 1 year ago

@TheRaspberryFox Should be fixed in the latest git version: https://github.com/aertslab/scenicplus/commit/3741a4bbb1d8360d769d6aa22d02806e87a33b10

LILI-0000-0002-8173-7367 commented 1 year ago

@ghuls I tried scenicplus.__version__ '1.0.1.dev3+g3741a4b'.

run_scenicplus still gave me the error "AttributeError: module 'numpy' has no attribute 'float'."

Could you look into it? Thanks.
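As a stopgap while a dependency still calls the removed alias, one option that nobody in this thread proposed is to put the attribute back at runtime before the pipeline runs. The sketch below is an assumption-laden workaround, not a fix: `restore_alias` is a hypothetical helper, and the demo uses a stand-in namespace so the block runs without NumPy installed. With real NumPy the call would be `restore_alias(np, "float", float)`.

```python
import types
import warnings


def restore_alias(module, name, target):
    """Set `module.<name> = target` if it is missing; return True if patched."""
    with warnings.catch_warnings():
        # NumPy 1.20 to 1.23 warn on access to the deprecated alias.
        warnings.simplefilter("ignore")
        present = hasattr(module, name)
    if not present:
        setattr(module, name, target)
    return not present


# Stand-in for `import numpy as np` on a release where the alias is gone.
fake_np = types.SimpleNamespace()
patched = restore_alias(fake_np, "float", float)
print("patched:", patched, "| fake_np.float(3):", fake_np.float(3))
```

The binarization step in the traceback runs in a `multiprocessing.Pool`. Worker processes started by fork inherit the patched module, but workers started by spawn re-import NumPy and lose the patch, so this may not help on every platform.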

ghuls commented 1 year ago

@LILI-0000-0002-8173-7367 Are you sure that you installed the latest git version (and not only pulled it)?
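One way to answer this question is to ask Python which version and which file it actually loads. The sketch below uses only the standard library. The distribution names and the module name `pyscenic.diptest` are taken from the traceback earlier in this thread, and the helper names are hypothetical. Note that `find_spec` on a dotted name imports the parent package.

```python
import importlib.util
from importlib import metadata
from pathlib import Path


def installed_version(dist_name):
    """Version of the installed distribution, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def module_source(module_name):
    """Path of the file Python would load for `module_name`, or None."""
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        return None
    if spec is None or not spec.origin:
        return None
    path = Path(spec.origin)
    return path if path.is_file() else None


def still_uses_float_alias(module_name):
    """True/False if the module source contains `np.float(`; None if not found."""
    path = module_source(module_name)
    if path is None:
        return None
    return "np.float(" in path.read_text(encoding="utf-8", errors="replace")


if __name__ == "__main__":
    # Distribution names are assumptions; adjust them to your environment.
    for dist in ("scenicplus", "pyscenic", "numpy"):
        print(dist, installed_version(dist))
    print(module_source("pyscenic.diptest"), still_uses_float_alias("pyscenic.diptest"))
```

If the reported path points at an older copy in `site-packages` rather than the freshly pulled checkout, the package was pulled but not reinstalled.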

LILI-0000-0002-8173-7367 commented 1 year ago

@ghuls Yes, I installed the latest git version.

I noticed that the error occurs when I use the wrappers.run_scenicplus function. However, I can avoid the error by following the step-by-step analysis in the tutorials.

ghuls commented 1 year ago

Latest pySCENIC git version should fix this numpy issue now: https://github.com/aertslab/pySCENIC/commit/31d51a1625f12fb3c6e92bc48ecc9d401524c22a