aertslab / scenicplus

SCENIC+ is a python package to build gene regulatory networks (GRNs) using combined or separate single-cell gene expression (scRNA-seq) and single-cell chromatin accessibility (scATAC-seq) data.

run_scenicplus errors with different CPU settings. #280

Closed redvoidling closed 8 months ago

redvoidling commented 8 months ago

Describe the bug: run_scenicplus does not finish.

Note that I removed some of the DAR markers earlier in the pipeline because they were empty:

for DAR in markers_dict.keys():
    if len(markers_dict[DAR])>0:
        print(DAR)
        regions = markers_dict[DAR].index[markers_dict[DAR].index.str.startswith('chr')] #only keep regions on known chromosomes
        region_sets['DARs'][DAR] = pr.PyRanges(region_names_to_coordinates(regions))

To Reproduce: The first output below is from run_scenicplus with calculate_TF_eGRN_correlation = True and n_cpu = 1.

The second output is with calculate_TF_eGRN_correlation = False and n_cpu = 48.

Error output

First Output:

100%|██████████| 124/124 [00:00<00:00, 129.55it/s]
2024-01-06 20:29:07,411 SCENIC+_wrapper INFO Calculating TF-eGRNs AUC correlation

Traceback (most recent call last):
  File "/home/mh281129/thesis-proto-pipeline/slurm_jobs/scenic_run_eGRN_1cpu_only_mut.py", line 84, in <module>
    raise(e)
  File "/home/mh281129/thesis-proto-pipeline/slurm_jobs/scenic_run_eGRN_1cpu_only_mut.py", line 64, in <module>
    run_scenicplus(
  File "/rwthfs/rz/cluster/home/mh281129/git_pulls/scenicplus/src/scenicplus/wrappers/run_scenicplus.py", line 253, in run_scenicplus
    generate_pseudobulks(scplus_obj,
  File "/rwthfs/rz/cluster/home/mh281129/git_pulls/scenicplus/src/scenicplus/cistromes.py", line 227, in generate_pseudobulks
    sample_cells = sample(cells, nr_cells)
  File "/home/mh281129/miniconda3/envs/scenicplus/lib/python3.8/random.py", line 363, in sample
    raise ValueError("Sample larger than population or is negative")
ValueError: Sample larger than population or is negative

Second output:

Running using 48 cores: 100%|██████████| 19992/19992 [07:32<00:00, 44.16it/s]
2024-01-06 17:25:39,568 GSEA INFO Subsetting on adjusted pvalue: 1, minimal NES: 0 and minimal leading edge genes 10
2024-01-06 17:25:39,926 GSEA INFO Merging eRegulons
2024-01-06 17:25:40,028 GSEA INFO Storing eRegulons in .uns[eRegulons].
2024-01-06 17:25:53,190 SCENIC+_wrapper INFO Formatting eGRNs
2024-01-06 17:26:18,721 SCENIC+_wrapper INFO Converting eGRNs to signatures
2024-01-06 17:26:19,142 SCENIC+_wrapper INFO Calculating eGRNs AUC
2024-01-06 17:26:19,142 SCENIC+_wrapper INFO Calculating region ranking
2024-01-06 17:26:48,594 SCENIC+_wrapper INFO Calculating eGRNs region based AUC

(_ray_run_gsea_for_e_module pid=238251) /home/mh281129/miniconda3/envs/scenicplus/lib/python3.8/site-packages/gseapy/algorithm.py:71: RuntimeWarning: divide by zero encountered in divide [repeated 10x across cluster]
(_ray_run_gsea_for_e_module pid=238251)   norm_tag = 1.0/sum_correl_tag [repeated 9x across cluster]
(_ray_run_gsea_for_e_module pid=238251) /home/mh281129/miniconda3/envs/scenicplus/lib/python3.8/site-packages/gseapy/algorithm.py:74: RuntimeWarning: invalid value encountered in multiply [repeated 10x across cluster]
(_ray_run_gsea_for_e_module pid=238251)   RES = np.cumsum(tag_indicator * correl_vector * norm_tag - no_tag_indicator * norm_no_tag, axis=axis) [repeated 10x across cluster]
2024-01-06 17:27:10,019 SCENIC+_wrapper INFO Calculating gene ranking
2024-01-06 17:27:12,241 SCENIC+_wrapper INFO Calculating eGRNs gene based AUC
2024-01-06 17:27:32,063 SCENIC+_wrapper INFO Binarizing eGRNs AUC
2024-01-06 17:28:49,011 SCENIC+_wrapper INFO Making eGRNs AUC UMAP
2024-01-06 17:28:56,443 SCENIC+_wrapper INFO Making eGRNs AUC tSNE
2024-01-06 17:29:00,658 SCENIC+_wrapper INFO Calculating eRSS

Traceback (most recent call last):
  File "/home/mh281129/thesis-proto-pipeline/slurm_jobs/scenic_run_eGRN_onlymut_nofancy.py", line 84, in <module>
    raise(e)
  File "/home/mh281129/thesis-proto-pipeline/slurm_jobs/scenic_run_eGRN_onlymut_nofancy.py", line 64, in <module>
    run_scenicplus(
  File "/rwthfs/rz/cluster/home/mh281129/git_pulls/scenicplus/src/scenicplus/wrappers/run_scenicplus.py", line 300, in run_scenicplus
    regulon_specificity_scores(scplus_obj,
  File "/rwthfs/rz/cluster/home/mh281129/git_pulls/scenicplus/src/scenicplus/RSS.py", line 62, in regulon_specificity_scores
    rss_values = np.empty(shape=(n_types, n_regulons), dtype=np.float)
  File "/home/mh281129/miniconda3/envs/scenicplus/lib/python3.8/site-packages/numpy/__init__.py", line 305, in __getattr__
    raise AttributeError(__former_attrs__[attr])
AttributeError: module 'numpy' has no attribute 'float'.
np.float was a deprecated alias for the builtin float. To avoid this error in existing code, use float by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.float64 here. The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
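
For reference, the AttributeError in the second run comes from NumPy itself: the np.float alias was removed in NumPy 1.24, so any call that passes dtype=np.float fails on newer NumPy versions. Below is a minimal sketch of the replacement the message suggests. The values of n_types and n_regulons are made up for illustration and are not taken from the run above.

```python
import numpy as np

# Illustrative dimensions only (assumption); in RSS.py these come from the data.
n_types, n_regulons = 3, 5

# `np.float` was only an alias for the builtin `float`, so passing `float`
# (or the explicit `np.float64`) gives the same dtype on old and new NumPy.
rss_values = np.empty(shape=(n_types, n_regulons), dtype=float)
rss_values_64 = np.empty(shape=(n_types, n_regulons), dtype=np.float64)

print(rss_values.dtype, rss_values_64.dtype)
```

Pinning NumPy below 1.24 in the environment would presumably also avoid the error without editing the package. Whether that is the recommended route for SCENIC+ is not stated in this thread.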

Version: I installed it following the recommended instructions.

If removing the empty DARs is the cause of the

ValueError: Sample larger than population or is negative

what can I do to fix that?
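
For context, random.sample raises this ValueError whenever the requested sample size is larger than the population (or negative). The sketch below reproduces that and shows one possible guard, which caps the sample size at the number of available cells. The cells list, the nr_cells value, and the sample_capped helper are invented for illustration. This is not necessarily how SCENIC+ handles it.

```python
from random import sample


def sample_capped(cells, nr_cells):
    """Hypothetical helper: sample at most len(cells) items without replacement."""
    return sample(cells, min(nr_cells, len(cells)))


# Toy population with fewer cells than the requested sample size (assumption).
cells = ["cell_1", "cell_2", "cell_3"]
nr_cells = 10

# The plain call fails because 10 > 3.
try:
    sample(cells, nr_cells)
    raised = False
except ValueError as err:
    raised = True
    print(err)

# The capped call returns every available cell instead of raising.
picked = sample_capped(cells, nr_cells)
print(len(picked))
```

If this is what happens in the first run, at least one group passed to generate_pseudobulks probably contains fewer cells than the number requested per pseudobulk, so checking the cell counts per group would be a reasonable first step.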

SeppeDeWinter commented 8 months ago

Hi @redvoidling

Please see this issue: https://github.com/aertslab/scenicplus/issues/246

Closing the issue here, feel free to respond on the issue linked above.

All the best,

Seppe