Closed liboxun closed 4 years ago
Here's the output I got so far:
2020-05-06 10:42:51,713 - pyscenic.cli.pyscenic - INFO - Creating modules.
2020-05-06 10:42:54,498 - pyscenic.cli.pyscenic - INFO - Loading expression matrix.
2020-05-06 10:43:00,178 - pyscenic.utils - INFO - Calculating Pearson correlations.
2020-05-06 10:43:00,178 - pyscenic.utils - WARNING - Note on correlation calculation: the default behaviour for calculating the correlations has changed after pySCENIC verion 0.9.16. Previously, the default was to calculate the correlation between a TF and target gene using only cells with non-zero expression values (mask_dropouts=True). The current default is now to use all cells to match the behavior of the R verision of SCENIC. The original settings can be retained by setting 'rho_mask_dropouts=True' in the modules_from_adjacencies function, or '--mask_dropouts' from the CLI. Dropout masking is currently set to [True].
/home2/s418610/.conda/envs/py37_res_GRN/lib/python3.7/site-packages/pyscenic/utils.py:138: RuntimeWarning: invalid value encountered in greater
  regulations = (rhos > rho_threshold).astype(int) - (rhos < -rho_threshold).astype(int)
/home2/s418610/.conda/envs/py37_res_GRN/lib/python3.7/site-packages/pyscenic/utils.py:138: RuntimeWarning: invalid value encountered in less
  regulations = (rhos > rho_threshold).astype(int) - (rhos < -rho_threshold).astype(int)
2020-05-06 10:43:29,853 - pyscenic.utils - INFO - Creating modules.
2020-05-06 10:45:26,430 - pyscenic.cli.pyscenic - INFO - Loading databases.
2020-05-06 10:45:26,434 - pyscenic.cli.pyscenic - INFO - Calculating regulons.
And it's been running 'Calculating regulons' since then.
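For readers wondering about the RuntimeWarning lines in the log: they usually mean that some correlation values are NaN at the point where they are compared against the threshold. The sketch below is not pySCENIC's actual code. It assumes that dropout masking keeps only cells where both genes are non-zero, and the helper `pearson`, the toy expression vectors, and the threshold value of 0.03 are illustrative assumptions. Only the `regulations = ...` line is taken from the log above.

```python
import numpy as np

def pearson(x, y, mask_dropouts=False):
    """Pearson correlation of two expression vectors (hypothetical helper).

    With mask_dropouts=True, only cells where both genes are non-zero are used
    (an assumption about what masking means, for illustration only).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if mask_dropouts:
        keep = (x != 0) & (y != 0)
        x, y = x[keep], y[keep]
    # With fewer than two cells, or no variance, the correlation is undefined.
    if x.size < 2 or x.std() == 0 or y.std() == 0:
        return np.nan
    return float(np.corrcoef(x, y)[0, 1])

# Toy data: six cells, mostly dropouts, only the last cell expresses both genes.
tf = [0, 0, 3, 0, 5, 1]
target = [2, 0, 0, 4, 0, 1]

rho_all = pearson(tf, target)                         # uses all six cells
rho_masked = pearson(tf, target, mask_dropouts=True)  # one cell left, so NaN

rhos = np.array([rho_all, rho_masked])
rho_threshold = 0.03  # assumed value, for illustration only

# Comparisons against NaN are always False; older NumPy versions also emit
# "invalid value encountered in greater/less" here, as seen in the log.
with np.errstate(invalid="ignore"):
    regulations = (rhos > rho_threshold).astype(int) - (rhos < -rho_threshold).astype(int)

print(rhos, regulations)
```

In this toy case the masked pair ends up with a regulation value of 0, so the warning by itself is probably harmless and unrelated to the long runtime.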
Hi @liboxun ,
It should not take 2+ days to run this step. Depending on the number of processes used, I'd expect it to complete in under an hour at worst. I'd suggest stopping the process and restarting it. Also, are you using the same database files as in the tutorial?
Hi @cflerin ,
Thanks for the quick reply! Good to know.
I've submitted multiple jobs (with the same script), and none of them finished within a day. I use 32 processes, which matches the number of cores on the HPC computer I use. So restarting doesn't seem to solve the problem.
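For anyone comparing setups, it can help to spell out the exact ctx invocation, including the worker count. The sketch below only assembles the command and does not run it. The flags follow the form used in the tutorial notebook, but the helper names `usable_cores` and `build_ctx_command` and all file names in the usage example are hypothetical placeholders. It also caps the worker count at the cores the job can actually use, which on a shared HPC node may be fewer than the node's total.

```python
import os
import shlex

def usable_cores():
    """Cores available to this process (hypothetical helper)."""
    # On Linux, the scheduler affinity reflects the cores the job may use.
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1

def build_ctx_command(adjacencies, db_files, motif_tbl, expression, output,
                      requested_workers):
    """Assemble a 'pyscenic ctx' command line as a list of arguments."""
    workers = max(1, min(requested_workers, usable_cores()))
    return [
        "pyscenic", "ctx", adjacencies, *db_files,
        "--annotations_fname", motif_tbl,
        "--expression_mtx_fname", expression,
        "--output", output,
        "--mask_dropouts",
        "--num_workers", str(workers),
    ]

# All file names below are placeholders, not the poster's actual paths.
cmd = build_ctx_command(
    adjacencies="adj.tsv",
    db_files=["ranking_db_1.feather", "ranking_db_2.feather"],
    motif_tbl="motifs-v9-nr.hgnc-m0.001-o0.0.tbl",
    expression="expression.loom",
    output="reg.csv",
    requested_workers=32,
)
print(shlex.join(cmd))
```

Printing the command into the job log makes it easier to check later which databases and how many workers a stalled run was actually using.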
I believe I'm using the same databases as in the tutorial. Quoting the PBMC10k_SCENIC-protocol-CLI.ipynb:
# ranking databases
f_db_glob = "/ddn1/vol1/staging/leuven/res_00001/databases/cistarget/databases/homo_sapiens/hg38/refseq_r80/mc9nr/gene_based/*feather"
f_db_names = ' '.join( glob.glob(f_db_glob) )
# motif databases
f_motif_path = "/ddn1/vol1/staging/leuven/res_00001/databases/cistarget/motif2tf/motifs-v9-nr.hgnc-m0.001-o0.0.tbl"
In comparison, the databases I'm using are downloaded from:
https://resources.aertslab.org/cistarget/motif2tf/motifs-v9-nr.hgnc-m0.001-o0.0.tbl
To me it seems they match up.
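The "match up" check can be made explicit by comparing file names. This is a minimal sketch that uses only the two motif annotation locations quoted above. The helper `file_name` is hypothetical, and only the motif table is compared because no download URL for the ranking databases appears in this post.

```python
import posixpath
from urllib.parse import urlparse

# Both locations are copied from the post above.
tutorial_motif_path = (
    "/ddn1/vol1/staging/leuven/res_00001/databases/cistarget/"
    "motif2tf/motifs-v9-nr.hgnc-m0.001-o0.0.tbl"
)
downloaded_motif_url = (
    "https://resources.aertslab.org/cistarget/motif2tf/"
    "motifs-v9-nr.hgnc-m0.001-o0.0.tbl"
)

def file_name(path_or_url):
    """Return the last path component of a local path or a URL."""
    return posixpath.basename(urlparse(path_or_url).path)

same_name = file_name(tutorial_motif_path) == file_name(downloaded_motif_url)
print(file_name(downloaded_motif_url), same_name)
```

A matching file name does not prove the contents are identical or that the download completed. Comparing file sizes or checksums against the download site would be a stronger check.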
Hi @liboxun , I think we solved your issue in the pySCENIC issue tracker, but for anyone else having the same issue, I'll leave this link to a list of recommendations that could potentially solve this: https://github.com/aertslab/pySCENIC/issues/142#issuecomment-625982886
Hi @cflerin ,
Yes, and thank you!
As a reference for anybody who might be having the same issue: for me personally, running a Singularity image of pySCENIC instead of the CLI solved the problem.
I'm trying to run the PBMC tutorial Jupyter notebook (PBMC10k_SCENIC-protocol-CLI.ipynb).
It's taking some time to run pyscenic ctx. Right now it's been two days and it's still running. I'm running it with an on-campus HPC service. I'm starting to think maybe there's something that I overlooked.
How long should it typically take to run pyscenic ctx for the PBMC example?
Thanks in advance!
Boxun