Open Shiywa opened 2 years ago
Hi @Shiywa
Is this the case for the entire run or only specific sections?
Best,
Seppe
For now, I have only run the GRN step on my cluster, so I don't know whether the other steps will behave like this.
Well, when I was running the CTX step, I found that it could run in parallel.
singularity run aertslab-pyscenic-0.12.0.sif pyscenic ctx HPV16_CC_output.tsv hg38_500bp_up_100bp_down_full_tx_v10_clust.genes_vs_motifs.rankings.feather hg38_10kbp_up_10kbp_down_full_tx_v10_clust.genes_vs_motifs.rankings.feather --annotations_fname motifs-v10nr_clust-nr.hgnc-m0.001-o0.0.tbl --mode dask_multiprocessing --num_workers 100 --output regulons.csv --expression_mtx_fname HPV16_CC1_count2.csv
However, I set num_workers to 100, but the highest number of parallel tasks I observed was 36. Is that normal?
I think you happened to monitor the progress of the GRN step at a time when it did not need more than 4 cores (i.e. there were only 4 tasks left to complete, so it would not make sense to use more cores in that case). That's why I asked whether it was using 4 cores for the entire time the GRN step was running. Chances are high that it was using more than 4 cores at some other time.
I cannot be sure of this, however.
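To illustrate the point about the tail of a run, here is a minimal sketch of a toy worker pool. It is not pySCENIC's actual scheduler; the function name `simulate_busy_workers` and the task durations are made up for illustration. It only shows that the number of busy workers can never exceed the number of tasks still left, whatever num_workers is set to.

```python
import heapq


def simulate_busy_workers(task_durations, num_workers):
    """Toy model (hypothetical, not pySCENIC code): return (time, busy_workers)
    samples for a pool that starts a new task whenever a worker is free."""
    pending = list(task_durations)[::-1]  # pop() then yields tasks in original order
    running = []                          # min-heap of finish times of running tasks
    now = 0
    timeline = []

    # Fill the pool at time 0.
    while pending and len(running) < num_workers:
        heapq.heappush(running, now + pending.pop())
    timeline.append((now, len(running)))

    while running:
        # Advance to the next finish time and retire every task ending then.
        now = heapq.heappop(running)
        while running and running[0] == now:
            heapq.heappop(running)
        # Free workers pick up whatever tasks are still pending.
        while pending and len(running) < num_workers:
            heapq.heappush(running, now + pending.pop())
        timeline.append((now, len(running)))

    return timeline


# 20 short tasks and 4 long ones on 24 workers (made-up numbers).
timeline = simulate_busy_workers([1] * 20 + [10] * 4, num_workers=24)
print(timeline)
```

In this toy run all 24 workers are busy at the start, but once the short tasks finish only the 4 long tasks remain, so a snapshot taken later would show 4 busy workers.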
Well, I kept monitoring the tasks in the GRN step using top, but the number of running tasks stayed low, at maybe fewer than 6. In top, I could see a huge number of python tasks in the STOP state, while only a few tasks kept running.
Actually, I want to ask what limits the parallel computing. I noticed that you mentioned "cores". My cluster has two physical cores, 32 logical cores and 128 threads. Could you please tell me which one is associated with the parameter num_workers?
Regards!
num_workers is the number of threads that are spawned.
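As a rough check on what the machine offers, the sketch below compares the number of logical CPUs the operating system reports with the number the current process is actually allowed to use. This is an assumption about where the limit might come from, not a confirmed diagnosis; the helper name `cpu_summary` is hypothetical. `os.sched_getaffinity` is only available on some platforms (Linux among them), so the code falls back to the logical count elsewhere.

```python
import os


def cpu_summary():
    """Hypothetical helper: report logical CPUs and CPUs usable by this process."""
    # os.cpu_count() can return None if the count cannot be determined.
    logical = os.cpu_count() or 1
    try:
        # CPUs this process may be scheduled on (can be restricted by a
        # job scheduler or container); not available on every platform.
        usable = len(os.sched_getaffinity(0))
    except AttributeError:
        usable = logical
    return {"logical_cpus": logical, "usable_by_this_process": usable}


print(cpu_summary())
```

Running this inside the same container and job allocation as the pyscenic command shows the limits that apply there. If the usable count is well below num_workers, the extra workers would have to share CPUs rather than run at the same time.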
Thanks for your recent update!
I have a question about parallel computing in pySCENIC.
Recently, I have been using Singularity to run the image "aertslab-pyscenic-0.12.0.sif". In the first step, GRN, I can control the parallel computing by setting --num_workers. However, when I set --num_workers to 24 or another number, I found that only three or fewer tasks were running. So, I want to know the reason for this limitation. Could you please give me some advice?
My linux cluster is like
Could you please tell me which characteristics of the machine limit the parallel computing?