statgen / pheweb

A tool to build a website to browse hundreds or thousands of GWAS.
MIT License

How to increase simultaneous jobs in `pheweb process --no-parse` #148

Closed Shicheng-Guo closed 3 years ago

Shicheng-Guo commented 3 years ago

Hi Pete,

I notice that 18 jobs run simultaneously in `pheweb process --no-parse`. How can I increase this number? For example, can I raise it from 18 to 30 or 50 to reduce the overall run time?

Thanks.

Shicheng

(base) [sguo2@comet-ln3 pheweb]$ pheweb process --no-parse
==> Starting `pheweb phenolist verify`
The 1306 phenotypes in '/projects/ps-janssen4/dsci-csb/user/sguo2/pheweb/pheno-list.json' look good.
==> Completed in 0 seconds

==> Starting `pheweb sites`
The list of sites is up-to-date!
==> Completed in 1 seconds

==> Starting `pheweb make-gene-aliases-trie`
gene aliases are at '/home/sguo2/.pheweb/cache/gene_aliases-v29-hg19.marisa_trie'
==> Completed in 0 seconds

==> Starting `pheweb add-rsids`
rsid annotation is up-to-date!
==> Completed in 0 seconds

==> Starting `pheweb add-genes`
gene annotation is up-to-date!
==> Completed in 0 seconds

==> Starting `pheweb make-tries`
tries are up-to-date!
==> Completed in 0 seconds

==> Starting `pheweb augment-phenos`
Processing 1059 phenos (247 already done)
Completed    0 tasks in 0 seconds (18 currently in progress, 1059 remain)
Completed   79 tasks in 35 minutes (18 currently in progress, 980 remain)
Completed  147 tasks in 64 minutes (18 currently in progress, 912 remain)

pjvandehaar commented 3 years ago

You can set `num_procs = 30` in your `config.py` file, or you can run `pheweb conf num_procs=50 process --no-parse`. You can also set `num_procs = {'qq': 50, '*': 20}` to use different numbers of processes for different steps.

Using more processes than you have cores on your CPU might be slower instead of faster, but if you experiment with it I'd like to hear what you find.
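
To make the per-step form concrete, here is a minimal sketch of how a setting like `{'qq': 50, '*': 20}` could be resolved for a given step. This is not PheWeb's actual code: `resolve_num_procs` and its `max_procs` argument are hypothetical names, and capping the result at the core count is an assumption that follows the caution above about using more processes than cores.

```python
import os

# Hypothetical helper for illustration only; it is NOT part of PheWeb.
# It shows one plausible reading of the two num_procs forms in this thread.
def resolve_num_procs(setting, step, max_procs=None):
    """Return the process count to use for `step`.

    `setting` is either an int (one value for every step) or a dict that
    maps step names to ints, with '*' as the fallback for unlisted steps.
    """
    if max_procs is None:
        # Assumption: never exceed the cores visible on this machine.
        max_procs = os.cpu_count() or 1
    if isinstance(setting, dict):
        # Use the step-specific value, then '*', then the core count.
        requested = setting.get(step, setting.get('*', max_procs))
    else:
        requested = setting
    # Clamp to the range [1, max_procs].
    return max(1, min(int(requested), max_procs))


num_procs = {'qq': 50, '*': 20}
print(resolve_num_procs(num_procs, 'qq', max_procs=128))
print(resolve_num_procs(num_procs, 'augment-phenos', max_procs=128))
print(resolve_num_procs(256, 'qq', max_procs=128))
```

With `max_procs=128`, the first call picks the `'qq'` entry, the second falls back to `'*'`, and the third clamps a request of 256 down to 128.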

Shicheng-Guo commented 3 years ago

Hi Peter, I have 128 cores per node. Do you think I could use 2 nodes (256 cores) and set `num_procs = 256`?

Standard Compute Nodes (728 total). AMD EPYC 7742 (Rome) Compute Nodes; 2.25 GHz; 128 cores per node; 1 TB NVMe per node; 256 GB DRAM per node

Thanks.

Shicheng
