statgen / pheweb

A tool to build a website to browse hundreds or thousands of GWAS.
MIT License

use SGE for multiprocessing #125

Closed dl016d closed 3 years ago

dl016d commented 5 years ago

Is it possible to add support for the Sun Grid Engine (SGE) cluster scheduler?

pjvandehaar commented 5 years ago

Definitely. See https://github.com/statgen/pheweb#5-load-your-association-files . Converting the pheweb slurm-parse to work on SGE should be trivial. If you make the changes, I'd appreciate it if you'd send them to me so I can incorporate them into PheWeb. Using SGE for the rest of the loading steps will take a little more work, but not a lot more. I think that effort would be better spent on speeding up the reader and writer for the internal files (perhaps using Pandas), but I'm happy to give you advice on how to add SGE support as well.

Peter
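One detail that tends to matter when porting an array job: SGE array task IDs generally start at 1, while a Slurm array is often declared starting at 0. The sketch below shows one way to get a 0-based phenotype index under either scheduler. The helper name `pheno_index` is hypothetical (it is not part of PheWeb), and the Slurm branch assumes the array was declared 0-based.

```shell
#!/bin/bash
# Sketch: derive a 0-based phenotype index from the array task ID.
# pheno_index is a hypothetical helper, not part of PheWeb.
pheno_index() {
    # SGE sets SGE_TASK_ID to "undefined" for non-array jobs.
    if [ -n "${SGE_TASK_ID:-}" ] && [ "${SGE_TASK_ID}" != "undefined" ]; then
        # SGE array (e.g. "#$ -t 1-20"): shift the 1-based ID to 0-based.
        echo $(( SGE_TASK_ID - 1 ))
    elif [ -n "${SLURM_ARRAY_TASK_ID:-}" ]; then
        # Assumption: the Slurm array was declared 0-based (e.g. --array=0-19).
        echo "${SLURM_ARRAY_TASK_ID}"
    else
        echo "not running inside an array job" >&2
        return 1
    fi
}
```

Inside a job script the result could be passed along as the phenotype index for that task.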

On Wed, Apr 10, 2019 at 11:23 AM dl016d wrote:

Is it possible to add support for the Sun Grid Engine (SGE) cluster scheduler?


figueroakl commented 3 years ago

If possible, it would be great to have some advice on using SGE for the rest of the loading steps. So far, I have successfully adapted the slurm-parse script to SGE:

#!/bin/bash
#$ -t 1-20
#$ -l h_vmem=4G
#$ -l h_rt=24:00:00
#$ -pe smp 4
#$ -binding linear:4
#$ -N parse_taskarrays
#$ -cwd
#$ -o stdout/logs/
#$ -e stderr/logs/

# Load Python through the site's environment tool
source /broad/software/scripts/useuse
use .python-3.8.3

# Array task IDs run 1-20; phenotype indices run 0-19
jobs=($(seq 0 19))

export PHEWEB_DATADIR='/humgen/florezlab/kfiguero/20_pheno_chr22_parallel'

# Call date at each echo so the start and end timestamps differ
echo "Pheweb parsing is starting at : $(date)"
/home/unix/kfiguero/.local/bin/pheweb conf num_procs=4 parse --phenos="${jobs[$((SGE_TASK_ID - 1))]}"
echo "Pheweb parsing has ended at : $(date)"
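A practical note on the `-o stdout/logs/` and `-e stderr/logs/` directives above: SGE typically does not create these directories itself, and tasks can end up in an error state if they are missing. A small pre-submission sketch follows. The helper name `prepare_log_dirs` and the filename `parse_sge.sh` are hypothetical.

```shell
#!/bin/bash
# Sketch: create the log directories named in the job script's -o/-e directives.
# prepare_log_dirs is a hypothetical helper; adjust the paths to match your script.
prepare_log_dirs() {
    local base="${1:-.}"   # directory the job is submitted from (-cwd)
    mkdir -p "${base}/stdout/logs" "${base}/stderr/logs"
}

# Example (not executed here; parse_sge.sh is a hypothetical filename):
#   prepare_log_dirs . && qsub parse_sge.sh
```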

pjvandehaar commented 3 years ago

@figueroakl On the branch hg38 I added a --phenos=1,3,5-8 option to all of the one-task-per-pheno processing steps. I want to test that branch a little more before I merge it into master and release it, but it all seems to work right now. If you update to that release you'll be able to easily adapt your script to run more of the processing steps.
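As a rough sketch of what that adaptation might look like, the helper below builds the command line for one processing step and one phenotype index. It only prints the command. The helper name `pheweb_step_cmd` is hypothetical, the `conf num_procs=4` prefix is copied from the script earlier in this thread, and the step names are assumed to match PheWeb subcommands that accept `--phenos`.

```shell
#!/bin/bash
# Sketch only: print (not run) the per-phenotype command for one processing step.
# pheweb_step_cmd is hypothetical; the step names are assumed to be valid
# PheWeb subcommands that accept --phenos.
pheweb_step_cmd() {
    local step="$1" pheno_idx="$2"
    case "$step" in
        parse|augment-phenos|bgzip-phenos|manhattan|qq) ;;
        *) echo "unknown step: $step" >&2; return 1 ;;
    esac
    echo "pheweb conf num_procs=4 ${step} --phenos=${pheno_idx}"
}

# Inside an SGE array task one might run, for example:
#   eval "$(pheweb_step_cmd manhattan $(( SGE_TASK_ID - 1 )))"
```

Keeping the step as a parameter means the same array-job script can be resubmitted for each step rather than maintaining one script per step.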

pjvandehaar commented 3 years ago

I just released version 1.1.28 which allows pheweb cluster --engine=[slurm|sge] --step=[parse|augment-phenos|bgzip-phenos|manhattan|qq].
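For reference, the option syntax above expands to one invocation per step. The sketch below simply prints those invocations for a chosen engine. It does not run PheWeb, and it makes no claim about what each command outputs (check `pheweb cluster` help for that). The helper name `cluster_cmds` is hypothetical.

```shell
#!/bin/bash
# Sketch: list the `pheweb cluster` invocations for each step named in the release note.
# cluster_cmds is a hypothetical helper; it only prints the commands.
cluster_cmds() {
    local engine="$1" step
    case "$engine" in
        slurm|sge) ;;
        *) echo "engine must be slurm or sge" >&2; return 1 ;;
    esac
    for step in parse augment-phenos bgzip-phenos manhattan qq; do
        echo "pheweb cluster --engine=${engine} --step=${step}"
    done
}

cluster_cmds sge
```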