Slurm profile for CHEO-RI and Stager integration

Madelinehazel commented 2 years ago

This branch consolidates the slurm branch, which we've been using to run crg2 for exomes on the CHEO-RI HPC4Health tenancy, with the master branch. Master will then have two config files (config_hpf.yaml and config_cheo_ri.yaml) with file paths specific to each HPC4Health tenancy's filesystem, as well as the following job scripts tailored to each job scheduler and tenancy:

Job scheduler: PBS on SickKids HPC4Health tenancy (hpf)

Serialized jobs: qsub dnaseq.pbs
Parallelized jobs: qsub dnaseq_cluster.pbs

Job scheduler: Slurm

Parallelized jobs on SickKids tenancy: sbatch dnaseq_slurm_hpf.sh
Parallelized jobs on CHEO-RI tenancy: sbatch dnaseq_slurm_cheo_ri.sh
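
As a rough sketch of the mapping above, a small lookup function could print the submission command for a given scheduler and tenancy. The script names are the ones listed in this description; the function name `submit_command` and its argument vocabulary (`pbs`/`slurm`, `hpf`/`cheo_ri`, `serial`/`parallel`) are hypothetical and not part of the repo.

```shell
# Hypothetical helper (not in crg2): print the submission command for a
# scheduler/tenancy/mode combination. Mode defaults to "parallel".
submit_command() {
    case "$1:$2:${3:-parallel}" in
        pbs:hpf:serial)         echo "qsub dnaseq.pbs" ;;
        pbs:hpf:parallel)       echo "qsub dnaseq_cluster.pbs" ;;
        slurm:hpf:parallel)     echo "sbatch dnaseq_slurm_hpf.sh" ;;
        slurm:cheo_ri:parallel) echo "sbatch dnaseq_slurm_cheo_ri.sh" ;;
        *)
            # Any combination not listed in the PR description is rejected.
            echo "unsupported combination: $1 $2 ${3:-parallel}" >&2
            return 1
            ;;
    esac
}

submit_command slurm cheo_ri
```

Only the four combinations named in the description are accepted; anything else returns a non-zero status rather than guessing at a script.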

dnaseq_slurm_api.sh is called by Stager when exome analyses are requested. It automatically sets up the analysis (via exome_setup_stager.py) and kicks off the crg2 WES pipeline through the Slurm API, using linked files that have been uploaded to MinIO.

kevinlul commented 2 years ago

What's the purpose of the remaining dnaseq*.sh scripts?

Madelinehazel commented 2 years ago

> Not sure what editor was used to write the shell scripts, but they should have trailing newlines like every other text file to be POSIX-compliant.

VSCode :/
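
For reference, a minimal sketch of checking for and appending the missing trailing newline from the shell. The helper names `ends_with_newline` and `fix_trailing_newline` are made up for illustration; only standard `tail` and `printf` are used.

```shell
# Succeeds when the file is empty or its last byte is a newline.
# Command substitution strips a trailing newline, so the captured
# value is empty exactly in those two cases.
ends_with_newline() {
    [ -z "$(tail -c 1 "$1")" ]
}

# Hypothetical helper: append a newline only when one is missing.
fix_trailing_newline() {
    ends_with_newline "$1" || printf '\n' >> "$1"
}
```

Something like `for f in dnaseq*.sh; do fix_trailing_newline "$f"; done` would then cover the scripts in this PR. In VSCode, the `files.insertFinalNewline` setting does the same thing on save.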

Madelinehazel commented 2 years ago

> What's the purpose of the remaining dnaseq*.sh scripts?

dnaseq.sh: the very first script we wrote for running crg2 on the CHEO-RI space, without a slurm profile
dnaseq_slurm.sh: the original script for running crg2 on the CHEO-RI space with slurm integration and sans automation

Neither script incorporated automated setup of the analysis directory.

Won't need either of these in the future...

Madelinehazel commented 2 years ago

@kevinlul input is now a JSON, see a595e76

kevinlul commented 2 years ago

If dnaseq_slurm_api.sh is the script to be run, make sure to chmod +x it!
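
A small sketch of what that could look like, assuming the script sits in the current directory. The wrapper name `ensure_executable` is hypothetical; `chmod +x` is the only real work being done.

```shell
# Hypothetical helper: set the executable bit only if it is missing,
# and report which case applied.
ensure_executable() {
    if [ -x "$1" ]; then
        echo "already executable: $1"
    else
        chmod +x "$1" && echo "made executable: $1"
    fi
}
```

Usage would be `ensure_executable dnaseq_slurm_api.sh`. To have the mode bit recorded in the repository as well, `git update-index --chmod=+x dnaseq_slurm_api.sh` followed by a commit should do it.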

Madelinehazel commented 2 years ago

TO DO:

[x] create two benchmark.tsv (one for hpf, one for CHEO-RI)
[x] remove config.yaml, add config_hpf.yaml
[x] remove dnaseq.sh
[x] modify dnaseq_slurm_api.sh: add --configfile argument
[x] remove dnaseq_slurm_api_2910.sh
[x] modify slurm_profile/settings.json default account

ccmbioinfo / crg2

Slurm profile for CHEO-RI and Stager integration #136