ccmbioinfo / crg2

Research pipeline for exploring clinically relevant genomic variants
Apache License 2.0

Slurm profile for CHEO-RI and Stager integration #136

Closed Madelinehazel closed 1 year ago

Madelinehazel commented 2 years ago

This branch consolidates master with the slurm branch that we've been using to run crg2 for exomes on the CHEO-RI HPC4Health tenancy. Master will then have two config files (config_hpf.yaml, config_cheo_ri.yaml) with file paths specific to each HPC4Health tenancy's filesystem, as well as the following job scripts tailored to each job scheduler and tenancy:

Job scheduler: PBS on SickKids HPC4Health tenancy (hpf)

Job scheduler: Slurm

dnaseq_slurm_api.sh is called by Stager when exome analyses are requested. It automatically sets up the analysis directory (via exome_setup_stager.py) and kicks off the crg2 WES pipeline through the Slurm API, using linked files that have been uploaded to MinIO.
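To make the scheduler-to-config pairing described above concrete, here is a minimal sketch. The function name `config_for_scheduler` is hypothetical and not part of crg2; only the two config file names and the PBS/Slurm split come from this PR description.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of crg2): map a job scheduler name to the
# config file this PR associates with that HPC4Health tenancy.
config_for_scheduler() {
  case "$1" in
    pbs)   echo "config_hpf.yaml" ;;      # PBS on the SickKids tenancy (hpf)
    slurm) echo "config_cheo_ri.yaml" ;;  # Slurm on the CHEO-RI tenancy
    *)
      echo "unknown scheduler: $1" >&2
      return 1
      ;;
  esac
}
```

For example, `config_for_scheduler slurm` prints `config_cheo_ri.yaml`, and an unrecognised name returns a non-zero status so a calling job script can stop early.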

kevinlul commented 2 years ago

What's the purpose of the remaining dnaseq*.sh scripts?

Madelinehazel commented 2 years ago

> Not sure what editor was used to write the shell scripts, but they should have trailing newlines like every other text file to be POSIX-compliant.

VSCode :/
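One way to fix the missing trailing newlines from the command line, as a rough sketch: the function name `ensure_trailing_newline` is made up for illustration, and it relies only on `tail -c` and on command substitution stripping a final newline.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a newline to each file whose last byte is not one.
ensure_trailing_newline() {
  local f
  for f in "$@"; do
    # Leave empty or missing files alone.
    [ -s "$f" ] || continue
    # $(...) drops a trailing newline, so this is non-empty only when the
    # file's final byte is something other than a newline.
    if [ -n "$(tail -c 1 "$f")" ]; then
      printf '\n' >> "$f"
    fi
  done
}
```

Running `ensure_trailing_newline dnaseq*.sh` from the directory holding the scripts would patch only the files that need it; files that already end in a newline are left unchanged.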

Madelinehazel commented 2 years ago

> What's the purpose of the remaining dnaseq*.sh scripts?

dnaseq.sh: the very first script we wrote for running crg2 on the CHEO-RI space, without a slurm profile

dnaseq_slurm.sh: this is the original script for running crg2 on the CHEO-RI space with slurm integration and sans automation

Neither script incorporated automated setup of the analysis directory.

Won't need either of these in the future...

Madelinehazel commented 2 years ago

@kevinlul input is now a JSON, see a595e76

kevinlul commented 2 years ago

If dnaseq_slurm_api.sh is the script to be run, make sure to chmod +x it!
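A small sketch of that step, wrapped in a hypothetical `mark_executable` function (the name is illustrative) so the result is checked rather than assumed:

```shell
#!/usr/bin/env bash
# Hypothetical helper: set the executable bit and confirm it took effect.
mark_executable() {
  chmod +x "$1" || return 1
  # -x is true only if the file is now executable by the current user.
  [ -x "$1" ]
}
```

Usage would be `mark_executable dnaseq_slurm_api.sh`. Since the script is tracked in git, `git update-index --chmod=+x dnaseq_slurm_api.sh` records the mode change in the index so the bit survives a fresh clone.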

Madelinehazel commented 2 years ago

TO DO: