Covid19 Analysis Pipeline

Directory Structure
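
The layout below is a sketch inferred from the paths referenced in this README; only the files and directories mentioned here are shown:

covid19_analysis/
    0_data/        input sequence data (e.g. sequenceData.fasta)
    1_scripts/     run.sh, run_slurm.sh, and the covid19.sif Singularity image
    3_results/     output written to a date-stamped ${YYYYMMDD} subdirectory per run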

Running Pipeline via Oscar Slurm Batch Submission

To run the COVID-19 pipeline, navigate to /PATH/TO/CLONED/REPO/covid19_analysis/1_scripts/ and run:

sbatch run_slurm.sh /ABSOLUTE/PATH/TO/SEQUENCE/DATA/covid_sequences.fasta

Results will be produced in /PATH/TO/CLONED/REPO/covid19_analysis/3_results/${YYYYMMDD}

A run with ~20,000 input sequences takes roughly 30 minutes to complete the primary Pangolin analyses and produce figures on Oscar with 24 threads and 128G of RAM allocated; the IQ-TREE analysis, however, will run for several days. IQ-TREE writes checkpoints, so an incomplete analysis can be continued beyond the allocated time if necessary.
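
If the batch job hits its time limit before IQ-TREE finishes, re-submitting the same command should let IQ-TREE resume from its checkpoint file; this assumes the re-run writes into the same dated results directory and reuses the same output prefix, which is not verified here:

sbatch run_slurm.sh /ABSOLUTE/PATH/TO/SEQUENCE/DATA/covid_sequences.fasta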

Running Pipeline via Oscar Interactive Session

To run the pipeline in an interactive session, first start a screen session (screen -S JOBNAME) and then launch an interactive session with sufficient resources (interact -t 24:00:00 -n 24 -m 128G), as shown below.
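
For example, with JOBNAME as a placeholder session name:

screen -S JOBNAME
interact -t 24:00:00 -n 24 -m 128G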

Navigate to the 1_scripts directory:

cd /PATH/TO/CLONED/REPO/covid19_analysis/1_scripts

Enter the Singularity container, bind-mounting the repository directory:

singularity exec -B /ABSOLUTE/PATH/TO/CLONED/REPO/covid19_analysis/ /ABSOLUTE/PATH/TO/CLONED/REPO/covid19_analysis/1_scripts/covid19.sif bash

Once inside the container, run:

bash run.sh /ABSOLUTE/PATH/TO/SEQUENCE/DATA/covid_sequences.fasta

To detach from the screen session, press Ctrl + a, then d; to reattach, use screen -r JOBNAME.
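
For example, assuming the session was named JOBNAME as above:

screen -ls          # list active screen sessions
screen -r JOBNAME   # reattach to the named session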

Results will be produced in /PATH/TO/CLONED/REPO/covid19_analysis/3_results/${YYYYMMDD}

Example Usage

sbatch /PATH/TO/CLONED/REPO/covid19_analysis/1_scripts/run_slurm.sh /PATH/TO/CLONED/REPO/covid19_analysis/0_data/sequenceData.fasta
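
The batch job can be monitored with standard Slurm tools, and the date-stamped results directory can be inspected once it appears (the $(date +%Y%m%d) expansion matches the ${YYYYMMDD} naming used above, assuming the pipeline was started today):

squeue -u $USER
ls /PATH/TO/CLONED/REPO/covid19_analysis/3_results/$(date +%Y%m%d)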

CBC Project Information

title: Covid19 analysis pipeline
tags:
analysts:
git_repo_url: https://github.com/compbiocore/covid19_analysis
resources_used: Pangolin, Nextclade, Nextalign, IQ-TREE, R
summary: 
project_id: