A Nextflow Pipeline for Analysis of NGS-based HIV Drug Resistance Data
GNU General Public License v3.0

QuasiFlow

Introduction

QuasiFlow is a Nextflow pipeline for reproducible analysis of NGS-based HIV drug resistance (HIVDR) testing data across different computing environments. The pipeline takes raw sequence reads in FASTQ format as input, performs quality control, maps the reads to a reference genome, calls variants, queries an HIV drug resistance database to detect drug resistance mutations, and finally generates a user-friendly report in PDF and HTML formats. QuasiFlow is publicly available at https://github.com/AlfredUg/QuasiFlow.

Installation

QuasiFlow requires Nextflow (version 21.04.3 or higher) and one of conda, singularity, or docker. This walkthrough uses conda, which is the most readily available option for most users.
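The version requirement above can be checked before running anything. The following is a minimal sketch, not part of QuasiFlow: the helper names `version_ge` and `extract_version` are hypothetical, the comparison relies on `sort -V` (available in GNU coreutils), and it assumes that `nextflow -version` prints a dotted version number somewhere in its output.

```shell
# Succeed (exit status 0) if version $1 is greater than or equal to version $2.
# Assumption: `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: print the first three-part dotted version found on stdin.
extract_version() {
    grep -Eo '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1
}

# Example use (commented out so that sourcing this file has no side effects):
# installed="$(nextflow -version | extract_version)"
# version_ge "$installed" 21.04.3 || echo "Nextflow $installed is older than 21.04.3" >&2
```

If the installed version is too old, `nextflow self-update` can be used to upgrade it.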

The first option is to install the pipeline using Nextflow. It will be installed in the $HOME directory under the .nextflow sub-directory. Confirm that the installation was successful by printing the help message.

nextflow pull AlfredUg/QuasiFlow
nextflow run ~/.nextflow/assets/AlfredUg/QuasiFlow --help

Alternatively, clone the pipeline repository into a directory of your choice. As before, confirm that the installation was successful by printing the help message.

git clone https://github.com/AlfredUg/QuasiFlow.git
nextflow run QuasiFlow --help

Usage

The pipeline takes paired-end Illumina data in FASTQ format as input. Let's download some test data from the European Nucleotide Archive (ENA) using the wget command and decompress it using the gunzip command. This is paired-end data from a single sample of BioProject PRJDB3502.

wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/DRR030/DRR030218/DRR030218_1.fastq.gz 
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/DRR030/DRR030218/DRR030218_2.fastq.gz 
gunzip DRR030218*.gz
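Because the pipeline expects paired-end input, it can help to confirm that every forward-read file has its mate before launching a run. The sketch below assumes the `_1.fastq` / `_2.fastq` naming used by the test data above; the function name `find_unpaired` is hypothetical and not part of QuasiFlow.

```shell
# Print every *_1.fastq file in directory $1 (default: current directory)
# that has no matching *_2.fastq mate. Returns 0 if all pairs are complete,
# 1 if at least one mate is missing.
find_unpaired() {
    local dir
    local r1
    local missing
    dir="${1:-.}"
    missing=0
    for r1 in "$dir"/*_1.fastq; do
        [ -e "$r1" ] || continue    # the glob matched nothing
        if [ ! -e "${r1%_1.fastq}_2.fastq" ]; then
            echo "$r1"
            missing=1
        fi
    done
    return "$missing"
}

# Example use (commented out so that sourcing this file has no side effects):
# find_unpaired "$PWD" || echo "Some samples are missing their _2.fastq file" >&2
```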

Run QuasiFlow on the test dataset with default parameters under the conda profile. This option does not require prior installation, since it automatically pulls the pipeline from the main branch of the pipeline repository on GitHub. In addition, it installs all the dependencies in a conda environment. If you have already installed the pipeline using the procedure above, see the options that follow.

nextflow run AlfredUg/QuasiFlow -r main --reads "$PWD/*_{1,2}.fastq" -profile conda

If you pulled/installed the pipeline using Nextflow, simply point to the installation path as follows:

nextflow run ~/.nextflow/assets/AlfredUg/QuasiFlow --reads "$PWD/*_{1,2}.fastq" -profile conda

Similarly, if you have already cloned the pipeline repository, simply point to the path of the clone as follows:

nextflow run path/to/QuasiFlow --reads "$PWD/*_{1,2}.fastq" -profile conda

Profiles

QuasiFlow can be run in different computing environments. Choose the appropriate one with the -profile argument, which accepts any of conda, singularity, or docker. Custom profiles can be added to the conf directory using any of the available profiles as a template.
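When the same command has to work on several machines, the profile can be chosen according to which tool is installed. This is a rough sketch under two assumptions: that each profile name matches the name of its executable (true for conda, singularity, and docker), and that finding the executable on PATH means the tool is usable (for docker, the daemon must also be running). The function name `pick_profile` is hypothetical.

```shell
# Print the first candidate whose executable is found on PATH.
# Returns 1 (and prints nothing) if none of the candidates is available.
pick_profile() {
    local tool
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            return 0
        fi
    done
    return 1
}

# Example use (commented out so that sourcing this file has no side effects):
# profile="$(pick_profile conda singularity docker)" || { echo "No supported tool found" >&2; exit 1; }
# nextflow run AlfredUg/QuasiFlow -r main --reads "$PWD/*_{1,2}.fastq" -profile "$profile"
```

The order of the arguments sets the preference, so listing conda first keeps the behaviour consistent with the walkthrough above.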

Pipeline

Outputs

Quality control

Variants and drug resistance outputs

Pipeline information output

Note: On macOS, Nextflow prints the following warning: WARN: Task runtime metrics are not reported when using macOS without a container engine.

Parameters

HyDRA parameters

Mandatory parameters

Optional parameters

Sierralocal parameters

Optional parameters

Output parameters

Optional parameters

Dependencies

Below is the list of tools that are used in the QuasiFlow pipeline. These tools are readily available and may be installed using conda via the bioconda channel.

Troubleshooting

Kindly report any issues at https://github.com/AlfredUg/QuasiFlow/issues.

License

QuasiFlow is licensed under GNU GPL v3.

Citation

Ssekagiri A, Jjingo D, Lujumba I, Bbosa N, Bugembe DL, Kateete DP, Jordan IK, Kaleebu P, Ssemwanga D. QuasiFlow: a Nextflow pipeline for analysis of NGS-based HIV-1 drug resistance data. Bioinform Adv. 2022 Nov 28;2(1):vbac089. doi: 10.1093/bioadv/vbac089. PMID: 36699347; PMCID: PMC9722223.