cbg-ethz / V-pipe

V-pipe is a pipeline designed for analysing NGS data of short viral genomes
https://cbg-ethz.github.io/V-pipe/
Apache License 2.0

V-pipe is a workflow designed for the analysis of next-generation sequencing (NGS) data from viral pathogens. It produces a number of results in a curated format (e.g., consensus sequences, SNV calls, local/global haplotypes). V-pipe is written using the Snakemake workflow management system.

Usage

Different ways of initializing V-pipe are presented below. We strongly encourage you to deploy it using the quick install script, as this is our preferred method.

To configure V-pipe, refer to the documentation in config/README.md.

V-pipe expects the input samples to be organized in a two-level directory hierarchy, with the sequencing reads provided in a sub-folder named raw_data. Further details can be found on the website. Check the utils subdirectory for mass-importer tools that can assist you in generating this hierarchy.
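As a minimal sketch of such a hierarchy, the commands below create one sample. They assume that the top-level directory is samples/ and that the two levels are a sample (or patient) identifier and a date; the names patient1 and 20100113 are hypothetical placeholders, not values required by V-pipe.

```shell
set -eu

# Hypothetical names for illustration only; substitute your own.
samples_dir="samples"
sample_id="patient1"      # first level (assumed: sample or patient identifier)
sample_date="20100113"    # second level (assumed: sampling or sequencing date)

# Create <samples>/<level 1>/<level 2>/raw_data
mkdir -p "${samples_dir}/${sample_id}/${sample_date}/raw_data"

# Placeholder: copy or link your own read files into raw_data, e.g.
# cp /path/to/your_reads.fastq "${samples_dir}/${sample_id}/${sample_date}/raw_data/"

# List every raw_data directory found under the samples directory
find "${samples_dir}" -type d -name raw_data
```

Repeating the mkdir line for each sample gives one raw_data folder per second-level directory.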

We provide virus-specific base configuration files which contain handy defaults for, e.g., HIV and SARS-CoV-2. Set the virus in the general section of the configuration file:

general:
  virus_base_config: hiv

Also see Snakemake's documentation to learn more about the command-line options available when executing the workflow.

Tutorials

Tutorials for your first steps with V-pipe for different scenarios are available in the docs/ subdirectory.

Using quick install script

To deploy V-pipe, use the installation script with the following parameters:

curl -O 'https://raw.githubusercontent.com/cbg-ethz/V-pipe/master/utils/quick_install.sh'
# the downloaded file is not marked executable, so run it through bash
bash quick_install.sh -w work

This script will download and install Miniconda, check out the V-pipe Git repository (use -b to specify which branch/tag) and set up a work directory (specified with -w) containing an executable script that runs the workflow:

cd work
# edit config.yaml and provide samples/ directory
./vpipe --jobs 4 --printshellcmds --dry-run
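The command above only performs a dry run. One possible pattern, sketched here, is to start the real run only if the dry run succeeds. The helper name run_after_dry_run is hypothetical and not part of V-pipe; it simply calls the given command twice, first with --dry-run appended and then without it.

```shell
# Run a command once with --dry-run appended; only if that succeeds,
# run the same command again without the flag.
run_after_dry_run() {
    "$@" --dry-run && "$@"
}

# Intended use inside the work directory (commented out so that this
# sketch does not start a workflow by accident):
# run_after_dry_run ./vpipe --jobs 4 --printshellcmds
```

If the dry run reports an error, the second invocation is skipped and the helper returns a non-zero status.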

Test data for checking your installation is available with the tutorials provided in the docs/ subdirectory.

Using Docker

Note: the Docker image is only set up with the components needed to run the workflow for the HIV and SARS-CoV-2 virus base configurations. Using V-pipe with other viruses or configurations might require internet connectivity to fetch additional software components.

Create config.yaml or vpipe.config and then populate the samples/ directory. For example, the following config file could be used:

general:
  virus_base_config: hiv

output:
  snv: true
  local: true
  global: false
  visualization: true
  QA: true

Then execute:

docker run --rm -it -v "$PWD":/work ghcr.io/cbg-ethz/v-pipe:master --jobs 4 --printshellcmds --dry-run
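The preparation steps for the Docker route can be sketched as a short script. It writes the example configuration shown above to config.yaml in the current directory and creates an empty samples/ directory, which you still need to populate with your own data. The script deliberately stops before calling docker.

```shell
set -eu

# Write the example configuration from this section to config.yaml
cat > config.yaml <<'EOF'
general:
  virus_base_config: hiv

output:
  snv: true
  local: true
  global: false
  visualization: true
  QA: true
EOF

# Create the samples/ directory; it must be filled with your own
# sample hierarchy before the workflow has anything to process
mkdir -p samples

echo "config.yaml written; populate samples/ and then start the container"
```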

Using Snakedeploy

First install mamba, then create and activate an environment with Snakemake and Snakedeploy:

mamba create -c conda-forge -c bioconda --name snakemake snakemake snakedeploy
conda activate snakemake

Snakemake's official workflow installer Snakedeploy can now be used:

snakedeploy deploy-workflow https://github.com/cbg-ethz/V-pipe --tag master .
# edit config/config.yaml and provide samples/ directory
snakemake --use-conda --jobs 4 --printshellcmds --dry-run
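Before deploying, it can help to confirm that the activated environment really provides the required commands. The following is a small sketch of such a check; the function name require_tools is hypothetical, and the check only verifies that the commands are on PATH, not that their versions are suitable.

```shell
# Report which of the given commands are missing from PATH.
# Returns 0 if all are found, 1 otherwise.
require_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Intended use after `conda activate snakemake`:
# require_tools snakemake snakedeploy && echo "environment looks ready"
```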

Dependencies

Computational tools

Other dependencies are managed by using isolated conda environments per rule, and below we list some of the computational tools integrated in V-pipe:

Citation

If you use this software in your research, please cite:

Fuhrmann, L., Jablonski, K. P., Topolsky, I., Batavia, A. A., Borgsmueller, N., Icer Baykal, P., Carrara, M., ... & Beerenwinkel, N. (2023). "V-Pipe 3.0: A Sustainable Pipeline for Within-Sample Viral Genetic Diversity Estimation." bioRxiv, doi:10.1101/2023.10.16.562462.

Contributions

* software maintainer ; ** group leader

Contact

We encourage users to use the issue tracker. For further enquiries, you can also contact the V-pipe Dev Team at v-pipe@bsse.ethz.ch.