An alignment and variant-calling pipeline for Illumina deep sequencing of HIV-1, based on the probabilistic aligner HMMER.
hivmmer-filter
The number of duplicates is tracked to enable correct inference of frequencies
later in the pipeline.
hivmmer-translate
hivmmer-codons
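To illustrate why duplicate counts matter for frequency inference, here is a minimal sketch in Python. It is not hivmmer's actual implementation: the function names `collapse_duplicates` and `weighted_frequency` and the toy reads are assumptions made for illustration only.

```python
from collections import Counter


def collapse_duplicates(reads):
    """Collapse identical sequences, keeping how often each one occurred."""
    return Counter(reads)


def weighted_frequency(counts, predicate):
    """Fraction of the original reads (not of unique sequences) matching predicate."""
    total = sum(counts.values())
    matching = sum(n for seq, n in counts.items() if predicate(seq))
    return matching / total


# Toy input (hypothetical): three identical reads and one variant read.
reads = ["ACGT", "ACGT", "ACGT", "ACGA"]
counts = collapse_duplicates(reads)

# Only two unique sequences remain, but the stored counts still describe four reads.
freq = weighted_frequency(counts, lambda seq: seq.endswith("A"))
print(len(counts), freq)
```

Without the stored counts, the variant would appear in one of two unique sequences (50%) instead of one of four reads (25%).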
hivmmer --id ID --fq1 FASTQ1 --fq2 FASTQ2 --ref REFERENCE [--cpu N]
[-h|--help] [-v|--version]
ID
specifies a name for the analysis that will be used as the basename for
all output.
FASTQ1
and FASTQ2
are the forward and reverse Illumina reads.
Optionally, you can use N
threads to speed up the HMMER stages of the pipeline.
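As a sketch of how the options above fit together, the following Python snippet assembles a hivmmer command line from the documented flags. The helper `build_hivmmer_command`, the sample name, and the FASTQ file names are hypothetical, and `REFERENCE` is left as a placeholder because its expected form is not described here.

```python
import shlex
import subprocess


def build_hivmmer_command(sample_id, fq1, fq2, ref, cpu=None):
    """Build the argument list for hivmmer from the documented options."""
    cmd = ["hivmmer", "--id", sample_id, "--fq1", fq1, "--fq2", fq2, "--ref", ref]
    if cpu is not None:
        # --cpu is optional; it sets the number of threads for the HMMER stages.
        cmd += ["--cpu", str(cpu)]
    return cmd


def run_hivmmer(cmd):
    """Run the command; requires hivmmer to be installed and in your PATH."""
    subprocess.run(cmd, check=True)


# Hypothetical sample name and file names; REFERENCE is a placeholder.
cmd = build_hivmmer_command(
    "sample01", "sample01_R1.fastq", "sample01_R2.fastq", "REFERENCE", cpu=4
)

# Print the command for inspection instead of running it.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Call `run_hivmmer(cmd)` to execute the pipeline once the inputs exist.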
hivmmer requires Python 3.7.
On 64-bit Linux, it is also possible to install hivmmer using prebuilt packages from the kantorlab Anaconda channel.
First, install the Anaconda or Miniconda distribution of Python 3.
Once the conda
command is in your PATH, hivmmer and all its dependencies can
be installed into an isolated conda environment with a single command:
conda create -c kantorlab -n hivmmer hivmmer
Once installed, activate the hivmmer
conda environment with:
source activate hivmmer
This will place hivmmer and all its dependencies in your PATH.
We have primarily tested hivmmer on CentOS 6.8, but in theory it should run on any 64-bit Linux system with glibc >= 2.12.
All relevant conda recipes are available from the Kantor Lab's conda-recipes repository.
On systems other than 64-bit Linux, you can run hivmmer via a Docker container.
First, visit the Docker website to download and install Docker for your host operating system.
Second, pull the pre-compiled hivmmer Docker image, which includes all dependencies, from DockerHub:
docker pull kantorlab/hivmmer
Each time you want to use hivmmer, run the Docker image with:
docker run -it kantorlab/hivmmer
This will launch a new Docker container with hivmmer and provide an interactive prompt for accessing the container.
hivmmer can be installed with pip using the included setup.py, and has the following dependencies on external tools (which must be in your PATH):
Note: PEAR source code is available under an academic license from https://www.h-its.org/en/research/sco/software/#NextGenerationSequencingSequenceAnalysis.
hivmmer comes with prepackaged amino acid profile Hidden Markov Models for the entire HIV genome, based on curated multiple-sequence alignments downloaded from the Los Alamos HIV Sequence Database.
Development of hivmmer is made possible through funding from the National Institutes of Health under awards R01AI108441, R01AI120792, R01AI147333, and the Providence/Boston Center for AIDS Research (P30AI042853).
Mark Howison (mhowison@ripl.org)
For bug reports and questions, please create an issue on GitHub.
Copyright 2018, Brown University, Providence, RI. All Rights Reserved.
Copyright 2019-2020, Innovative Policy Lab (d/b/a Research Improving People's Lives), Providence, RI. All Rights Reserved.
See LICENSE for full terms of use.