AndersenLab / VCF-kit

VCF-kit: Assorted utilities for the variant call format
http://www.andersenlab.org
MIT License

VCF-kit - Documentation

VCF-kit is a command-line based collection of utilities for performing analysis on Variant Call Format (VCF) files. A summary of the commands is provided below.

Command   Description
calc      Obtain frequency/count of genotypes and alleles.
call      Compare variants identified from sequences obtained through alternative methods against a VCF.
filter    Filter variants with a minimum or maximum number of REF, HET, ALT, or missing calls.
geno      Various operations at the genotype level.
genome    Reference genome processing and management.
hmm       Hidden Markov model for use in imputing genotypes from parental genotypes in linkage studies.
phylo     Generate dendrograms from a VCF.
primer    Generate primers for variant validation.
rename    Add a prefix, suffix, or substitute a string in sample names.
tajima    Calculate Tajima’s D.
vcf2tsv   Convert a VCF to TSV.

Installation

VCF-kit has been upgraded to Python 3.

VCF-kit has been tested with Python 3.6. It relies on additional software for a variety of tasks: BWA, SAMtools, BCFtools, BLAST, MUSCLE, and Primer3.

You can install these dependencies and VCF-kit using conda, or you can use a Docker image.

Conda

conda config --add channels bioconda
conda config --add channels conda-forge
conda create -n vcf-kit \
  danielecook::vcf-kit=0.2.6 \
  "bwa>=0.7.17" \
  "samtools>=1.10" \
  "bcftools>=1.10" \
  "blast>=2.2.31" \
  "muscle>=3.8.31" \
  "primer3>=2.5.0"

conda activate vcf-kit
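
Once the environment is active, it can be worth confirming that VCF-kit and its external tools are actually on the PATH. The following is a minimal sketch, not part of VCF-kit itself: the helper name `check_tools` is hypothetical, and the executable names (in particular `blastn` for BLAST and `primer3_core` for Primer3) are assumptions about what the conda packages install.

```shell
#!/bin/sh
# Hypothetical helper: report which of the given executables are missing from PATH.
# Prints one line per missing tool and returns the number of missing tools.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=$((missing + 1))
        fi
    done
    return "$missing"
}

# Assumed executable names for VCF-kit and its dependencies.
check_tools vk bwa samtools bcftools blastn muscle primer3_core \
    && echo "all tools found"
```

If any line starting with `missing:` is printed, the corresponding package probably did not install into the active environment.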

Docker

You can also run VCF-kit with all dependencies preinstalled using Docker:

docker run -it andersenlab/vcf-kit vk
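
A container started this way cannot see files on the host, so working with local VCF files usually means mounting a directory. The wrapper below is a sketch under assumptions: the function name `vk_docker` and the `/data` mount point are hypothetical, and the `DOCKER` variable is only an illustrative way to swap in another container runtime. The `-it` flags are dropped because the wrapper is intended for non-interactive use.

```shell
#!/bin/sh
# Hypothetical wrapper: run vk inside the container against files in the
# current directory. /data is an arbitrary mount point chosen for this sketch.
vk_docker() {
    "${DOCKER:-docker}" run --rm \
        -v "$PWD":/data \
        -w /data \
        andersenlab/vcf-kit vk "$@"
}
```

Any arguments given to `vk_docker` are passed through unchanged to `vk`, and relative file paths resolve against the mounted current directory.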