mskcc / htstools

htstools

Contains three tools (dnafrags, ppflag-fixer, snp-pileup), written by Alex Studer, that process BAM files for downstream copy number analysis.

Installation

First, HTSlib must be installed on your system. To do that, download HTSlib and follow the "Building and installing" instructions on its download page.

Then, download this code, extract it, cd to where you extracted it, and run the following:

g++ -std=c++11 snp-pileup.cpp -lhts -o snp-pileup     # for snp-pileup

if HTSlib is available system-wide, or

g++ -std=c++11 -I/path/htslib/include snp-pileup.cpp -L/path/htslib/lib -lhts -Wl,-rpath=/path/htslib/lib -o snp-pileup 

if it is installed locally, where /path/htslib is the directory it was installed to. The other two tools, ppflag-fixer and dnafrags, can be compiled the same way.
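
As a sketch of building all three tools in one pass, the loop below prints one compile command per tool instead of running it. It assumes each tool's source file is named after the tool (ppflag-fixer.cpp and dnafrags.cpp, by analogy with snp-pileup.cpp) and that HTSlib is installed system-wide. build_cmd is a hypothetical helper, not part of htstools.

```shell
# Hypothetical helper: print the compile command for one tool.
# Assumes the source file is <tool>.cpp and HTSlib is installed system-wide.
build_cmd() {
    printf 'g++ -std=c++11 %s.cpp -lhts -o %s\n' "$1" "$1"
}

# Print one command per tool; review them, then pipe the output to sh to build.
for tool in snp-pileup ppflag-fixer dnafrags; do
    build_cmd "$tool"
done
```

For a local HTSlib installation, the -I, -L, and -Wl,-rpath options from the second command above would need to be added to the printed command.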

snp-pileup

This application will, given a VCF file containing SNP locations, output for each SNP the counts of the reference nucleotide, alternative nucleotide, errors, and deletions.
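
To make the four counts concrete, here is a minimal sketch of the tallying for a single SNP. It is not the tool's actual code. It assumes that "errors" means observed bases matching neither the reference nor the alternative nucleotide, and it uses '-' to mark a deletion, which is a convention of this sketch only. count_bases and its output order (reference, alternative, errors, deletions) are hypothetical.

```shell
# Hypothetical helper: tally the bases observed at one SNP position.
#   $1 = reference base (uppercase), $2 = alternative base (uppercase),
#   $3 = string of observed bases, with '-' standing for a deleted base.
# Assumption: a base that is neither ref nor alt counts as an error.
count_bases() {
    printf '%s\n' "$3" | awk -v ref="$1" -v alt="$2" '{
        r = a = e = d = 0
        n = length($0)
        for (i = 1; i <= n; i++) {
            b = toupper(substr($0, i, 1))   # compare case-insensitively
            if (b == "-") d++
            else if (b == ref) r++
            else if (b == alt) a++
            else e++
        }
        printf "%d,%d,%d,%d\n", r, a, e, d
    }'
}

count_bases A G "AAGA-T"   # prints 3,1,1,1
```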

Usage

snp-pileup <vcf file> <output file> <sequence files...>

snp-pileup requires a VCF file and one or more sequence files containing DNA. The sequence files should be in BAM format, and the VCF and all sequence files must be sorted.
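
One way to check the sorting requirement before a run is to inspect the BAM header. The sketch below assumes "sorted" means coordinate-sorted and relies on the SO tag of the @HD header line, which samtools view -H prints. is_coordinate_sorted is a hypothetical helper, and tumor.bam is a placeholder file name.

```shell
# Hypothetical helper: read a SAM/BAM header on stdin and succeed only if
# the @HD line declares coordinate sorting (SO:coordinate).
# Caveat: a file that is sorted but lacks the SO tag will fail this check.
is_coordinate_sorted() {
    grep -q '^@HD.*[[:space:]]SO:coordinate'
}

# Usage, assuming samtools is installed and tumor.bam is your input:
#   samtools view -H tumor.bam | is_coordinate_sorted \
#       && echo "tumor.bam declares coordinate sorting"
```

The header only records what the file claims about itself, so this is a quick sanity check rather than proof that the records are in order.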

Parameters

Here is a list of all parameters snp-pileup accepts and what they do. Some of them, such as -q, -Q, -A, and -x, match their equivalents in samtools mpileup and are used the same way.

You can view this list at any time by using --help.

Limitations

SNPs where multiple nucleotides change are ignored. All minimum thresholds (except the minimum read count) apply equally to all files; there is no way to set them on a per-file basis.
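
To preview which VCF records would survive the first limitation, a filter along these lines may help. It is a sketch, not the tool's own logic: it interprets "multiple nucleotides changing" as a REF or ALT field longer than one character, which also excludes multi-allelic records such as ALT "A,T". single_base_snps is a hypothetical helper, and sites.vcf is a placeholder file name.

```shell
# Hypothetical helper: read an uncompressed VCF on stdin and print only the
# records whose REF (column 4) and ALT (column 5) are a single base each.
# Header lines (starting with '#') are dropped.
single_base_snps() {
    awk -F'\t' '/^#/ { next } length($4) == 1 && length($5) == 1'
}

# Usage:
#   single_base_snps < sites.vcf | wc -l    # number of single-base records
```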