guigolab / bamstats

A command line tool to compute mapping statistics from a BAM file
BSD 3-Clause "New" or "Revised" License

Bamstats


Bamstats is a command line tool written in Go for computing mapping statistics from a BAM file.

Installation instructions

Use one of the following methods to install Bamstats.

Install a released version

The easiest way is to download a pre-compiled binary from the GitHub releases page. Here is an example for installing the latest released version on 64-bit Linux:

export VERSION=0.3.5 OS=linux ARCH=x86_64 BIN=/usr/local/bin
wget -O - https://github.com/guigolab/bamstats/releases/download/v${VERSION}/bamstats-v${VERSION}-${OS}-${ARCH}.tar.gz | tar xz -C ${BIN} bamstats
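For other platforms, the OS and architecture parts of the download URL can be derived from `uname` instead of being typed by hand. The following is a minimal sketch: the helper name `bamstats_url` is hypothetical, the URL pattern is copied from the command above, and it is an assumption that a release asset exists for every OS/architecture pair that `uname` may report.

```shell
# Build the release tarball URL for a given version, OS and architecture.
# The naming pattern is taken from the install command above; the existence
# of an asset for a given OS/ARCH pair is an assumption.
bamstats_url() {
  version=$1 os=$2 arch=$3
  printf 'https://github.com/guigolab/bamstats/releases/download/v%s/bamstats-v%s-%s-%s.tar.gz\n' \
    "$version" "$version" "$os" "$arch"
}

# Detect the local platform: `uname -s` lowercased gives e.g. "linux",
# `uname -m` gives e.g. "x86_64".
OS=$(uname -s | tr '[:upper:]' '[:lower:]')
ARCH=$(uname -m)

# Print the URL for the local platform.
bamstats_url 0.3.5 "$OS" "$ARCH"
```

The printed URL can then be passed to the same `wget ... | tar xz` pipeline shown above.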

Install the latest version with go

The following command will install the latest version from the master branch into $GOPATH:

go get github.com/guigolab/bamstats/cmd/bamstats
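With `go get`, the compiled binary ends up in the `bin` directory under `$GOPATH`, which may not be on your `PATH`. The sketch below adds it if needed. It assumes `GOBIN` is unset and `GOPATH` holds a single directory (defaulting to `$HOME/go` when unset); the helper name `go_bin_dir` is hypothetical.

```shell
# Directory where `go get` places binaries: $GOPATH/bin, with GOPATH
# defaulting to $HOME/go when unset. Assumes GOBIN is not set and
# GOPATH contains a single directory.
go_bin_dir() {
  printf '%s/bin\n' "${GOPATH:-$HOME/go}"
}

# Prepend the directory to PATH only if it is not already listed.
case ":$PATH:" in
  *":$(go_bin_dir):"*) ;;
  *) PATH="$(go_bin_dir):$PATH"; export PATH ;;
esac
```

Adding these lines to your shell profile makes the change persistent across sessions.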

Provided statistics

Bamstats can currently compute the following mapping statistics:

General

The general mapping statistics include:

If the data is paired-end, a section for read pairs is also reported. In addition to the above metrics, this section contains a map from each insert size to its support, expressed as number of reads.

Genome coverage

The genome coverage statistics are computed for RNA-seq data and include counts for the following genomic regions:

The above metrics are computed for both continuous and split mapped reads. An aggregated total across elements and read types is also computed.

The --uniq (or -u) command line flag additionally reports genome coverage statistics for uniquely mapped reads.

RNA-seq

The RNA-seq statistics follow IHEC recommendations for RNA-seq data quality metrics. They include counts for the following regions:

They also include other fractional metrics for the following read types:

Output examples

Some examples of the program output can be found in the data folder of this GitHub repository:

Please see here for a complete description of the output fields and how they are calculated.

License

This software is released under a BSD-style license. Please check the LICENSE file for more details.