Bamstats is a command line tool written in Go for computing mapping statistics from a BAM file.
Use one of the following methods to install Bamstats.
The easiest way is to download a pre-compiled binary from GitHub releases. Here is an example for installing the latest released version on 64-bit Linux:
export VERSION=0.3.5 OS=linux ARCH=x86_64 BIN=/usr/local/bin
wget -O - https://github.com/guigolab/bamstats/releases/download/v${VERSION}/bamstats-v${VERSION}-${OS}-${ARCH}.tar.gz | tar xz -C ${BIN} bamstats
The following command will install the latest version from the master branch into $GOPATH:
go get github.com/guigolab/bamstats/cmd/bamstats
Bamstats can currently compute the following mapping statistics:
The general mapping statistics include:
NH tag in BAM file)
If the data is paired-end, a section for read pairs is also reported. In addition to the above metrics, this section contains a map from each insert size to the number of reads supporting it.
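To make the insert-size map concrete, here is a minimal Go sketch of how such a histogram could be built. It is not taken from the Bamstats source: the function name `insertSizeHistogram` and the input values are hypothetical, and the insert sizes are assumed to have been extracted from the alignments already (for example from the signed SAM TLEN field).

```go
package main

import "fmt"

// insertSizeHistogram counts how many reads support each insert size.
// Hypothetical helper for illustration only; not part of Bamstats.
func insertSizeHistogram(sizes []int) map[int]int {
	hist := make(map[int]int)
	for _, s := range sizes {
		if s < 0 {
			s = -s // assumption: signed values are folded to their absolute value
		}
		hist[s]++
	}
	return hist
}

func main() {
	// Made-up insert sizes, one per read.
	sizes := []int{150, -150, 200, 150, 310}
	hist := insertSizeHistogram(sizes)
	fmt.Println(hist[150], hist[200], hist[310])
}
```

Each key of the resulting map is an insert size and each value is the number of reads observed with that size, which matches the shape described above.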
The genome coverage statistics are computed for RNA-seq data and include counts for the following genomic regions:
The above metrics are computed for continuous and split mapped reads. An aggregated total is also computed across elements and read types.
The --uniq (or -u) command line flag additionally reports genome coverage statistics for uniquely mapped reads.
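One common convention for deciding whether a read is uniquely mapped is to check the NH tag, which in the SAM specification records the number of reported alignments for the read. The Go sketch below assumes that convention (NH equal to 1) and uses a hypothetical `isUniquelyMapped` helper that inspects the optional fields of a SAM text record. Whether Bamstats applies exactly this rule is not stated in this README.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// isUniquelyMapped reports whether the optional SAM fields contain NH:i:1.
// Hypothetical helper; assumes "unique" means exactly one reported alignment.
func isUniquelyMapped(optFields []string) bool {
	for _, f := range optFields {
		if !strings.HasPrefix(f, "NH:i:") {
			continue
		}
		n, err := strconv.Atoi(strings.TrimPrefix(f, "NH:i:"))
		return err == nil && n == 1
	}
	return false // no NH tag: treated as not unique in this sketch
}

func main() {
	// Made-up optional fields from two SAM records.
	fmt.Println(isUniquelyMapped([]string{"NM:i:0", "NH:i:1"}))
	fmt.Println(isUniquelyMapped([]string{"NH:i:4"}))
}
```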
The RNA-seq statistics follow IHEC recommendations for RNA-seq data quality metrics. They include counts for the following regions:
rRNA)
They also include fractional metrics for the following read types:
Some examples of the program output can be found in the data folder of this GitHub repository:
coverageUniq stats are reported as an additional JSON object)
Please see here for a complete description of the output fields and how they are calculated.
This software is released under a BSD-style license. Please check the LICENSE file for more details.