ekg / bamaddrg

adds sample names and read-group (RG) tags to BAM alignments
MIT License
52 stars 13 forks source link

bamaddrg: adds read groups to input BAM files, streams BAM output on stdout.

This is intended for use "fixing up" RG tags on the fly so that they reflect the source file from which the aligment originated from. This allows the "safe" merging of many files from many individuals into one stream, suitable for input into downstream processing systems such as freebayes (a population variant detector).

Usage:

To tag multiple files simultaneously, use as such:

% bamaddrg -b file1.bam -s jill -r group.s1.1 \
           -b file2.bam -s jill -r group.s1.2 \
           -b file3.bam -s bill \
           -b file4.bam \
   | freebayes ...

This would add the sample name "jill" to records from file1.bam and file2.bam, while specifying that specific read group identify alignments from each file. Alignments from file3.bam would be tagged with sample name "bill" and read group id "bill", and samples from file4.bam would be tagged with sample name "file4.bam" and read group id "file4.bam."