Remove gene-specific primer sequences from SAM/BAM alignments of PCR amplicons by soft-clipping
Download latest version in a ZIP package
bamclipper.sh
soft-clips gene-specific primers from BAM alignment file based on genomic coordinates of primer pairs in BEDPE format.
./bamclipper.sh -b BAM -p BEDPE [-n NTHREAD] [-s SAMTOOLS] [-g GNUPARALLEL] [-u UPSTREAM] [-d DOWNSTREAM]
Given a BAM file called NAME.bam, a new BAM file (NAME.primerclipped.bam) and its associated index (NAME.primerclipped.bam.bai) will be generated in the current working directory.
Notes: For the sake of performance and simplicity, soft-clipping is performed solely based on genomic coordinates without involving the underlying sequence. Reference sequence names and coordinates of BAM and BEDPE are assumed to be derived from identical reference sequences (e.g. hg19).
Required arguments
Options
# Clip primers by BAMClipper
>./bamclipper.sh -b examples/SRR2075598.bam -p examples/trusight_myeloid.bedpe -n 4
# done!
# SRR2075598.primerclipped.bam and its index SRR2075598.primerclipped.bam.bai are generated.
# the new SRR2075598.primerclipped.bam should be identical to the provided example (compare checksum of alignments and ignore headers)
>samtools view SRR2075598.primerclipped.bam | md5sum
6a431457fd6e892646c17d1c3029c24e -
>samtools view examples/SRR2075598.primerclipped.bam | md5sum
6a431457fd6e892646c17d1c3029c24e -
# An example line of primer pair BEDPE file (an amplicon targeting ASXL1)
>grep 31022896 examples/trusight_myeloid.bedpe
chr20 31022896 31022921 chr20 31023096 31023123
Details of demo data:
Au CH, Ho DN, Kwong A, Chan TL and Ma ESK, 2017. BAMClipper: removing primers from alignments to minimize false-negative mutations in amplicon next-generation sequencing. Scientific Reports 7:1567 (doi:10.1038/s41598-017-01703-6)