tommyau / bamclipper

Remove primer sequence from BAM alignments by soft-clipping
MIT License

soft clipping primers from BAM files using sequences #3

Closed PromitaBose closed 7 years ago

PromitaBose commented 7 years ago

Hello. Is it possible to soft-clip gene-specific primers from BAM files based on primer sequences in FASTA format instead of genomic coordinates?

PromitaBose commented 7 years ago

Additionally, I only want to soft-clip an aligned read if the primer sequence shows up at the read edges.

tommyau commented 7 years ago

Currently, BAMClipper soft-clips primer sequences at read edges based on BEDPE locations, which are usually derived from primer sequences in FASTA. Do you mean you are having difficulty preparing a BEDPE file from your primer sequence FASTA?

PromitaBose commented 7 years ago

Many thanks for your response. 1) How do you derive the BEDPE locations from primer sequences in FASTA? 2) Why use BEDPE locations instead of the sequences?

tommyau commented 7 years ago

Primer sequences can be mapped to genomic locations using these in silico PCR tools:

Please let me know if you have difficulty preparing the BEDPE file. Would it be handy if BAMClipper included a conversion tool to turn primer sequences into BEDPE for you?
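For illustration, once an in silico PCR tool has reported where each primer lands, writing the BEDPE file is mostly a matter of laying out six tab-separated columns per primer pair. The sketch below assumes the usual bedtools-style BEDPE layout (chrom1, start1, end1, chrom2, start2, end2, with 0-based starts and exclusive ends). The `PrimerPair` type, the `to_bedpe_line` helper, and the example coordinates are all hypothetical, and the exact columns BAMClipper expects should be checked against its README.

```python
from typing import NamedTuple


class PrimerPair(NamedTuple):
    """Hypothetical record for one primer pair mapped to the reference."""
    chrom: str
    fwd_start: int  # 0-based, inclusive
    fwd_end: int    # 0-based, exclusive
    rev_start: int  # 0-based, inclusive
    rev_end: int    # 0-based, exclusive


def to_bedpe_line(pair: PrimerPair) -> str:
    """Format one primer pair as a six-column, tab-separated BEDPE line."""
    # Sanity check (an assumption of this sketch): the forward primer lies
    # upstream of the reverse primer and the two do not overlap.
    if not (0 <= pair.fwd_start < pair.fwd_end <= pair.rev_start < pair.rev_end):
        raise ValueError(f"inconsistent primer coordinates: {pair}")
    fields = [
        pair.chrom, pair.fwd_start, pair.fwd_end,
        pair.chrom, pair.rev_start, pair.rev_end,
    ]
    return "\t".join(str(f) for f in fields)


# Made-up coordinates, purely for demonstration.
pairs = [
    PrimerPair("chr1", 1000, 1020, 1180, 1200),
    PrimerPair("chr2", 5000, 5022, 5210, 5230),
]
bedpe_text = "\n".join(to_bedpe_line(p) for p in pairs)
print(bedpe_text)
```

The coordinates themselves still have to come from mapping the primer sequences to the reference; this only covers the final formatting step.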

Regarding your second question, BEDPE locations are preferred for performance reasons. Since the aligner (e.g. BWA-MEM) has already matched the primer sequence(s) in the NGS reads to the reference genome, BAMClipper soft-clips the primers according to those coordinates without repeating the matching (NGS reads versus primer sequences).
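To make the coordinate-based idea concrete, here is a minimal sketch of soft-clipping the left end of a single alignment up to a known primer end position, using only the alignment start and CIGAR string. It is not BAMClipper's actual implementation; the function name `soft_clip_left` is hypothetical, and the sketch assumes 0-based reference coordinates and a CIGAR without hard clips or padding. No sequence comparison is involved, which is the point of working from locations.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNS=X])")
CIGAR_FULL = re.compile(r"(\d+[MIDNS=X])+")


def soft_clip_left(pos: int, cigar: str, primer_end: int) -> tuple[int, str]:
    """Soft-clip aligned bases that lie before primer_end (exclusive).

    pos is the 0-based reference start of the alignment. Returns the new
    start position and the new CIGAR string.
    """
    if not CIGAR_FULL.fullmatch(cigar):
        raise ValueError(f"unsupported CIGAR for this sketch: {cigar!r}")
    if primer_end <= pos:
        return pos, cigar  # alignment starts after the primer; nothing to clip

    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    ref = pos      # current reference coordinate while walking the CIGAR
    clipped = 0    # number of read bases converted to soft clip
    rest = []      # operations kept unchanged after the clip

    for idx, (n, op) in enumerate(ops):
        if op in "M=X":  # consumes both read and reference
            if ref >= primer_end:
                rest = ops[idx:]
                break
            take = min(n, primer_end - ref)
            clipped += take
            ref += take
            if take < n:
                rest = [(n - take, op)] + ops[idx + 1:]
                break
        elif op in "SI":  # consumes read only: fold into the soft clip
            clipped += n
        else:             # D or N consumes reference only: skip over it
            ref += n

    new_cigar = f"{clipped}S" + "".join(f"{n}{op}" for n, op in rest)
    return ref, new_cigar


print(soft_clip_left(100, "50M", 120))         # (120, '20S30M')
print(soft_clip_left(100, "5S10M2I38M", 115))  # (115, '22S33M')
```

A real tool would also handle the right end of the read, reverse-strand reads, mate information, and the choice of which primer pair applies to a given read. Those parts are left out here.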