tommyau / bamclipper

Remove primer sequence from BAM alignments by soft-clipping
MIT License

-u and -d clarification #11

Closed aledj2 closed 5 years ago

aledj2 commented 5 years ago

Could you please clarify where the upstream and downstream options extend the clipping from?

If upstream extends upstream from the 5'-most base of the primer, it will extend away from the primer. But if downstream extends downstream from the 5'-most base of the primer, will it extend into the primer? If these options did the same thing, I would expect their default values to be the same.

I also assume that upstream only extends upstream from the first primer? Clipping upstream from the second primer would clip bases in the amplicon?

Do these options work the same given each strand?

I am using a bedpe file so perhaps these options are not relevant in my use case?

tommyau commented 5 years ago

To process each BAM alignment, BAMClipper first needs to assign it to a primer by matching the alignment start position against the primer 5' end position. The default parameters -u 1 and -d 5 mean that the alignment start position can be shifted by at most 1 bp upstream and 5 bp downstream of the primer 5' end and still be assigned to that primer.

Once an alignment has been assigned to a primer, BAMClipper always clips it all the way to the primer 3' end position, regardless of -u and -d. So the non-primer bases in between the primer pair will not be clipped.

Moreover, -u and -d are applied to both primers of each amplicon, relative to their own strand. Referring to the example bedpe line in README.md, the 5' position of the left primer is chr20:31022897 (1-based coordinate) and that of the right primer is chr20:31023123 (1-based coordinate). A forward alignment from the left primer can start at any base in chr20:31022896-31022902. A reverse alignment from the right primer can start at any base in chr20:31023118-31023124.
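The strand-relative window arithmetic can be sketched in a few lines of Python. This is only an illustration of the rule described in this comment, not BAMClipper's actual code; the function name `start_window` and its parameters are hypothetical, and it assumes 1-based coordinates with the alignment start taken as the 5' end of the read on its own strand.

```python
def start_window(primer_5p, strand, upstream=1, downstream=5):
    """Return the inclusive (low, high) range of 1-based positions at which
    an alignment may start and still be assigned to the primer.

    Hypothetical helper for illustration only. `upstream` and `downstream`
    mirror the -u and -d options and are interpreted relative to the
    primer's own strand.
    """
    if strand == "+":
        # Forward primer: upstream is toward smaller coordinates.
        return primer_5p - upstream, primer_5p + downstream
    if strand == "-":
        # Reverse primer: upstream is toward larger coordinates.
        return primer_5p - downstream, primer_5p + upstream
    raise ValueError("strand must be '+' or '-'")


# Primer 5' positions quoted above for the README.md bedpe example.
left = start_window(31022897, "+")
right = start_window(31023123, "-")
print("left primer:  chr20:%d-%d" % left)
print("right primer: chr20:%d-%d" % right)
```

With the defaults, this gives chr20:31022896-31022902 for the left primer and chr20:31023118-31023124 for the right primer, matching the ranges above. The window only decides which primer an alignment is assigned to; the clipping itself still runs to the primer 3' end.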

aledj2 commented 5 years ago

That's very helpful. Thank you.