telatin / bamtocov

🏔 coverage extraction from BAM/CRAM files, supporting targets 📊  
https://telatin.github.io/bamtocov/
MIT License

How does bamtocov deal with paired-end BAM files from stranded libraries? #9

Open kerenzhou062 opened 2 years ago

kerenzhou062 commented 2 years ago

@telatin Dear Dr. Telatin,

I'm curious how bamtocov deals with paired-end BAM files from stranded libraries when run with --stranded.

Will bamtocov automatically treat mate reads as coming from the reverse strand? For example, if mate1 of a fragment is recorded as '-' strand in the BAM, this means that mate1 actually comes from the '+' strand. With --stranded set, mate1 will be counted as '+' strand. My question is: will mate2 of this fragment automatically be counted as '+' strand as well in this case?

Also, will the new '--extendReads INT' parameter work well with paired-end BAM files?

Best,

Keren

telatin commented 2 years ago

Generally speaking, the algorithm counts each read on the strand reported for it in the BAM file. There's no general reason to do otherwise, as some paired-end libraries might behave differently.

One approach I would advise is to focus on either the first or the second mate of each pair: for example, with the -F flag you can exclude the second mate from the analysis.
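As an illustration of how such an exclusion value could be put together, here is a minimal Python sketch. The bit values come from the SAM specification; the helper names (`exclude_mask`, `is_excluded`) and the choice of base mask (unmapped, secondary, QC-fail, duplicate) are assumptions made for this example, so check `bamtocov --help` for the actual default and the exact semantics of -F before relying on it.

```python
# SAM flag bits as defined in the SAM specification.
FLAG_UNMAPPED = 0x4
FLAG_REVERSE = 0x10
FLAG_READ1 = 0x40       # first mate of the pair
FLAG_READ2 = 0x80       # second mate of the pair
FLAG_SECONDARY = 0x100
FLAG_QCFAIL = 0x200
FLAG_DUPLICATE = 0x400


def exclude_mask(*bits):
    """Combine individual flag bits into one exclusion mask (hypothetical helper)."""
    mask = 0
    for bit in bits:
        mask |= bit
    return mask


def is_excluded(flag, mask):
    """A read is dropped when it has ANY of the mask bits set."""
    return (flag & mask) != 0


# Assumed base mask: a common set of reads to skip in coverage tools.
BASE_MASK = exclude_mask(FLAG_UNMAPPED, FLAG_SECONDARY, FLAG_QCFAIL, FLAG_DUPLICATE)

# Adding the READ2 bit keeps only first mates (and unpaired reads).
FIRST_MATE_ONLY = BASE_MASK | FLAG_READ2

print(BASE_MASK, FIRST_MATE_ONLY)
```

Under these assumptions the resulting number would be the value passed to -F; a read with flag 147 (second mate, reverse strand) would be dropped, while its mate with flag 99 would be kept.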

In bamtocounts 2.7 (https://telatin.github.io/bamtocov/tools/bamtocounts.html), for example, I added the --paired mode, which counts fragments rather than reads (thus using the strand of the first read), but in BamToCov this would not be the general use case.
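To make the fragment-level view concrete, the sketch below shows one way to assign both mates of a pair to the same fragment strand using only their SAM flags. It is not how bamtocov or bamtocounts is implemented; the function names and the `read1_is_antisense` parameter are hypothetical, and the logic assumes properly oriented (forward/reverse) pairs. With `read1_is_antisense=True` it models the scenario in the question, where mate1 maps to the strand opposite to the one the fragment came from.

```python
FLAG_REVERSE = 0x10  # read aligned to the reverse strand (SAM specification)
FLAG_READ2 = 0x80    # read is the second mate of the pair


def flip(strand):
    """Return the opposite strand symbol."""
    return "-" if strand == "+" else "+"


def read_strand(flag):
    """Strand as reported in the BAM record."""
    return "-" if flag & FLAG_REVERSE else "+"


def fragment_strand(flag, read1_is_antisense=True):
    """Infer the fragment strand from one mate's flag (hypothetical helper).

    Assumes a forward/reverse pair, so mate2 aligns to the strand
    opposite to mate1.
    """
    strand = read_strand(flag)
    if flag & FLAG_READ2:
        # Express mate2 in terms of mate1's orientation.
        strand = flip(strand)
    if read1_is_antisense:
        # Library assumption: mate1 is the reverse complement of the fragment.
        strand = flip(strand)
    return strand


# Mate1 reported as '-' (flag 83) and its mate2 reported as '+' (flag 163).
print(fragment_strand(83), fragment_strand(163))
```

Both mates of such a pair resolve to the same strand, which is the behaviour the question asks about; whether that convention applies depends on the library protocol.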


As for the second question, --extendReads is being updated (but is still experimental), and I will get back to you with the next release :)