broadinstitute / pilon

Pilon is an automated genome assembly improvement and variant detection tool
GNU General Public License v2.0

BWA only marks FR reads as proper when mapping mate pairs #41

Closed melop closed 7 years ago

melop commented 7 years ago

Hello,

I mapped my mate pairs with the old BWA algorithm (that is, first producing one sai file per mate and then combining the two into a SAM file). I did this because BWA mem is not recommended for reads shorter than 70 bp, which is the case for my mate-pair reads.

However, it seems that BWA only sets the "properly paired" flag for the FR orientation, which is actually the "improper" orientation for MP libraries. I am wondering whether Pilon is affected by this, and whether it automatically detects the orientation of the library.
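
For readers unfamiliar with the terminology, here is a minimal sketch of how a pair's orientation and its "properly paired" bit can be read from the SAM FLAG field. The bit values come from the SAM specification; the function names `pair_orientation` and `is_proper_pair` are hypothetical, and this is not Pilon's or BWA's actual logic. It assumes both mates map to the same reference sequence and compares start positions only.

```python
# SAM FLAG bits (from the SAM specification).
PAIRED = 0x1
PROPER_PAIR = 0x2
UNMAPPED = 0x4
MATE_UNMAPPED = 0x8
REVERSE = 0x10
MATE_REVERSE = 0x20


def is_proper_pair(flag):
    """True if the aligner set the 'properly paired' bit."""
    return bool(flag & PROPER_PAIR)


def pair_orientation(flag, pos, mate_pos):
    """Classify a pair as 'FR', 'RF' or 'tandem' (hypothetical helper).

    Assumes both mates are on the same reference sequence. Returns None
    when the read is unpaired or either mate is unmapped.
    """
    if not flag & PAIRED or flag & (UNMAPPED | MATE_UNMAPPED):
        return None
    read_rev = bool(flag & REVERSE)
    mate_rev = bool(flag & MATE_REVERSE)
    if read_rev == mate_rev:
        return "tandem"  # both mates on the same strand
    # Strand of whichever mate starts leftmost (ties go to this read).
    left_rev = read_rev if pos <= mate_pos else mate_rev
    # Leftmost forward means the mates point inward (FR); otherwise outward (RF).
    return "RF" if left_rev else "FR"
```

A mate-pair library that has not been flipped is expected to yield mostly "RF" pairs under this classification, which is why an aligner that only accepts FR pairs as proper would mark few of them.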

In the log, it looks like Pilon identified 100% of the reads as being in the FR orientation, and the insert size was not inferred correctly, which doesn't seem to make sense:

  mapped/3kbMP.lib1.merged.bam: 56506620 reads, 0 filtered, 53087740 mapped, 679983 proper, 27036006 stray, FR 100% 2235+/-1392, max 6409 jumps
  mapped/8kbMP.lib1.merged.bam: 73972136 reads, 0 filtered, 69497300 mapped, 816365 proper, 35498534 stray, FR 100% 4431+/-3349, max 14479 jumps
  mapped/12kbMP.lib1.merged.bam: 85458746 reads, 0 filtered, 80026546 mapped, 950794 proper, 40984386 stray, FR 100% 6650+/-6018, max 24702 jumps
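
A quick way to see how few reads are flagged as proper is to divide the "proper" count by the "mapped" count for each library. The sketch below does that for the three log lines above; the regular expression is an assumption inferred only from these lines, not a documented Pilon log format, and `proper_fraction` is a hypothetical helper name.

```python
import re

# The three log lines quoted above, copied verbatim.
LOG_LINES = [
    "mapped/3kbMP.lib1.merged.bam: 56506620 reads, 0 filtered, 53087740 mapped, "
    "679983 proper, 27036006 stray, FR 100% 2235+/-1392, max 6409 jumps",
    "mapped/8kbMP.lib1.merged.bam: 73972136 reads, 0 filtered, 69497300 mapped, "
    "816365 proper, 35498534 stray, FR 100% 4431+/-3349, max 14479 jumps",
    "mapped/12kbMP.lib1.merged.bam: 85458746 reads, 0 filtered, 80026546 mapped, "
    "950794 proper, 40984386 stray, FR 100% 6650+/-6018, max 24702 jumps",
]

# Assumed pattern: "<N> mapped, <M> proper" appears once per line.
_COUNTS = re.compile(r"(\d+) mapped, (\d+) proper")


def proper_fraction(line):
    """Return (bam_name, proper / mapped) for one log line."""
    match = _COUNTS.search(line)
    if match is None:
        raise ValueError("line does not match the assumed log format")
    mapped, proper = int(match.group(1)), int(match.group(2))
    name = line.split(":", 1)[0].strip()
    return name, proper / mapped


if __name__ == "__main__":
    for log_line in LOG_LINES:
        bam, fraction = proper_fraction(log_line)
        print(f"{bam}: {fraction:.2%} of mapped reads are proper")
```

For all three libraries the fraction comes out at roughly 1.2 to 1.3 percent, so the reported insert-size statistics rest on a very small subset of the mapped reads.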

Best Regards, Ray

w1bw commented 7 years ago

Sorry this slipped through the cracks. Yes, bwa aln only sets the proper-pair flag for the FR orientation. When I was at the Broad, we used to flip the reads of mate pair libs prior to alignment because of this. Pilon computes the insert size based only on the "proper pairs", which, as you can see from the log above, are only ~1% of your mapped reads.
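
As a rough illustration of what "flipping" the reads could look like, the sketch below reverse-complements every record of a FASTQ stream and reverses its quality string. This is an assumption about the approach, not the Broad's actual tooling: the comment above does not say which program was used or whether both mates were flipped. The function names are hypothetical. Applying it to both mate files should turn an RF library into an FR one before alignment.

```python
# Complement table for A/C/G/T/N in both cases; any other character
# (for example IUPAC ambiguity codes) is left unchanged by translate().
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def flip_fastq(lines):
    """Yield the lines of a flipped FASTQ stream, without trailing newlines.

    Assumes strict 4-line records (header, sequence, '+', qualities).
    A truncated final record is silently dropped by zip().
    """
    it = iter(lines)
    for header, seq, plus, qual in zip(it, it, it, it):
        yield header.rstrip("\n")
        yield reverse_complement(seq.rstrip("\n"))
        yield plus.rstrip("\n")
        # Qualities must be reversed so they stay aligned with the bases.
        yield qual.rstrip("\n")[::-1]
```

Usage, with hypothetical file names:

```python
with open("mate_1.fastq") as src, open("mate_1.flipped.fastq", "w") as dst:
    for line in flip_fastq(src):
        dst.write(line + "\n")
```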