brentp / bwa-meth

fast and accurate alignment of BS-Seq reads using bwa-mem and a 3-letter genome
https://arxiv.org/abs/1401.1129
MIT License

Mapq #55

Open nchernia opened 6 years ago

nchernia commented 6 years ago

I’ve noticed that running bwa-meth results in a higher rate of MAPQ 0 reads than vanilla bwa (I tested this on non-bisulfite-converted data). Is there any way to ameliorate this, or is it just a natural consequence of needing to distinguish between methylated and unmethylated loci?
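One way to put a number on that comparison is to compute the MAPQ 0 fraction from each aligner's SAM output. A minimal Python sketch, assuming plain-text SAM records (FLAG in column 2, MAPQ in column 5); the function name, the `record` helper, and the example records are hypothetical and not part of bwa-meth:

```python
def mapq_zero_fraction(sam_lines):
    """Fraction of mapped primary alignments whose MAPQ is 0."""
    mapped = zero = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag, mapq = int(fields[1]), int(fields[4])
        if flag & 0x4 or flag & 0x900:
            continue  # skip unmapped, secondary, and supplementary records
        mapped += 1
        zero += mapq == 0
    return zero / mapped if mapped else 0.0


def record(name, flag, mapq):
    """Build a made-up SAM record with the 11 mandatory fields."""
    return "\t".join([name, str(flag), "chr1", "100", str(mapq),
                      "4M", "*", "0", "0", "ACGT", "IIII"])


example = [
    "@HD\tVN:1.6",
    record("r1", 0, 60),
    record("r2", 16, 0),    # mapped, MAPQ 0
    record("r3", 4, 0),     # unmapped, ignored
    record("r4", 0, 30),
    record("r5", 256, 0),   # secondary, ignored
    record("r6", 0, 12),
]
fraction = mapq_zero_fraction(example)
print(fraction)  # prints 0.25
```

Running the same function over the bwa and bwa-meth outputs for identical reads would give two directly comparable fractions.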

brentp commented 6 years ago

This is the only place where bwa-meth adjusts MAPQ, and there it sets it to 1. MAPQ 0 means the read maps to multiple places equally well, which is more likely to occur with a simplified reference.
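To illustrate why a simplified reference produces more ties, here is a small sketch, assuming a forward-strand C-to-T conversion as the 3-letter reduction. The two loci are made-up sequences, and `c_to_t` is a hypothetical helper, not a bwa-meth function:

```python
def c_to_t(seq: str) -> str:
    """In-silico conversion of the forward strand: every C becomes T."""
    return seq.upper().replace("C", "T")


# Two hypothetical loci that differ only at C/T positions.
locus_a = "ACGTCCGATTAC"
locus_b = "ATGTCTGATTAC"

differences_before = sum(a != b for a, b in zip(locus_a, locus_b))
converted_a = c_to_t(locus_a)
converted_b = c_to_t(locus_b)

print(differences_before)          # prints 2
print(converted_a == converted_b)  # prints True
```

In the 4-letter genome a read from either locus has a better match at one of them; after conversion both loci are identical, so the aligner sees two equally good hits and reports MAPQ 0.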

nchernia commented 6 years ago

Yes, but I was wondering whether this could be addressed at a more fundamental level, perhaps via the flags bwa-meth uses to call bwa. I asked Heng Li and he suggested asking here.

BTW, what's the motivation behind that heuristic? Those are reads that bwa deems good enough to map. Is there a reason, specific to bisulfite alignment, to label them as failing QC?

brentp commented 6 years ago

If you can show that it performs better without that, I'll remove it. I've actually found (for non-BS-Seq data) that alignments and regions with a high NM (edit distance) are often bad as well.
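For anyone who wants to test that observation on their own data, the NM value can be read from the optional fields of each SAM record. A rough sketch, assuming the aligner emits the standard `NM:i:` tag; the function name and the example records are hypothetical:

```python
def edit_distance(sam_line):
    """Return the NM:i value of a SAM record, or None if the tag is absent."""
    # Optional fields start after the 11 mandatory columns.
    for field in sam_line.rstrip("\n").split("\t")[11:]:
        if field.startswith("NM:i:"):
            return int(field[5:])
    return None


# Made-up records: one with an NM tag, one without.
with_nm = "\t".join(["r1", "0", "chr1", "100", "60", "10M", "*", "0", "0",
                     "ACGTACGTAC", "IIIIIIIIII", "AS:i:5", "NM:i:7"])
without_nm = "\t".join(["r2", "0", "chr1", "200", "60", "10M", "*", "0", "0",
                        "ACGTACGTAC", "IIIIIIIIII"])

print(edit_distance(with_nm))     # prints 7
print(edit_distance(without_nm))  # prints None
```

Any cutoff for "high" NM would have to be chosen empirically; none is stated in this thread.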

nchernia commented 6 years ago

I was thinking there should be an optional flag to turn it off (I was eventually going to code it and open a pull request). We expect chimeric reads in our experiment, but if one were not expecting chimeras, I could see the heuristic being useful.

brentp commented 6 years ago

Still the same criterion: if you can show that it is more accurate without the heuristic, it can be removed. I'm hesitant to add more command-line flags.