hsinnan75 / MapCaller

MapCaller – An efficient and versatile approach for short-read alignment and variant detection in high-throughput sequenced genomes
MIT License

Alignment in homopolymer regions #61

Open tseemann opened 4 years ago

tseemann commented 4 years ago

This alignment is problematic.

image

Are you able to ensure "left-aligned" alignments?

hsinnan75 commented 4 years ago

I am not sure how to do that, but I'll do my best. Thank you!

tseemann commented 4 years ago

Freebayes used to come with this tool:

bamleftalign -d -c -f ref.fasta < sorted.bam > sorted.aligned.bam

image

GATK also has a tool: https://gatk.broadinstitute.org/hc/en-us/articles/360036508992-LeftAlignIndels

But ideally MapCaller would do this itself. If it doesn't, you will miss some indels.

hsinnan75 commented 4 years ago

Thank you for the information. I did observe the inconsistent alignments in homopolymer regions, but I did not know how to solve this issue. Your information is very helpful. I'll look into it.

tseemann commented 4 years ago

The problem is that this does not really solve it. It is impossible to know which A was actually deleted (or where an A was inserted); it could be any of the 6 positions. The left-align procedure just makes the placement consistent.
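To illustrate the point about equivalent positions, here is a minimal Python sketch of left-aligning a deletion. The function names, the 0-based coordinates, and the toy reference are assumptions made for this example; it is not how bamleftalign, GATK, or MapCaller implement it.

```python
def apply_deletion(ref: str, pos: int, length: int) -> str:
    """Return ref with the bases ref[pos:pos+length] removed (0-based pos)."""
    return ref[:pos] + ref[pos + length:]


def left_align_deletion(ref: str, pos: int, length: int) -> int:
    """Return the leftmost start position that yields the same deleted sequence.

    Moving the deletion one base to the left leaves the result unchanged
    exactly when the base entering the deleted window on the left equals
    the base leaving it on the right.
    """
    while pos > 0 and ref[pos - 1] == ref[pos + length - 1]:
        pos -= 1
    return pos


# Toy reference (hypothetical): a run of six A's at 0-based positions 3..8.
REF = "GCTAAAAAAGTC"

# Deleting any single A in the run gives the same sequence, so the
# "true" position cannot be recovered from the read alone.
RESULTS = {apply_deletion(REF, p, 1) for p in range(3, 9)}

# Left-alignment maps all six equivalent placements to one position.
ALIGNED = {left_align_deletion(REF, p, 1) for p in range(3, 9)}

if __name__ == "__main__":
    print(RESULTS)
    print(ALIGNED)
```

All six placements collapse to the same start position, which is what lets a variant caller count the reads as supporting one indel rather than six different ones.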

hsinnan75 commented 4 years ago

I found that ksw2 would generate more "left-aligned" alignments. You may run MapCaller with "-alg ksw2". Thank you!

tseemann commented 4 years ago

It does appear that nw produces some bad alignments around indels, and ksw2 does better.

-alg nw: image

-alg ksw2: image