hsinnan75 / MapCaller

MapCaller – An efficient and versatile approach for short-read alignment and variant detection in high-throughput sequenced genomes
MIT License

Difference between -alg ksw2 and nw #31

Closed (tseemann closed this issue 4 years ago)

tseemann commented 4 years ago

What is the difference?

   -alg STR      gapped alignment algorithm (option: nw|ksw2)

nw = Needleman-Wunsch = glocal alignment = whole read must align?

ksw2 = Smith-Waterman = local alignment = any substring of read must align?

hsinnan75 commented 4 years ago

nw = Needleman-Wunsch; ksw2 = the Suzuki-Kasahara algorithm (I modified the source file ksw2_extz2_sse.c from https://github.com/lh3/ksw2). In fact, the two algorithms produce identical alignments most of the time. They differ only when aligning longer sequence fragments that contain indels, where they may report different indels at different positions.
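To illustrate how two aligners can agree on the score yet report an indel at a different position, here is a minimal sketch of a textbook Needleman-Wunsch with a linear gap penalty. It is not MapCaller's or ksw2's code: the function name `needleman_wunsch`, the scoring values, and the `tie_order` parameter are all hypothetical and chosen only for this example. When several alignments share the optimal score (for instance a deletion inside a homopolymer run), the traceback tie-breaking rule alone decides where the gap lands. Tie-breaking is just one plausible source of such differences.

```python
def needleman_wunsch(ref, read, match=1, mismatch=-1, gap=-2,
                     tie_order=("diag", "del", "ins")):
    """Global alignment with a linear gap penalty (illustrative sketch).

    tie_order is a hypothetical knob: the order in which traceback moves
    are tried when more than one of them explains the optimal score.
    """
    n, m = len(ref), len(read)
    # S[i][j] = best score for aligning ref[:i] against read[:j]
    S = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        S[i][0] = i * gap
    for j in range(1, m + 1):
        S[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if ref[i - 1] == read[j - 1] else mismatch
            S[i][j] = max(S[i - 1][j - 1] + sub,   # match / mismatch
                          S[i - 1][j] + gap,       # deletion in the read
                          S[i][j - 1] + gap)       # insertion in the read

    # Traceback from the bottom-right corner, trying moves in tie_order.
    out_ref, out_read = [], []
    i, j = n, m
    while i > 0 or j > 0:
        for move in tie_order:
            if move == "diag" and i > 0 and j > 0:
                sub = match if ref[i - 1] == read[j - 1] else mismatch
                if S[i][j] == S[i - 1][j - 1] + sub:
                    out_ref.append(ref[i - 1])
                    out_read.append(read[j - 1])
                    i, j = i - 1, j - 1
                    break
            elif move == "del" and i > 0 and S[i][j] == S[i - 1][j] + gap:
                out_ref.append(ref[i - 1])
                out_read.append("-")
                i -= 1
                break
            elif move == "ins" and j > 0 and S[i][j] == S[i][j - 1] + gap:
                out_ref.append("-")
                out_read.append(read[j - 1])
                j -= 1
                break
        else:
            raise RuntimeError("no traceback move matched the score matrix")
    return S[n][m], "".join(reversed(out_ref)), "".join(reversed(out_read))


if __name__ == "__main__":
    ref, read = "ACGTTTTACG", "ACGTTTACG"   # the read lacks one T of the run
    for order in (("diag", "del", "ins"), ("del", "ins", "diag")):
        score, a_ref, a_read = needleman_wunsch(ref, read, tie_order=order)
        print(order, score, a_ref, a_read)
    # ('diag', 'del', 'ins') 7 ACGTTTTACG ACG-TTTACG
    # ('del', 'ins', 'diag') 7 ACGTTTTACG ACGTTT-ACG
```

Both runs reach the same optimal score of 7, but the single-base deletion is reported at opposite ends of the T run, so a variant caller would see the indel at different coordinates unless it normalizes (for example, left-aligns) indels afterwards.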

tseemann commented 4 years ago

Are both SSE4 accelerated, or just ksw2 ?

hsinnan75 commented 4 years ago

According to the ksw2 documentation, I think it is both. I quote it below.

[It provides implementations using SSE2 and SSE4.1 intrinsics based on Hajime Suzuki's diagonal formulation which enables 16-way SSE parallelization for the most part of the inner loop, regardless of the maximum score of the alignment.]

The alignment is accelerated by both the diagonal formulation and the 16-way SSE parallelization. Another project of mine (GSAlign) also uses the ksw algorithm for genome sequence alignment, and it is very efficient, much faster than the NW algorithm.
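As a rough illustration of why a diagonal formulation helps vectorization, the sketch below fills the same global-alignment score matrix twice: once row by row, and once along anti-diagonals (all cells with the same i + j). Each cell on an anti-diagonal depends only on the two previous anti-diagonals, so the cells of one anti-diagonal are independent of each other and could in principle be computed in parallel SIMD lanes. This is plain Python under assumed scoring values, with hypothetical function names; it does not reproduce the Suzuki-Kasahara difference recurrence or any SSE intrinsics from ksw2.

```python
def nw_score_rowwise(a, b, match=1, mismatch=-1, gap=-2):
    """Reference fill order: row by row (each cell needs its left neighbour)."""
    n, m = len(a), len(b)
    S = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        S[i][0] = i * gap
    for j in range(1, m + 1):
        S[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            S[i][j] = max(S[i - 1][j - 1] + sub,
                          S[i - 1][j] + gap,
                          S[i][j - 1] + gap)
    return S[n][m]


def nw_score_antidiagonal(a, b, match=1, mismatch=-1, gap=-2):
    """Same recurrence, filled one anti-diagonal (i + j == d) at a time."""
    n, m = len(a), len(b)
    S = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        S[i][0] = i * gap
    for j in range(1, m + 1):
        S[0][j] = j * gap
    for d in range(2, n + m + 1):
        # Interior cells on this anti-diagonal: 1 <= i <= n and 1 <= j <= m.
        i_lo, i_hi = max(1, d - m), min(n, d - 1)
        # The iterations of this inner loop do not depend on each other:
        # they read only anti-diagonals d-1 and d-2. That independence is
        # what a SIMD implementation exploits.
        for i in range(i_lo, i_hi + 1):
            j = d - i
            sub = match if a[i - 1] == b[j - 1] else mismatch
            S[i][j] = max(S[i - 1][j - 1] + sub,
                          S[i - 1][j] + gap,
                          S[i][j - 1] + gap)
    return S[n][m]


if __name__ == "__main__":
    pairs = [("ACGTTTTACG", "ACGTTTACG"), ("GATTACA", "GCATGCT"), ("A", "")]
    for a, b in pairs:
        assert nw_score_rowwise(a, b) == nw_score_antidiagonal(a, b)
    print(nw_score_antidiagonal("ACGTTTTACG", "ACGTTTACG"))  # 7
```

The two fill orders always give the same score. Only the traversal order changes, which is what makes the inner loop amenable to lane-parallel execution.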