lh3 / bwa

Burrow-Wheeler Aligner for short-read alignment (see minimap2 for long-read alignment)
GNU General Public License v3.0

bwa mem 20% speedup with vector instructions (Fixed intel-extend branch) #226

Open zamaudio opened 6 years ago

zamaudio commented 6 years ago

Using the -F flag that this patch provides, bwa mem finishes in about 80% of the time with SSE4 (more speedup with AVX).

zamaudio commented 6 years ago

Time taken for a whole exome on various versions of BWA:

| bwa | 0.7.13 | 0.7.13+SSE4 (with flag -f) | master | 0.7.17+SSE4 (no flag) | 0.7.17+SSE4 (with flag -F) |
| --- | --- | --- | --- | --- | --- |
| time | 27m57.428s | 20m20.202s | 25m1.158s | 23m39.184s | 20m1.847s |
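
To make the relative runtimes explicit, here is a minimal Python sketch that converts the timings above to seconds and divides each patched build by a baseline. The helper names (`to_seconds`, `relative_runtime`) are made up for illustration, and pairing 0.7.13+SSE4 with 0.7.13 and 0.7.17+SSE4 with master is an assumption about which builds are meant to be compared, not something stated in the thread.

```python
import re

# Wall-clock times copied from the timings above, keyed by build label.
TIMES = {
    "0.7.13": "27m57.428s",
    "0.7.13+SSE4 (with flag -f)": "20m20.202s",
    "master": "25m1.158s",
    "0.7.17+SSE4 (no flag)": "23m39.184s",
    "0.7.17+SSE4 (with flag -F)": "20m1.847s",
}


def to_seconds(stamp: str) -> float:
    """Convert a 'XmY.ZZZs' string (as printed by `time`) to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time format: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def relative_runtime(build: str, baseline: str) -> float:
    """Return the runtime of `build` as a fraction of `baseline`'s runtime."""
    return to_seconds(TIMES[build]) / to_seconds(TIMES[baseline])


if __name__ == "__main__":
    # Assumed pairings: each patched build against its unpatched counterpart.
    pairs = [
        ("0.7.13+SSE4 (with flag -f)", "0.7.13"),
        ("0.7.17+SSE4 (no flag)", "master"),
        ("0.7.17+SSE4 (with flag -F)", "master"),
    ]
    for build, baseline in pairs:
        ratio = relative_runtime(build, baseline)
        print(f"{build} vs {baseline}: {ratio:.2f} of baseline runtime")
```

Under that pairing, the 0.7.17+SSE4 build with -F comes out at roughly 0.80 of master's runtime, which matches the "80% of the time" figure quoted earlier.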