BenLangmead / bowtie2

A fast and sensitive gapped read aligner
GNU General Public License v3.0
639 stars, 159 forks

Minor optimization of SwAligner::alignNucleotidesEnd2EndSse* #404

Status: Closed (sfiligoi closed 1 year ago)

sfiligoi commented 1 year ago

Minor code optimization of SwAligner::alignNucleotidesEnd2EndSseU8 and SwAligner::alignNucleotidesEnd2EndSseI16. The change loads pvScore explicitly and rearranges the load order to better account for SSE latency. It also avoids using object globals in the inner loop.
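To illustrate the general idea, here is a minimal sketch. It is not the actual bowtie2 code: ToyAligner, its fields, and both methods are hypothetical, and it uses scalar arithmetic rather than SSE intrinsics so that it stays portable. It also assumes that "object globals" means member variables. The second method copies the members into locals before the loop and issues the loads for the next iteration before computing on the current values, which is the kind of reordering described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an aligner object; NOT bowtie2's SwAligner.
struct ToyAligner {
    std::vector<std::int16_t> profile;  // per-row scores (illustrative)
    std::vector<std::int16_t> prev;     // previous-column scores (illustrative)
    std::int16_t gapPenalty = 2;
    std::size_t rows = 0;

    // Baseline: the loop body reads every value through the object's members.
    std::int32_t sumMembers() const {
        std::int32_t total = 0;
        for (std::size_t i = 0; i < rows; ++i) {
            std::int32_t s = prev[i] + profile[i] - gapPenalty;
            total += s > 0 ? s : 0;  // clamp negative scores to zero
        }
        return total;
    }

    // Variant: members are copied to locals once, and the loads for row i+1
    // are issued before the arithmetic on row i.
    std::int32_t sumLocals() const {
        const std::int16_t* p = profile.data();
        const std::int16_t* v = prev.data();
        const std::int16_t gap = gapPenalty;
        const std::size_t n = rows;
        if (n == 0) return 0;

        std::int32_t total = 0;
        std::int16_t curPrev = v[0];
        std::int16_t curProf = p[0];
        for (std::size_t i = 0; i < n; ++i) {
            // Load the next row's inputs first (zero past the end).
            std::int16_t nextPrev = 0;
            std::int16_t nextProf = 0;
            if (i + 1 < n) {
                nextPrev = v[i + 1];
                nextProf = p[i + 1];
            }
            // Then compute on the values loaded in the previous iteration.
            std::int32_t s = curPrev + curProf - gap;
            total += s > 0 ? s : 0;
            curPrev = nextPrev;
            curProf = nextProf;
        }
        return total;
    }
};

// Sanity check: both variants must produce the same result.
inline void checkVariantsAgree(const ToyAligner& a) {
    assert(a.sumMembers() == a.sumLocals());
}
```

Whether a rewrite like this helps depends on the compiler and the target CPU, since optimizers often hoist member loads on their own when they can prove nothing aliases them. Any gain should be confirmed by measurement on the real workload.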

sfiligoi commented 1 year ago

I measured a speedup of about 5% from this change.

sfiligoi commented 1 year ago

CC @ch4rr0