Closed sfiligoi closed 1 year ago
Minor code optimization of SwAligner::alignNucleotidesEnd2EndSseU8 and SwAligner::alignNucleotidesEnd2EndSseI16. Explicitly load pvScore and rearrange the load order to better account for SSE latency. Also avoid using object globals in the inner loop.
I measure about 5% speedup due to this change.
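For readers unfamiliar with the pattern, here is a minimal sketch of the two ideas, not the actual patch. `ToyAligner`, its fields, and both method names are hypothetical, and I am reading "object globals" as member variables reached through `this`. The first method re-reads members inside the loop; the second copies them into locals and issues both loads before the arithmetic that uses them. Both compute the same result; whether the second is faster depends on the compiler and CPU.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cstddef>
#include <cstdint>
#include <vector>

// Largest of the 16 unsigned bytes in v.
static std::uint8_t horizontalMax(__m128i v) {
    std::uint8_t lanes[16];
    _mm_storeu_si128(reinterpret_cast<__m128i*>(lanes), v);
    std::uint8_t best = 0;
    for (std::uint8_t x : lanes) {
        if (x > best) best = x;
    }
    return best;
}

// Hypothetical stand-in for the aligner object; not the real SwAligner.
struct ToyAligner {
    std::vector<std::uint8_t> profile;  // per-cell scores to add
    std::vector<std::uint8_t> score;    // running scores (plays the role of pvScore)
    std::uint8_t gapPenalty;

    // Baseline: members are accessed through `this` on every iteration,
    // and the second load is nested inside the add that consumes it.
    std::uint8_t stepUsingMembers() {
        __m128i vMax = _mm_setzero_si128();
        for (std::size_t i = 0; i + 16 <= score.size(); i += 16) {
            __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(&score[i]));
            v = _mm_adds_epu8(
                v, _mm_loadu_si128(reinterpret_cast<const __m128i*>(&profile[i])));
            v = _mm_subs_epu8(v, _mm_set1_epi8(static_cast<char>(gapPenalty)));
            _mm_storeu_si128(reinterpret_cast<__m128i*>(&score[i]), v);
            vMax = _mm_max_epu8(vMax, v);
        }
        return horizontalMax(vMax);
    }

    // Variant: members are copied into locals once, and both loads are
    // issued explicitly before any arithmetic depends on them.
    std::uint8_t stepUsingLocals() {
        const std::uint8_t* pProfile = profile.data();
        std::uint8_t* pScore = score.data();
        const std::size_t n = score.size();
        const __m128i vGap = _mm_set1_epi8(static_cast<char>(gapPenalty));
        __m128i vMax = _mm_setzero_si128();
        for (std::size_t i = 0; i + 16 <= n; i += 16) {
            const __m128i vScore =
                _mm_loadu_si128(reinterpret_cast<const __m128i*>(pScore + i));
            const __m128i vProf =
                _mm_loadu_si128(reinterpret_cast<const __m128i*>(pProfile + i));
            __m128i v = _mm_adds_epu8(vScore, vProf);  // saturating unsigned add
            v = _mm_subs_epu8(v, vGap);                // saturating unsigned subtract
            _mm_storeu_si128(reinterpret_cast<__m128i*>(pScore + i), v);
            vMax = _mm_max_epu8(vMax, v);
        }
        return horizontalMax(vMax);
    }
};
```

The locals matter because the store into `score` can make the compiler assume the members may have changed, forcing reloads each iteration; with local copies it is free to keep them in registers.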
CC @ch4rr0