Closed jeffdaily closed 2 years ago
Correct me if I'm wrong, but shouldn't the vMaxColumn vector get updated in the 16-bit Lazy_F loop? It gets updated in the 8-bit version, so perhaps this is missing?
diff --git a/src/ssw.c b/src/ssw.c
index fe4fb81..0c42b99 100644
--- a/src/ssw.c
+++ b/src/ssw.c
@@ -470,6 +470,7 @@ static alignment_end* sw_sse2_word (const int8_t* ref,
 			for (j = 0; LIKELY(j < segLen); ++j) {
 				vH = _mm_load_si128(pvHStore + j);
 				vH = _mm_max_epi16(vH, vF);
+				vMaxColumn = _mm_max_epi16(vMaxColumn, vH);
 				_mm_store_si128(pvHStore + j, vH);
 				vH = _mm_subs_epu16(vH, vGapO);
 				vF = _mm_subs_epu16(vF, vGapE);
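To illustrate why the extra line matters, here is a minimal, self-contained sketch of a simplified Lazy_F pass. It is not the code in src/ssw.c: the helper names (`hmax_epi16`, `lazy_f_column_max`, `demo_column_max`) and the toy values are assumptions made for this example, and the gap-open subtraction and early-exit test of the real loop are left out. It only shows that when vF raises a stored H value, the column maximum goes stale unless it is refreshed inside the loop.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Horizontal maximum of the eight signed 16-bit lanes (hypothetical helper). */
static int16_t hmax_epi16(__m128i v)
{
    int16_t lanes[8];
    _mm_storeu_si128((__m128i *)lanes, v);
    int16_t best = lanes[0];
    for (int i = 1; i < 8; ++i)
        if (lanes[i] > best)
            best = lanes[i];
    return best;
}

/*
 * Simplified Lazy_F correction over seg_len segments (hypothetical helper).
 * With track_max != 0 the column maximum is refreshed after each H update,
 * as in the patch; with track_max == 0 it keeps the value it had on entry.
 */
static int16_t lazy_f_column_max(__m128i *h_store, int seg_len,
                                 __m128i f, __m128i gap_ext,
                                 __m128i max_column, int track_max)
{
    for (int j = 0; j < seg_len; ++j) {
        __m128i h = _mm_load_si128(h_store + j);
        h = _mm_max_epi16(h, f);            /* F may raise H */
        if (track_max)
            max_column = _mm_max_epi16(max_column, h);
        _mm_store_si128(h_store + j, h);
        f = _mm_subs_epu16(f, gap_ext);     /* extend the gap by one step */
    }
    return hmax_epi16(max_column);
}

/* Toy column: H holds 3 and 5, the main loop saw a maximum of 5, F starts at 9. */
static int16_t demo_column_max(int track_max)
{
    __m128i h_store[2];                     /* __m128i arrays are 16-byte aligned */
    h_store[0] = _mm_set1_epi16(3);
    h_store[1] = _mm_set1_epi16(5);
    __m128i max_column = _mm_set1_epi16(5);
    __m128i f = _mm_set1_epi16(9);
    __m128i gap_ext = _mm_set1_epi16(1);
    return lazy_f_column_max(h_store, 2, f, gap_ext, max_column, track_max);
}
```

In this toy column the pass rewrites the stored H values to 9 and 8. `demo_column_max(1)` picks up the 9, while `demo_column_max(0)` still reports the stale 5.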
Thank you. I've added your fix.