GCC compilation - all procedures (except AVX2-wide) report the same reference result:
naive scalar ... reference result = 8108076510, time = 6.299609 s
std::strstr ... reference result = 8108076510, time = 0.659882 s
SWAR 64-bit (generic) ... reference result = 8108076510, time = 1.446615 s
SWAR 32-bit (generic) ... reference result = 8108076510, time = 2.529733 s
SSE2 (generic) ... reference result = 8108076510, time = 0.498816 s
SSE4.1 (MPSADBW) ... reference result = 8108076510, time = 0.640781 s
SSE4.1 (MPSADBW unrolled) ... reference result = 8108076510, time = 0.961995 s
SSE4.2 (PCMPESTRM) ... reference result = 8108076510, time = 1.373412 s
SSE (naive) ... reference result = 8108076510, time = 1.960058 s
AVX2 (MPSADBW) ... reference result = 8108076510, time = 0.578520 s
AVX2 (generic) ... reference result = 8108076510, time = 0.374598 s
AVX2 (naive) ... reference result = 8108076510, time = 1.147053 s
AVX2 (naive unrolled) ... reference result = 8108076510, time = 0.795070 s
AVX2-wide (naive) ... reference result = 8107771150, time = 0.541654 s
Clang compilation - the SSE4.1 MPSADBW variants (and, as with GCC, AVX2-wide) report different reference results:
naive scalar ... reference result = 8108076510, time = 6.293796 s
std::strstr ... reference result = 8108076510, time = 0.660113 s
SWAR 64-bit (generic) ... reference result = 8108076510, time = 1.334720 s
SWAR 32-bit (generic) ... reference result = 8108076510, time = 2.518706 s
SSE2 (generic) ... reference result = 8108076510, time = 0.489896 s
SSE4.1 (MPSADBW) ... reference result = 5713208130, time = 1.787850 s
SSE4.1 (MPSADBW unrolled) ... reference result = 7962617290, time = 0.985689 s
SSE4.2 (PCMPESTRM) ... reference result = 8108076510, time = 1.448608 s
SSE (naive) ... reference result = 8108076510, time = 1.946516 s
AVX2 (MPSADBW) ... reference result = 8108076510, time = 0.694087 s
AVX2 (generic) ... reference result = 8108076510, time = 0.353279 s
AVX2 (naive) ... reference result = 8108076510, time = 1.054814 s
AVX2 (naive unrolled) ... reference result = 8108076510, time = 0.795445 s
AVX2-wide (naive) ... reference result = 8107771150, time = 0.577752 s