WojciechMula / sse4-strstr

SIMD (SWAR/SSE/SSE4/AVX2/AVX512F/ARM Neon) implementations of a modified Karp-Rabin algorithm
http://0x80.pl/articles/simd-strfind.html
BSD 2-Clause "Simplified" License
239 stars, 29 forks

Present Day Performance Advantage #12

Open victorstewart opened 4 years ago

victorstewart commented 4 years ago

If I read your latest benchmarks correctly...

https://github.com/WojciechMula/sse4-strstr/blob/master/results/cascadelake-Gold-5217-gcc-7.4.0-avx512bw.txt

...nowadays there's no advantage to seeking std::strstr alternatives unless you're using AVX512F or AVX512BW?

I assume that's because all the standard libraries have been made AVX2-aware? I poked around in glibc earlier and could see AVX2 usage for memcmp, memcpy, etc., but not for strstr, only SSE4.2. Maybe the handwritten SSE4.2 assembly is just that good?

WojciechMula commented 3 years ago

I haven't checked recent versions of the libc shipped with gcc or clang, but I guess they haven't sped things up significantly. And from what I can gather, their strstr never used SIMD instructions. But that's not a problem, because the standard procedures have decent performance. Please bear in mind that the SSE4.2 string instructions are rather slow.

So, if you are seeking better performance, AVX2 won't help much.
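Since the answer depends on the libc and CPU in use, one way to settle it for a given machine is to time std::strstr against a candidate replacement. The following is only a rough sketch under stated assumptions: `find_first_last` and `time_search_ns` are hypothetical names, not part of this repository, and the scalar first/last-byte filter merely stands in for whichever alternative is being evaluated.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for an alternative substring search (not code from
// this repository): a portable scalar filter that checks the first and last
// byte of the needle before confirming the match with memcmp.
inline const char* find_first_last(const char* hay, std::size_t n,
                                   const char* needle, std::size_t k) {
    if (k == 0) return hay;        // same convention as strstr for an empty needle
    if (k > n) return nullptr;
    const char first = needle[0];
    const char last = needle[k - 1];
    for (std::size_t i = 0; i + k <= n; ++i) {
        if (hay[i] == first && hay[i + k - 1] == last &&
            std::memcmp(hay + i, needle, k) == 0) {
            return hay + i;
        }
    }
    return nullptr;
}

// Crude timing helper: calls `search` `reps` times and returns the average
// wall-clock time per call in nanoseconds.
template <typename Search>
double time_search_ns(Search search, int reps) {
    assert(reps > 0);
    const char* volatile sink = nullptr;  // every result is stored to a volatile
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < reps; ++r) sink = search();
    const auto stop = std::chrono::steady_clock::now();
    (void)sink;
    return std::chrono::duration<double, std::nano>(stop - start).count() / reps;
}
```

A comparison could then look like `time_search_ns([&] { return std::strstr(hay, needle); }, 1000)` against `time_search_ns([&] { return find_first_last(hay, n, needle, k); }, 1000)`. This harness is deliberately minimal: a serious measurement would vary the inputs, warm up the caches, and repeat the runs, as the benchmark results in the repository presumably do.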