microsoft / STL

MSVC's implementation of the C++ Standard Library.
Other
9.88k stars 1.45k forks source link

Vectorize `std::search` of 1 and 2 bytes elements with `pcmpestr*` #4745

Open AlexGuteniev opened 4 days ago

AlexGuteniev commented 4 days ago

Different approach for both search and inner comparison (SSE4.2 instead of AVX2). This time the results are better.

For now 1 and 2 bytes element only. The same slightly modified approach can be used for 4 and 8 bytes elements, but need to test if there would be still a performance gain.

In benchmark results 0 is small needle, 1 is large needle.

Benchmark main ths
c_strstr/0 186 ns 184 ns
c_strstr/1 213 ns 213 ns
classic_search/0 2045 ns 270 ns
classic_search/1 2221 ns 302 ns
classic_search/0 1588 ns 531 ns
classic_search/1 1766 ns 586 ns
ranges_search/0 1748 ns 268 ns
ranges_search/1 1989 ns 306 ns
ranges_search/0 1673 ns 585 ns
ranges_search/1 1843 ns 600 ns
search_default_searcher/0 1494 ns 269 ns
search_default_searcher/1 1626 ns 309 ns
search_default_searcher/0 2002 ns 528 ns
search_default_searcher/1 2286 ns 599 ns
AlexGuteniev commented 4 days ago

The previous attempt was #4654 and it ended up being just memcmp removal; see https://github.com/microsoft/STL/pull/4654#issuecomment-2159504811

StephanTLavavej commented 4 days ago

:warning: Note to self, check the benchmarks on my machine.

AlexGuteniev commented 3 days ago

Large needles are expected to be not much worse performance, but there would be a different branch with a somewhat different approach. I can do them in this PR and not subsequent PR, if you prefer that.

StephanTLavavej commented 1 day ago

This PR isn't too massive, I think it would make sense to add large needle logic here.

AlexGuteniev commented 15 hours ago

This PR isn't too massive, I think it would make sense to add large needle logic here.

Updated. Can raise the bet even more by adding find_end or adding search_n or by trying to make 4 and 8 byte elements (although I'm skeptical on larger elements).

StephanTLavavej commented 13 hours ago

/azp run

azure-pipelines[bot] commented 13 hours ago
Azure Pipelines successfully started running 2 pipeline(s).