Closed timblechmann closed 3 years ago
This library is intrinsically coupled to boost; what are the benefits from switching from boost to std::regex? Std::regex is known for poor performance, though this shouldn't be an issue in a bind call, which is rarely a 'fast-path' operation.
what are the benefits from switching from boost to std::regex.
main benefit is binary size. shared libs of boost.regex are hundreds of kb, where std::regex is contained in the runtime library already.
Std::regex is known for poor performance
out of curiosity: do you have any reference for this?
I've seen std::regex mentioned in a few blog posts as an example of something that needs improvement but can't be changed because it would break the ABI, which is a real shame. I don't think the boost library 'suffers' from the ABI compatibility constraint, so it can fix things like this.
One reference is this, but it's light on detail wrt regex https://cor3ntin.github.io/posts/abi/
I've run a quick benchmark this morning for fun. https://github.com/aboseley/benchmark-regex
BM_MatchWithBoost 3786 ns 3785 ns 184163
BM_MatchWithStd 49727 ns 49704 ns 14020
out of curiosity: which compiler/stl implementation do you use?
appleclang-11/libcxx seems to be a little slower, but it's nowhere near as extreme as what you're seeing
BM_MatchWithBoost 4793 ns 4791 ns 108551
BM_MatchWithStd 5671 ns 5670 ns 117849
g++ (gcc) 11.1.0 on linux
at the risk of comparing apples with oranges, i've tweaked your benchmark program to cache the compiled regular expressions (by making them static) ... after all, we don't need to re-compile the regex every time we want to match a string.
in this case std is actually faster than boost (gcc-10/linux)
BM_MatchWithBoost 99.3 ns 99.3 ns 6879716
BM_MatchWithStd 88.8 ns 88.8 ns 7841490
so it seems that constructing the regex fsm from the string is faster in boost, but evaluating the fsm is faster in the stl
lies, statistics and benchmarks.
switching to libc++ and clang
------------------------------------------------------------
Benchmark Time CPU Iterations
------------------------------------------------------------
BM_MatchWithBoost 2902 ns 2900 ns 240455
BM_MatchWithStd 997 ns 996 ns 69198
Honestly the speed of this code isn't that much of an issue. I can't imagine the speed of a bind call being an issue
If nothing else this PR reduces the binary size again a bit further
Thank you for the PR
we can reduce the dependency on the boost.regex library by using std::regex and friends.
furthermore we can cache the compiled regex instead of building it every time
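A minimal sketch of the caching idea, assuming a function-local static is an acceptable place to keep the compiled expression. The function name `looks_like_tcp_endpoint` and the pattern are hypothetical and only illustrate the technique; they are not taken from this library.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: name and pattern are illustrative assumptions,
// not the library's actual endpoint parsing.
bool looks_like_tcp_endpoint(const std::string& endpoint) {
    // Function-local static: the regex is compiled once, on the first call
    // (initialization is thread-safe since C++11), and reused afterwards.
    static const std::regex pattern(R"(tcp://([^:/]+):(\d+))");

    // regex_match requires the whole string to match the pattern.
    return std::regex_match(endpoint, pattern);
}
```

Calling `looks_like_tcp_endpoint("tcp://127.0.0.1:5555")` repeatedly then only pays the matching cost, which is the case measured in the cached benchmark numbers above.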