Closed ritchie46 closed 1 year ago
IMO, if you can use aho-corasick and you're otherwise not already using the regex crate, then you probably should use aho-corasick. Reasons:
aho-corasick is just matching literals. regex has a lot of code for handling the much more general case.
aho-corasick is going to build its searcher much more quickly than regex. I hope to fix most of this in the not so distant future, but the regex crate is always going to have some kind of additional overhead. Today, it's quite a bit more than it needs to be.
Would that be a certain number of inputs or would this algorithm always be preferable?
If you just have a regex like foo|bar|...|quux, then the regex crate will likely just use this crate.
But, you should always benchmark your specific use case. If you do have a case where regex is faster than aho-corasick, that would be very interesting and I should like to hear about it.
Promised. Will do! Thanks for the explanation.
At which point would this crate be preferred over writing a regex union in the Regex crate? Would that be a certain number of inputs or would this algorithm always be preferable?