oceanlewis / uap-rs

User Agent Parser for Rust
MIT License
42 stars 17 forks

Replace fancyregex with regex #7

Closed jan-auer closed 2 years ago

jan-auer commented 2 years ago

Turns out that the plain regex crate can handle all regexes of uap-core. From what I could see, fancyregex tries to forward as much as possible to a plain regex::Regex in such cases, so it's possible to get rid of the dependency. As part of the refactor, I was also able to simplify a few double-references.

There is one difference: the regex crate rejects meaningless escape sequences. uap-core contains some of these, both in the referenced version 0.10 and in the latest 0.15. I've added a clean_escapes method that removes such escapes if they occur.

To clean invalid escapes I chose a regex, since we already depend on the regex crate. There are more elaborate manual ways to write this that would gain a little performance, but I opted for less code.
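
For illustration, here is a rough sketch of the manual variant mentioned above, not the regex-based version in this PR. It uses only std so it runs without extra dependencies. The set of characters treated as meaningless escapes (space, `!`, `/`) is an assumption made for the example, and the function body is hypothetical; only the name clean_escapes comes from the PR.

```rust
use std::borrow::Cow;

/// Characters whose escaped form is assumed to be meaningless
/// (hypothetical set, chosen for illustration only).
const MEANINGLESS: [char; 3] = [' ', '!', '/'];

/// Removes the backslash in front of characters listed in `MEANINGLESS`.
/// All other escapes (for example `\d`, `\.`, `\\`) are left untouched.
fn clean_escapes(pattern: &str) -> Cow<'_, str> {
    // Fast path: a pattern without any backslash cannot contain escapes.
    if !pattern.contains('\\') {
        return Cow::Borrowed(pattern);
    }

    let mut out = String::with_capacity(pattern.len());
    let mut chars = pattern.chars();
    let mut changed = false;

    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        // `c` is a backslash: look at the character it escapes.
        match chars.next() {
            // Meaningless escape: keep the character, drop the backslash.
            Some(next) if MEANINGLESS.contains(&next) => {
                out.push(next);
                changed = true;
            }
            // Any other escape is copied through as a pair, so an escaped
            // backslash never causes the following character to be unescaped.
            Some(next) => {
                out.push('\\');
                out.push(next);
            }
            // Trailing backslash: leave it for the regex parser to report.
            None => out.push('\\'),
        }
    }

    if changed {
        Cow::Owned(out)
    } else {
        Cow::Borrowed(pattern)
    }
}
```

Under these assumptions, calling clean_escapes on `Foo\/Bar\ (\d+)` yields `Foo/Bar (\d+)`, while a pattern such as `\d+\.\d+` is returned borrowed and unchanged. Returning a Cow avoids allocating for the common case where nothing needs cleaning.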

jan-auer commented 2 years ago

I will have to play with that in a follow-up, but this move would allow us to use RegexSet, see the example here.

Edit: RegexSet is not an option, it is substantially slower than the current implementation.

jan-auer commented 2 years ago

I also noticed that we can use RegexBuilder methods in device::Matcher::try_from. Let me know if you'd like me to refactor that method, too.

oceanlewis commented 2 years ago

Appreciate this, thank you! And thanks for looking into RegexSet as an alternative.