Open Calorion opened 3 months ago
I see that this has been requested before.
Here are the differences from PCRE2 that I've run into:
No support for \K.
No support for conditionals.
Does support bounded quantifiers (such as ? and {2,5}) in lookbehind.
Does not support recursion (?R) (haven't run into this one, but Wikipedia lists it).
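For anyone porting a pattern that relies on \K, here is a minimal sketch of two common rewrites. It uses Python's re module purely as a stand-in (it also lacks \K). The sample text and pattern are made up for illustration, and I haven't verified this against ICU itself, though a capture group and a fixed-width lookbehind are both ordinary constructs there.

```python
import re

text = "price: 42 USD"  # hypothetical sample input

# PCRE2 would allow r"price: \K\d+": match "price: " but report only the digits.
# Without \K, a capture group yields the same substring...
match = re.search(r"price: (\d+)", text)
captured = match.group(1)

# ...and so does a fixed-width lookbehind, which keeps the whole match clean.
looked = re.search(r"(?<=price: )\d+", text).group(0)

print(captured, looked)
```

The capture-group form changes group numbering, so the lookbehind form is usually the closer drop-in when the prefix has a bounded length.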
These haven't caused issues for me, but they are differences.
Doesn't support the g flag, because there is no non-global mode. Ditto u.
Doesn't support the U, A, J, or D flags.
Supports the w flag:
UREGEX_UWORD Controls the behavior of \b in a pattern. If set, word boundaries are found according to the definitions of word found in Unicode UAX 29, Text Boundaries. By default, word boundaries are identified by means of a simple classification of characters as either “word” or “non-word”, which approximates traditional regular expression behavior. The results obtained with the two options can be quite different in runs of spaces and other non-word characters.
\p{punct} differs in what it matches. Java matches any of !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~. From that list, ICU omits $+<=>^`|~
ICU follows the recommendations from Unicode UTS-18, http://www.unicode.org/reports/tr18/#Compatibility_Properties. See also https://unicode-org.atlassian.net/browse/ICU-20095.
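The omitted characters aren't arbitrary. As a rough check, assuming ICU's \p{punct} tracks the Unicode General_Category "Punctuation" as UTS-18 recommends, splitting the ASCII list by general category reproduces the same nine characters. This sketch uses Python's unicodedata rather than ICU, so it illustrates the Unicode classification and is not a test of ICU's behavior.

```python
import string
import unicodedata

# The ASCII punctuation list quoted above is the same set as string.punctuation.
ascii_punct = string.punctuation

# Split by Unicode General_Category: P* is punctuation, S* is symbol.
punctuation = [c for c in ascii_punct if unicodedata.category(c).startswith("P")]
symbols = [c for c in ascii_punct if unicodedata.category(c).startswith("S")]

print("punctuation:", "".join(punctuation))
print("symbols:    ", "".join(symbols))
```

The characters in the symbols group are the ones ICU leaves out, so a pattern that needs the Java set would have to list them explicitly or add \p{S} to the class.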
Flavor Request
Please support ICU. This is the format supported natively by Apple devices, and is used in, e.g., Siri Shortcuts.