sjshuck / hs-pcre2

Complete Haskell binding to PCRE2
Apache License 2.0

Unicode is not supported #21

Closed by sjshuck 2 years ago

sjshuck commented 2 years ago
Prelude Text.Regex.Pcre2> Text.Regex.Pcre2.matchOpt Ucp "\\d" "\x2460" -- ①
*** Exception: pcre2_compile: this version of PCRE2 does not have Unicode support
                    \d
                    ^

I thought it did: package.yaml specifies -optc=-DSUPPORT_UNICODE=1. I'm not sure this counts as a bug; the docs say "UTF-16" but never actually say Unicode is supported. Either way, it could be supported, and should be, even if that means using the built-in Unicode support rather than ICU.

sjshuck commented 2 years ago

(The correct behavior would be to return an empty result, because ① is not matched by \d in UCP mode. A better example would have been matching ℵ with \w, since ℵ is a Unicode letter.)
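
For anyone wanting to double-check those expectations without a Unicode-enabled PCRE2 build, here is a rough sketch using only Data.Char from base. It assumes GHC's Unicode tables agree with PCRE2's for these two characters, and that in UCP mode \d means "decimal digit (Nd)" while \w covers letters, numbers and underscore. The helpers ucpDigit and ucpWord are hypothetical stand-ins written for this comment, not part of hs-pcre2 or PCRE2.

```haskell
import Data.Char (GeneralCategory (..), generalCategory, isLetter, isNumber)

-- Hypothetical stand-in for \d under UCP: decimal digits (category Nd) only.
ucpDigit :: Char -> Bool
ucpDigit c = generalCategory c == DecimalNumber

-- Hypothetical stand-in for \w under UCP: any letter, any number, or underscore.
-- This is an approximation for illustration, not PCRE2's exact definition.
ucpWord :: Char -> Bool
ucpWord c = isLetter c || isNumber c || c == '_'

main :: IO ()
main = do
  print (generalCategory '\x2460') -- ① CIRCLED DIGIT ONE
  print (ucpDigit '\x2460')        -- would \d be expected to match ①?
  print (generalCategory '\x2135') -- ℵ ALEF SYMBOL
  print (ucpWord '\x2135')         -- would \w be expected to match ℵ?
```

Running this should print OtherNumber, False, OtherLetter, True: ① is a number but not a decimal digit, so \d should not match it, while ℵ is a letter, so \w should.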