Closed david-wahlstedt closed 6 hours ago
This was more than I knew, I will make sure to adjust the parser to handle the cases you have described! Thank you for the great report.
Thanks! I tried pasting in all the binary properties (as listed in the output of pcre2test -LP), and they are all accepted! I noticed that the explanation to the right says "\p{ahex} matches any characters in the ahex script", even though the "full name" of the property is Asciihexdigit; it also doesn't say what the property matches. I tested the matching, though, and it works correctly. I haven't tried matching with the other properties, only parsing.
I've added support for this in the new version. The explanation always reads "\p{...} matches any characters in the ... script". Is there a better way to explain this without too much manual labour?
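One low-effort option, sketched under assumptions: keep a small table recording which names are scripts and which are binary properties, and pick the wording from that. The two tables below are a tiny hypothetical excerpt (a fuller one could probably be generated from the output of pcre2test -LP and -LS), and explain is a made-up helper, not part of any existing codebase.

```python
import re

# Hypothetical excerpts; keys are loosely normalized names and abbreviations.
SCRIPTS = {"greek": "Greek", "grek": "Greek"}
BINARY_PROPERTIES = {
    "ahex": "ASCII_Hex_Digit",
    "asciihexdigit": "ASCII_Hex_Digit",
}


def explain(name: str) -> str:
    """Return an explanation string for \\p{name}, worded by property kind."""
    # Loose matching: ignore case, ASCII whitespace, hyphens, and underscores.
    key = re.sub(r"[ \t\n\r\f\v_-]", "", name).lower()
    if key in BINARY_PROPERTIES:
        full = BINARY_PROPERTIES[key]
        return f"\\p{{{name}}} matches any character with the {full} property"
    if key in SCRIPTS:
        full = SCRIPTS[key]
        return f"\\p{{{name}}} matches any character in the {full} script"
    # Fallback for names the tables do not cover.
    return f"\\p{{{name}}} matches any character with the Unicode property {name}"


print(explain("ahex"))
print(explain("grek"))
```

The fallback wording avoids claiming that an unknown name is a script, which is the misleading part of the current text.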
Thanks! Sounds like a good explanation!
Bug Description
Many Unicode properties are not supported. There are too many to test and report here, and you are probably already aware of them, but here is a list of some of them:
- \p{Greek} is understood, but neither \p{greek} nor \p{g Ree_k-} is. According to the pcre2pattern man page, what's inside \p{ } should be matched "loosely", i.e., case is insignificant, and ASCII whitespace, hyphens, and underscores are stripped away.
- \p{grek} is not recognized, but grek is listed as an abbreviation for greek, according to the output of pcre2test -LS. (PCRE2 10.44 and higher)
- sc:, Script:, and so on, as in \p{sc:armi}, for instance, are not supported. There are labels scx: and script_extensions: as well.
- \p{changEswhenlowercase d} is not accepted. However, it seems that most (maybe all?) binary properties are supported, even with abbreviations!
- \p{bc:al} is not accepted. Also here, loose matching should be applied, so \p{b C=a L } should work, but it doesn't.
- \p{Any} is rejected. Also here the "loose matching" should apply.
Reproduction steps
Enter the examples above in the regex field and observe the syntax errors.
Expected Outcome
No syntax error
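A minimal sketch of the loose matching described above, assuming the rule is exactly as stated in this report (ignore case; strip ASCII whitespace, hyphens, and underscores) and that a label is separated from its value by ":" or "=". The function names are hypothetical, and this is not PCRE2's actual implementation.

```python
import re
from typing import Optional, Tuple


def normalize_property_name(raw: str) -> str:
    """Drop ASCII whitespace, hyphens, and underscores, then lower-case."""
    return re.sub(r"[ \t\n\r\f\v_-]", "", raw).lower()


def split_label(raw: str) -> Tuple[Optional[str], str]:
    """Split a normalized name into (label, value) at the first ':' or '='.

    The label is None when no separator is present, as in Any.
    """
    norm = normalize_property_name(raw)
    match = re.match(r"([^:=]*)[:=](.*)", norm)
    if match:
        return match.group(1), match.group(2)
    return None, norm


# The contents of \p{...} from the examples in this report.
for text in ["greek", "g Ree_k-", "sc:armi", "b C=a L ", "Any"]:
    print(repr(text), "->", split_label(text))
```

With this normalization, greek and g Ree_k- reduce to the same key, and b C=a L reduces to the pair (bc, al), which can then be looked up in whatever table the parser already uses.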
Browser
Chrome and Firefox on Linux, latest versions
OS
Ubuntu 22.04