In the test re00984 the characters U+2308 and U+2309, that is, ⌈ and ⌉, are considered to be word characters, but they're in the set P, punctuation. According to the XML Schema 1.1 spec they should therefore not be considered to be in \w:
[#x0000-#x10FFFF]-[\p{P}\p{Z}\p{C}] (all characters except the set of "punctuation", "separator" and "other" characters)
See for instance here:
https://en.wiktionary.org/wiki/Appendix:Unicode/Miscellaneous_Technical
Or check directly in the UCD's UnicodeData.txt, which lists U+2308 with general category Ps and U+2309 with Pe.
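The same check can be sketched in Python, assuming the interpreter's bundled `unicodedata` module reflects a recent UCD (the categories it reports depend on that version). The helper `is_xsd_word_char` is a hypothetical name for illustration; it only mirrors the quoted definition of \w and is not taken from any schema processor.

```python
import unicodedata


def is_xsd_word_char(ch: str) -> bool:
    """Return True if ch falls outside the P, Z and C general categories.

    This follows the quoted definition of \\w:
    [#x0000-#x10FFFF]-[\\p{P}\\p{Z}\\p{C}]
    """
    # The first letter of the general category is the major class
    # (P = punctuation, Z = separator, C = other).
    return unicodedata.category(ch)[0] not in "PZC"


# Report the category of the two characters used in the test.
for ch in ("\u2308", "\u2309"):
    print(
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        unicodedata.category(ch),
        is_xsd_word_char(ch),
    )
```

If the reported categories are Ps and Pe, the helper returns False for both characters, which is consistent with leaving them out of \w.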
My proposal is to remove these two characters from the test.