Open nikkiwd opened 8 months ago
@nikkiwd Interesting. This is an issue for https://github.com/ad-freiburg/qlever and not for the QLever UI. Any idea how to fix this? We don't do anything special to inhibit this and we weren't aware of this feature so far.
It would be useful to support script properties in regexes, as described at http://www.unicode.org/reports/tr18/tr18-19.html#Script_Property
According to google/re2#234, these should work if RE2 is built against ICU.
Example:
Perl supports:
\p{Hira}
\p{sc=Hira}
\p{scx=Hira}
\p{Hiragana}
\p{sc=Hiragana}
\p{scx=Hiragana}
Of these, the only one which works in QLever is
\p{Hiragana}
(https://qlever.cs.uni-freiburg.de/wikidata/5Zj8W6). Any of the others gives an error like:
Invalid SPARQL query: The regex "\p{sc=Hiragana}" is not supported by QLever (which uses Google's RE2 library). Error from RE2 is: invalid character class range: \p{sc=Hiragana}
In Wikidata's query service, the only ones which are supported are
\p{sc=Hira}
and \p{sc=Hiragana}
(https://w.wiki/7xjr), so supporting those two in particular would make it easier to write queries which work in both places.
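Until something like this is supported natively, one possible client-side workaround is to rewrite the `sc=` forms into the bare long-name form that QLever already accepts before sending the query. The Python sketch below is only an illustration of that idea: the function name `normalize_script_property` and the small `SCRIPT_ALIASES` table are hypothetical and not part of QLever or RE2, and the table covers only a handful of scripts. It leaves `scx=` untouched on purpose, because Script_Extensions is a different property from Script and cannot be mapped to a plain script name without changing what the regex matches.

```python
import re

# Hypothetical, deliberately incomplete table mapping ISO 15924 short codes
# to the long script names used in the bare \p{Name} form.
SCRIPT_ALIASES = {
    "Hira": "Hiragana",
    "Kana": "Katakana",
    "Latn": "Latin",
    "Grek": "Greek",
    "Cyrl": "Cyrillic",
    "Hani": "Han",
}

# Matches \p{sc=Name} and \P{sc=Name}. This simple pattern does not try to
# detect an escaped backslash in front of the "p" (e.g. "\\p{sc=Hira}").
_SCRIPT_PROPERTY = re.compile(r"\\([pP])\{sc=([A-Za-z_]+)\}")


def normalize_script_property(pattern: str) -> str:
    """Rewrite \\p{sc=X} into \\p{LongName}; leave everything else as is."""

    def replace(match: re.Match) -> str:
        kind, name = match.group(1), match.group(2)
        # Unknown names are passed through unchanged apart from dropping "sc=".
        long_name = SCRIPT_ALIASES.get(name, name)
        return "\\" + kind + "{" + long_name + "}"

    return _SCRIPT_PROPERTY.sub(replace, pattern)


if __name__ == "__main__":
    for regex in (r"\p{sc=Hira}", r"\p{sc=Hiragana}", r"\p{scx=Hira}"):
        print(regex, "->", normalize_script_property(regex))
```

This only helps in one direction (a query written for Wikidata's query service being adapted for QLever); it does not replace proper support for the `sc=` syntax in the engine itself.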