Open ZeLonewolf opened 1 year ago
Also note that \u needs a bit more escaping here: \\u
A query for node[place=city][name~"[\u1ebf]"] (with just one backslash) does return two cities that contain this combining character (because editors and imports at the time didn't normalize the text to NFC). Expanding the range to U+0300 through U+036F correctly returns this node.
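To illustrate the normalization point, here is a minimal Python sketch (not Overpass code). The sample string is hypothetical and not taken from the database, and `has_combining_marks` is a helper name made up for this example. It shows that a decomposed name contains characters from U+0300 through U+036F, while its NFC form collapses them into the single code point U+1EBF.

```python
import re
import unicodedata

# Combining Diacritical Marks block, U+0300 through U+036F.
COMBINING_MARKS = re.compile("[\u0300-\u036f]")


def has_combining_marks(name: str) -> bool:
    """Return True if the name contains any character in U+0300..U+036F."""
    return COMBINING_MARKS.search(name) is not None


# Hypothetical sample value: "e" followed by a combining circumflex (U+0302)
# and a combining acute accent (U+0301), i.e. text that was never normalized.
decomposed = "Hue\u0302\u0301"

# NFC composes the three code points into the single character U+1EBF.
precomposed = unicodedata.normalize("NFC", decomposed)

print(has_combining_marks(decomposed))   # True
print(has_combining_marks(precomposed))  # False
print("\u1ebf" in precomposed)           # True
```

A name stored in decomposed form would therefore be found by the U+0300 through U+036F range but not by a search for U+1EBF itself, and vice versa for a normalized name.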
Oh, I just got lucky because the city names happened to contain some of the letters in the hexadecimal numbers in the range. Never mind me.
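A rough Python sketch of why such accidental matches can happen. It assumes the regex engine has no \u escape and reads the backslash literally inside brackets, as POSIX bracket expressions do, with ranges ordered by code point as in the C locale. Whether Overpass behaves exactly this way is an assumption, and the sample names are hypothetical.

```python
import re

# What a \u-aware engine would match: only the code points U+036E and U+036F.
unicode_aware = re.compile("[\u036e-\u036f]")

# Emulation of the literal reading of [\u036E-\u036F]: the class then holds
# "\", "u", "0", "3", "6", the range "E" through "\", and "F".
literal_reading = re.compile(r"[\\u036E-\\u036F]")

# Hypothetical sample names; the last one really contains U+036E.
names = ["Paris", "Lübeck", "graz", "a\u036e"]

for name in names:
    print(
        repr(name),
        "unicode-aware:", bool(unicode_aware.search(name)),
        "literal reading:", bool(literal_reading.search(name)),
    )
```

Under the literal reading, any name containing an uppercase letter from E through Z matches, which would cover most capitalized city names, while the one name that actually contains U+036E does not match.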
So based on U+1EBF, I'm getting the following three place=city nodes (with proper Unicode regex support):
<node id="369487050"/>
<node id="369487099"/>
<node id="3140507587"/>
I note that even with the escaping fixed, I still get (different) nonsensical results:
[out:csv(::id, name)][timeout:2500];
node[place=city][name~"[\\u036E-\\u036F]"];
out;
Right, I noticed the missing backslash when revisiting #332. In the end it doesn't make much of a difference, since the underlying regular expression implementation doesn't handle ranges as expected.
I hope you received a link to a GitHub gist to try out another implementation that works a bit better.
The following query for a range of two consecutive Unicode code points returns 5,747 city nodes; however, none of the returned results actually appears to contain either character.
Queries for each character individually return zero results: