Closed glyn closed 2 years ago
On 28. Jul 2022, at 00:19, Greg Dennis @.***> wrote:
> This brings up an interesting point. JSON requires that UTF-16 characters

(Do you mean beyond-BMP characters?)

> be encoded in \uxxxx pairs.

You do not need to encode beyond-BMP characters, but if you do, you indeed go through UTF-16 surrogate pairs.

> Is the encoding sufficient to determine correct ordering (i.e. does sorting by the encoded strings yield the same result as sorting the unencoded characters)?

No, and that is true of other backslash-encoded characters, too.
You can sort by Unicode Scalar Values. You can also sort by UTF-32 code units (obviously) and by UTF-8 code units (a.k.a. bytes); UTF-8 was designed so that byte-wise order preserves scalar-value order.
Regards, Carsten
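To make the ordering point above concrete, here is a minimal Python sketch. The two sample characters (U+FF5E and U+1F600) and the helper `utf16_units` are illustrative choices, not anything taken from the draft. It compares four sort keys: Unicode scalar value, UTF-8 bytes, UTF-16 code units, and the backslash-escaped JSON text.

```python
import json

# Illustrative samples: one BMP character and one beyond-BMP character.
bmp = "\uff5e"         # U+FF5E, a single UTF-16 code unit
astral = "\U0001f600"  # U+1F600, a surrogate pair (D83D DE00) in UTF-16


def utf16_units(s: str) -> list[int]:
    """Hypothetical helper: return the UTF-16 code units of s."""
    data = s.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


items = [astral, bmp]

# Python compares str values code point by code point.
by_scalar = sorted(items)
# UTF-8 bytes: EF BD 9E sorts before F0 9F 98 80, matching scalar order.
by_utf8 = sorted(items, key=lambda s: s.encode("utf-8"))
# UTF-16 code units: 0xD83D sorts before 0xFF5E, so the order flips.
by_utf16 = sorted(items, key=utf16_units)
# Escaped JSON text (ensure_ascii is on by default): "\ud83d..." < "\uff5e".
by_escaped = sorted(items, key=json.dumps)

# Other backslash escapes are affected too: raw "\n" (U+000A) sorts before
# " " (U+0020), but the escaped text "\n" starts with a backslash (U+005C).
raw_ctrl = sorted([" ", "\n"])
escaped_ctrl = sorted([" ", "\n"], key=json.dumps)
```

Under these assumptions, `by_scalar` and `by_utf8` agree, while `by_utf16` and `by_escaped` put the beyond-BMP character first, so comparing the encoded text is not a substitute for comparing the decoded strings.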
I think there may be room for further fine-tuning but at this point I'd like to merge this and get a nice editors' draft that we can read end-to-end.
Merging. We can fine-tune in a follow-on PR.
And:
(Reviewers may like to view this rendered version.)
The options under consideration in issue 212 were:
`<`, `>`, `<=`, and `>=` false when not comparing two numbers.

This is option 2 with string comparisons.
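One possible reading of this rule, sketched in Python: the ordering operators yield false unless both operands are numbers or, per the string extension, both are strings. The function name `ordered_compare` and the decision to treat booleans as non-numbers are assumptions made for illustration. This is not the draft's normative text.

```python
import operator

# Map each ordering operator to its Python equivalent.
_OPS = {
    "<": operator.lt,
    ">": operator.gt,
    "<=": operator.le,
    ">=": operator.ge,
}


def _is_number(value) -> bool:
    # Assumption: JSON booleans are not numbers, although bool subclasses int.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def ordered_compare(op: str, left, right) -> bool:
    """Hypothetical sketch of the ordering rule described in this PR."""
    both_numbers = _is_number(left) and _is_number(right)
    both_strings = isinstance(left, str) and isinstance(right, str)
    if not (both_numbers or both_strings):
        # Mixed or non-orderable types: every ordering operator is false.
        return False
    # Python compares str by code point, i.e. by Unicode scalar value
    # for well-formed strings.
    return _OPS[op](left, right)
```

For example, `ordered_compare("<", 1, 2)` and `ordered_compare("<", "a", "b")` are true, while `ordered_compare("<", 1, "2")` and `ordered_compare(">=", None, None)` are false under this reading.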
Fixes https://github.com/ietf-wg-jsonpath/draft-ietf-jsonpath-base/issues/212