ietf-wg-jsonpath / draft-ietf-jsonpath-base

Development of a JSONPath internet draft
https://ietf-wg-jsonpath.github.io/draft-ietf-jsonpath-base/

character-repertoire issues #512

Closed timbray closed 9 months ago

gregsdennis commented 9 months ago

What's the outcome of this change? Is it that surrogate pairs are no longer to be supported? What's the reasoning?

Providing some context in the description on PRs would be good.

cabo commented 9 months ago

What's the outcome of this change?

Over on the jsonpath mailing list, I wrote:

The recent extensive discussions on art@ietf.org and i18n@ietf.org have heightened our senses for potential misunderstandings of the way JSONPath handles Unicode. Pull request 512 [1] is intended to address this.

Note that while this PR occurs during IESG processing, this is not based on an IESG comment. Some of us still would like to make this change before the document is approved.

These changes are editorial in nature: the technical content does not change, it is just expressed in a way that may give fewer opportunities for confusion.

So I’m requesting that the WG please have a look and speak up if you see a problem with this editorial improvement.

I think that should answer your question.

Is it that surrogate pairs are no longer to be supported?

There is no technical change here; JSONPath parsers never see surrogate pairs; these happen only inside UTF-16. The ABNF changes make it more obvious that lone surrogate code points cannot occur in a JSONPath query; that is already the case now (see Section 1.1 in conjunction with RFC 3629).
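
To illustrate the point about surrogates, here is a minimal sketch using Python's built-in codecs. It assumes Python's strict UTF-8 decoder as a stand-in for what a JSONPath parser receives; the helper name `decodes_as_utf8` is hypothetical and is not taken from the draft or from any JSONPath implementation.

```python
def decodes_as_utf8(data: bytes) -> bool:
    """Return True if data is well-formed UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# U+1F600 lies outside the Basic Multilingual Plane.
ch = "\U0001F600"

# In UTF-16 it is represented by the surrogate pair D83D DE00.
utf16_bytes = ch.encode("utf-16-be")

# In UTF-8 it is a single four-byte sequence; no surrogates are involved.
utf8_bytes = ch.encode("utf-8")

# ED A0 80 is what U+D800 (a lone surrogate) would look like if it were
# encoded with the UTF-8 bit pattern; a strict decoder rejects it.
lone_surrogate_bytes = b"\xed\xa0\x80"

print(utf16_bytes.hex(), utf8_bytes.hex(), decodes_as_utf8(lone_surrogate_bytes))
```

The surrogate pair exists only in the UTF-16 form; a consumer of UTF-8 input sees the scalar value U+1F600 directly.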

What's the reasoning?

Providing some context in the description on PRs would be good.

Sorry for putting this in the WG message only; the background is some extensive discussion on art@ and i18n@ that we don't want to repeat here in the JSONPath WG.

gregsdennis commented 9 months ago

I think that should answer your question.

No, that doesn't answer my question. I saw that email and still asked.

the background is some extensive discussion on art@ and i18n@ that we don't want to repeat here in the JSONPath WG.

Because I'm not subscribed to those, is there a summary?

timbray commented 9 months ago

Basically, the JSONPath draft has always said that the character repertoire for JSONPath is “Unicode scalar values”, which is all the code points except surrogates, which only have meaning in UTF-16. The recent changes did a few s/code point/scalar value/ and changed one piece of ABNF to make it clear that surrogates are not supported.
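
The definition of "Unicode scalar value" summarized above can be sketched in a few lines of Python. The function names `is_scalar_value` and `uses_only_scalar_values` are hypothetical, chosen for illustration; they do not come from the draft's ABNF or from any JSONPath library.

```python
SURROGATE_FIRST = 0xD800
SURROGATE_LAST = 0xDFFF
MAX_CODE_POINT = 0x10FFFF


def is_scalar_value(cp: int) -> bool:
    """True for any Unicode code point that is not a surrogate."""
    if cp < 0 or cp > MAX_CODE_POINT:
        return False
    return not (SURROGATE_FIRST <= cp <= SURROGATE_LAST)


def uses_only_scalar_values(query: str) -> bool:
    """True if every character in the query string is a scalar value."""
    return all(is_scalar_value(ord(c)) for c in query)


# Python strings can hold lone surrogates, which makes them easy to test.
print(uses_only_scalar_values("$.store['\U0001F600']"))
print(uses_only_scalar_values("$.store['\ud800']"))
```

Under these assumptions, the first query passes and the second, which contains the lone surrogate U+D800, does not.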

gregsdennis commented 9 months ago

Thank you