Update to syntax breakdown

gregsdennis commented 2 years ago

Resolves #201 Resolves #255

Reorganizes and simplifies the syntax declarations.

Initial discussions yielded this document. This served as a target. I've added the target to the repo as a reference for myself as I reworked it with nicer commits. It will be removed just before merge as a last step of this PR.

I'm sure that there are some build issues. I probably have links that go nowhere. I think I got all of the ABNF reworked correctly.

Also, I've used //PICKER// and //PICKERS// as placeholders for us to find a name for these things that appear inside the []. Let's do that now.

glyn commented 2 years ago

Option 1, very very obviously. Filter pickers iterate over children of each of the nodes given, others don't. The difference between the child and the descendants selector is which nodes are presented to the pickers (the current node, or that and all of its descendants).

@cabo I'm missing the context for this comment. Are you favouring option 1?

cabo commented 2 years ago

I’m sorry, I haven’t managed to follow this 100-message thread (and I’m now on vacation). Clearly, the selector (appender) does not iterate the children; the picker (some of them) does. I think this is your option 1.

glyn commented 2 years ago

Thanks @cabo. You and @gregsdennis favour option 1 and I am comfortable trying to make that work. Have a great vacation!
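A minimal Python sketch of option 1 as described in this exchange, for illustration only. All names here (`children`, `name_picker`, `filter_picker`, `child_selector`, `descendant_selector`) are hypothetical and do not come from the draft. The result ordering shown is incidental to this sketch, not a proposal.

```python
from typing import Any, Callable, List

Node = Any
Picker = Callable[[Node], List[Node]]


def children(node: Node) -> List[Node]:
    """Return the direct children of a JSON object or array (else nothing)."""
    if isinstance(node, dict):
        return list(node.values())
    if isinstance(node, list):
        return list(node)
    return []


def name_picker(name: str) -> Picker:
    """A picker that does NOT iterate: it looks up at most one child."""
    def pick(node: Node) -> List[Node]:
        if isinstance(node, dict) and name in node:
            return [node[name]]
        return []
    return pick


def filter_picker(predicate: Callable[[Node], bool]) -> Picker:
    """A picker that DOES iterate over the children of the node it is given."""
    def pick(node: Node) -> List[Node]:
        return [c for c in children(node) if predicate(c)]
    return pick


def node_and_descendants(node: Node) -> List[Node]:
    """The node itself followed by all of its descendants."""
    out = [node]
    for child in children(node):
        out.extend(node_and_descendants(child))
    return out


def child_selector(nodes: List[Node], pickers: List[Picker]) -> List[Node]:
    """Present each input node itself to the pickers."""
    return [r for n in nodes for p in pickers for r in p(n)]


def descendant_selector(nodes: List[Node], pickers: List[Picker]) -> List[Node]:
    """Present each input node and all of its descendants to the pickers."""
    return [
        r
        for n in nodes
        for d in node_and_descendants(n)
        for p in pickers
        for r in p(d)
    ]
```

The only difference between the two selectors in this sketch is the set of nodes handed to the pickers; any iteration over children happens inside the filter picker.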

gregsdennis commented 2 years ago

Ordering of resultant nodelists from child and descendant *selectors. This needs normative text which can be backed up by an example but which should not consist solely of an example.

I tried to address this above, but you objected saying that it describes an algorithm. I think explaining how the results are achieved is vital to ensuring that implementations get it right. I agree that it doesn't need to be (should not be) prescriptive, but I think a clear specification of the result ordering requires some amount of procedure in this case. I'd be open to a suggestion that adequately specifies the node ordering without an algorithm.

Substitution of SELECTOR for each occurrence of selector. I think selectors include $, @ and the child and descendant *selectors.

Is this because you feel that the language better suits the idea that *pickers are doing the selection in the case of the "child selector?" Personally, I think "selector" for these things is appropriate.

Choice of terms for selector and picker.

*picker, definitely. Again, I think "selector" is fine as is.

@gregsdennis What do you think remains to be done?

We need to let others come back from holiday and have their say. It's been a lot of back and forth between you and me.

glyn commented 2 years ago

Ordering of resultant nodelists from child and descendant *selectors. This needs normative text which can be backed up by an example but which should not consist solely of an example.

I tried to address this above, but you objected saying that it describes an algorithm. I think explaining how the results are achieved is vital to ensuring that implementations get it right. I agree that it doesn't need to be (should not be) prescriptive, but I think a clear specification of the result ordering requires some amount of procedure in this case. I'd be open to a suggestion that adequately specifies the node ordering without an algorithm.

Fair enough. Please could you put the algorithmic description in place in the PR and we can massage this as necessary later.
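To show what a procedural description of descendant ordering could look like, here is a small Python sketch of one candidate: a pre-order, depth-first walk in which a node is listed before its descendants and array elements are listed in array order. This is an assumption made for illustration, not the ordering the PR specifies. The function name `visit` is hypothetical, and object members are walked in Python's insertion order only because the sketch has to pick something.

```python
from typing import Any, Iterator, Tuple


def visit(node: Any, path: str = "$") -> Iterator[Tuple[str, Any]]:
    """Yield (path, value) pairs: the node first, then its descendants."""
    yield path, node
    if isinstance(node, list):
        # Array elements are visited in index order.
        for index, child in enumerate(node):
            yield from visit(child, f"{path}[{index}]")
    elif isinstance(node, dict):
        # Member order is an artifact of this sketch (insertion order).
        for key, child in node.items():
            yield from visit(child, f"{path}[{key!r}]")


if __name__ == "__main__":
    for p, v in visit({"a": [10, 20], "b": 30}):
        print(p, v)
```

Normative text could state only the constraints (parent before descendants, array elements in order) and leave the traversal itself to implementations.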

Substitution of SELECTOR for each occurrence of selector. I think selectors include $, @ and the child and descendant *selectors.

Is this because you feel that the language better suits the idea that *pickers are doing the selection in the case of the "child selector?"

Yes, and I think @cabo agrees. I'm personally open to *picker being a term other than "selector", but I just can't think of one. "picker" is a possibility, but I think "selector" and "picker" could easily be confused in readers' minds because they are synonyms.

Personally, I think "selector" for these things is appropriate.

I agree this would be less disruptive to the rest of the document. But I take the point that the child and descendant selectors are mostly gathering together nodelists which have been selected by pickers. So a term like "collector" would work for those two terms, but it's hard to find a single term that covers those two terms as well as $ and @.

Choice of terms for selector and picker.

*picker, definitely. Again, I think "selector" is fine as is.

@gregsdennis What do you think remains to be done?

We need to let others come back from holiday and have their say. It's been a lot of back and forth between you and me.

Yes, I'd love to get considered opinions from the others. I was kind of hoping this would happen in week 37, but it didn't. I guess our 130 or so comments will slow people down quite a lot. It's a shame because, as you say, there has been quite a bit of back and forth during which time I've gradually warmed up to the current approach of the PR. I wonder if you should start a fresh PR to clear the decks? An advantage of that would be that others could review the change without having to wade through discussions of alternatives that were not adopted. But maybe the PR is ok because you have resolved some of the discussion threads and others can just review it "from the top".

gregsdennis commented 2 years ago

@glyn I've done a little clean-up, but I think I'm going to leave the ordering language as it is until we figure out #260 (which brings up some good points). Including that isn't critical to this PR.

glyn commented 2 years ago

@glyn I've done a little clean-up, but I think I'm going to leave the ordering language as it is until we figure out #260 (which brings up some good points). Including that isn't critical to this PR.

Thanks @gregsdennis. I'm comfortable with that. I'll add a note to #260 and update my requested changes.

timbray commented 2 years ago

I don't suppose there's a formatted version of the resulting spec that I could read? Sorry if that was linked above in an obvious way and I missed it.

gregsdennis commented 2 years ago

I don't suppose there's a formatted version of the resulting spec that I could read? Sorry if that was linked above in an obvious way and I missed it.

We've resolved the build issues. Shouldn't there be a preview generated somewhere?

Edit: Yeah, you can get it by downloading the build artifact. I'm pretty sure I've seen other PRs have this posted as a comment, though.

glyn commented 2 years ago

@timbray wrote:

I don't suppose there's a formatted version of the resulting spec that I could read? Sorry if that was linked above in an obvious way and I missed it.

The current state of the art is that you have to build this yourself by running make against the branch. @cabo did some changes to make the formatting actually work, but there is still a missing piece to automate the formatting in CI (since Greg's branch is on his fork). As a temporary workaround, I've pushed a branch with the latest changes to produce this formatted version.

gregsdennis commented 2 years ago

Notes for making changes discussed in the recent interim meeting.

Before I made changes today, the document had the "root selector" as a subsection of Selectors. In the meeting, we recognized that while the output of $ is a nodelist, it is not itself a selector, and we decided on the name "starter," following the placeholder *starter that @cabo had proposed.

While working on this, I moved this subsection to its own <h2> before the *pickers section. In doing so, calling it a "starter" felt strange. I've used "root identifier" instead, and I think the text reads a bit better.

Additionally, I searched for the other starter @ (for current-item relative paths), and found that it's defined within the "filter picker" section. I think this makes sense since it only has context there, but I can move it to a subsection alongside $ if others think that's warranted.

There is no term definition for "selector," even though it's used in the definition for "Singular Path." I can add one (as "appender"), if needed.

After making the selector -> appender change, I'm not really sure I like the term "appender," but I don't really have anything else for it. I'm continuing to think on this, but "appender" will suffice for now.

The picker -> selector change reads really well in my opinion.

I'm not sure about the ordering of sections:

3.4 Root Identifier
3.5 Selectors
    3.5.1 Wildcard
    3.5.2 Name
    3.5.3 Index
    3.5.4 Slice
    3.5.5 Filter
3.6 Appenders
    3.6.1 Child
    3.6.2 Descendant

I think it would be better if we do Root ... Appender ... Selector order. The idea behind this is:

It follows the order in which you encounter the syntax in a path: `$['name']`. You get the root first, then an appender, then a selector.
It's analogous to year-month-day in that you start broad and get narrower scopes as you go.
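The reading order described above can be illustrated with a small Python sketch that splits a path into root, appender, and selector in the order they appear. It handles only bracketed names and indexes. It is not the draft's ABNF, and `breakdown` and `BRACKETED` are hypothetical names used for illustration.

```python
import re
from typing import List, Tuple

# One bracketed appender holding a single quoted name or an integer index.
BRACKETED = re.compile(r"\[('[^']*'|-?\d+)\]")


def breakdown(path: str) -> List[Tuple[str, str]]:
    """Return (role, text) pairs in the order they are encountered."""
    if not path.startswith("$"):
        raise ValueError("path must start with the root identifier $")
    parts = [("root", "$")]
    pos = 1
    while pos < len(path):
        match = BRACKETED.match(path, pos)
        if match is None:
            raise ValueError(f"unsupported syntax at offset {pos}")
        parts.append(("appender", match.group(0)))  # the whole [...] piece
        parts.append(("selector", match.group(1)))  # what sits inside it
        pos = match.end()
    return parts


if __name__ == "__main__":
    print(breakdown("$['name']"))
```

For `$['name']` the pairs come out as root, then appender, then selector, which is the broad-to-narrow order proposed for the sections.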

I'd like to get others' opinions here.

gregsdennis commented 2 years ago

Here's the HTML render of the current state.

draft-ietf-jsonpath-base.html.txt

glyn commented 2 years ago

After making the selector -> appender change, I'm not really sure I like the term "appender," but I don't really have anything else for it. I'm continuing to think on this, but "appender" will suffice for now.

I don't like the term "appender" either. It doesn't really append anything. It is appended to a path, so the term "appendage" would be more accurate (though unpleasant). Perhaps it would be better to focus on what the *appenders do: they gather together one or more selectors and apply them to a node or to the node and its descendants. So terms like "compositor", "list", "assembly", etc. may make more sense.

cabo commented 2 years ago

Step.

glyn commented 2 years ago

Step.

Hmmm. We already use "step" in the context of array slices. But I'm not sure it has a relevant connotation. What did you have in mind?

glyn commented 2 years ago

@gregsdennis I saw your commit to fix anchor refs, thanks. But I'm still getting these errors from make:

draft-ietf-jsonpath-base.xml(527): Error: Invalid attribute slugifiedName for element name, at /rfc/middle/section[3]/section[3]/name
/home/glyn/dev/ietf/draft-ietf-jsonpath-base/draft-ietf-jsonpath-base.xml(11): Warning: Invalid document after running preptool.
(No source line available): Warning: Duplicate id="name-semantics" found in generated HTML.

/cc @cabo

timbray commented 2 years ago

"segment"?


cabo commented 2 years ago

On 2022-09-28, at 16:59, Tim Bray wrote:

"segment"?

piece, part, bit, section, chunk, division, portion, slice, fragment, component, wedge, lump, slab, hunk, parcel, tranche

Hmm.

I still like “step”, as each of them is a step in interpreting the query. Of course, meaningless words such as “unit” will work, too, which brings me back to “part”.

Regards, Carsten

timbray commented 2 years ago

Actually "step" is probably my favorite. Not too semantically overloaded.


gregsdennis commented 2 years ago

But I'm still getting these errors from make

These are warnings, but I can look into them. I stopped when it didn't produce errors and generated the doc.

gregsdennis commented 2 years ago

I would like to re-offer "clause," similar to FROM and WHERE clauses in SQL.

gregsdennis commented 2 years ago

@glyn

(No source line available): Warning: Duplicate id="name-semantics" found in generated HTML.

I'm not sure how to fix this. It seems that the build is generating a name-semantics ID on a header separately from the one I define.

Line 484 has no explicit tag, but the HTML that is being generated is <h3 id="name-semantics">, which you can see on line 1823 of the HTML.

Line 707 has an explicit tag and is being generated as <div id="name-semantics"> on line 2146 of the HTML.
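One way to locate every such collision in the generated HTML is a short script. This is a rough sketch using only Python's standard `html.parser`, and it is not part of the draft's tooling. `IdCollector` and `duplicate_ids` are hypothetical names.

```python
from collections import defaultdict
from html.parser import HTMLParser
from typing import Dict, List, Tuple


class IdCollector(HTMLParser):
    """Record the tag and line number of every id attribute seen."""

    def __init__(self) -> None:
        super().__init__()
        self.seen: Dict[str, List[Tuple[str, int]]] = defaultdict(list)

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value is not None:
                line, _column = self.getpos()
                self.seen[value].append((tag, line))


def duplicate_ids(html_text: str) -> Dict[str, List[Tuple[str, int]]]:
    """Return only the ids that occur on more than one element."""
    collector = IdCollector()
    collector.feed(html_text)
    collector.close()
    return {k: v for k, v in collector.seen.items() if len(v) > 1}


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], encoding="utf-8") as handle:
        for ident, uses in duplicate_ids(handle.read()).items():
            print(ident, uses)
```

Run against the rendered HTML file, this lists each repeated id with the elements that carry it, which makes it easier to see whether the clash comes from an explicit anchor or from a generated one.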

glyn commented 2 years ago

@greg That's about as far as I got when I looked into it. I had a look at the tooling and I couldn't see any obvious parameters to control the generated ids. Perhaps @cabo would know?

glyn commented 2 years ago

"step" or "clause" may work, but these refer to the structure of a query rather than its semantics. For that reason, I think "segment" works pretty well. We could then refer to a "child segment" and a "descendant segment" which produce a subset of the children and of the descendants of the input value, respectively.

gregsdennis commented 2 years ago

At @glyn's request I've restored the sequence of the sections to match the current main document; however, I'd want this moved back eventually. I think

name
wildcard
index
slice
filter

is less understandable. At the very least "name" and "index" should be together as they both reference a single item, and "index" and "slice" should be together because they both operate specifically on arrays.

Filter should remain last because it's just a massive section.

That leaves the two orderings as

wildcard
name
index
slice
filter

and

name
index
slice
wildcard
filter

I don't mind this being a follow-up PR, though.

glyn commented 2 years ago

@gregsdennis Thanks for that. Happy with a subsequent re-ordering if that makes the spec more readable. Another option is to treat the wildcard selector as syntactic sugar for the filter ?0==0.
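The equivalence suggested here can be sketched in Python. The helper names (`children`, `wildcard`, `filter_selector`) are hypothetical, and whether the grammar actually accepts a filter written as `?0==0` is an assumption this sketch does not settle.

```python
from typing import Any, Callable, List


def children(node: Any) -> List[Any]:
    """Direct children of a JSON object or array; scalars have none."""
    if isinstance(node, dict):
        return list(node.values())
    if isinstance(node, list):
        return list(node)
    return []


def wildcard(node: Any) -> List[Any]:
    """Select every child of the node."""
    return children(node)


def filter_selector(node: Any, predicate: Callable[[Any], bool]) -> List[Any]:
    """Select the children for which the predicate holds."""
    return [child for child in children(node) if predicate(child)]


def always_true(_child: Any) -> bool:
    """Stand-in for a filter expression such as 0==0 that ignores the child."""
    return 0 == 0
```

Under these assumptions, `wildcard(node)` and `filter_selector(node, always_true)` return the same nodelist for any node, which is the sense in which the wildcard would be syntactic sugar.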

glyn commented 2 years ago

This PR was replaced by https://github.com/ietf-wg-jsonpath/draft-ietf-jsonpath-base/pull/263.

gregsdennis commented 2 years ago

Glyn, stop making unilateral decisions. This still needs to be reviewed by the others. I realize that this isn't complete in your eyes, but that doesn't mean it can just be superseded by additions from you. We need to wait until the others can review.

glyn commented 2 years ago

Glyn, stop making unilateral decisions.

I proposed raising a fresh PR two days ago on the mailing list and, because there was no response, went ahead. The editors need to be free to take the initiative. Nothing I did was irreversible.

This still needs to be reviewed by the others. I realize that this isn't complete in your eyes, but that doesn't mean it can just be superseded by additions from you. We need to wait until the others can review.

I would be delighted if others review this PR, but there is no evidence this is going to happen. My motivation in extending it into #263 was so that others could review this PR plus some additional editorial polish in one go.

cabo commented 2 years ago

I will review this PR, but am unlikely to get to this before Sunday.

My proposal on the remaining placeholder:

Query start ($, @)
Query step (.foo, [foo], etc.)

This gets rid of the potential conflict with the other use of "step".

cabo commented 2 years ago

See previous comment, https://github.com/ietf-wg-jsonpath/draft-ietf-jsonpath-base/pull/258#issuecomment-1271676697

glyn commented 2 years ago

My proposal on the remaining placeholder:

Query start ($, @)
Query step (.foo, [foo], etc.)

This gets rid of the potential conflict with the other use of "step".

That would work, but is a bit wordy.

I prefer terms with a semantic, rather than syntactic, connotation. I wonder if "collector" would work in place of APPENDER? I quite liked @timbray's "segment" too (and this has syntactic as well as semantic connotations).

gregsdennis commented 2 years ago

I think readers could confuse the two. - @glyn

This change isn't intended to be final. It's an intermediate change with further changes expected in subsequent PRs. There aren't going to be readers of the document until those changes go through.

I move we merge this and handle any language issues in smaller follow-up PRs so that this one doesn't continue to become muddier.

For the purposes of this PR, I don't care about the words we select. This change is about architecture, and we've all agreed that the architecture is good. Quibble over word choice somewhere else, please.

glyn commented 2 years ago

I think readers could confuse the two. - @glyn

This change isn't intended to be final. It's an intermediate change with further changes expected in subsequent PRs. There aren't going to be readers of the document until those changes go through.

I move we merge this and handle any language issues in smaller follow-up PRs so that this one doesn't continue to become muddier.

For the purposes of this PR, I don't care about the words we select. This change is about architecture, and we've all agreed that the architecture is good. Quibble over word choice somewhere else, please.

I'm comfortable merging this PR when someone else approves it (to indicate they've given it a careful review).

cabo commented 2 years ago

I was hoping I could review #263

glyn commented 2 years ago

I was hoping I could review #263

@cabo Better still!

cabo commented 2 years ago

My proposal on the remaining placeholder:

Query start ($, @)
Query step (.foo, [foo], etc.)

This gets rid of the potential conflict with the other use of "step".

That would work, but is a bit wordy.

I prefer terms with a semantic, rather than syntactic, connotation. I wonder if "collector" would work in place of APPENDER? I quite liked @timbray's "segment" too (and this has syntactic as well as semantic connotations).

Segment works for me (a segment is either a query start or a query step).

glyn commented 2 years ago

I was actually thinking of segment as a replacement for query step, with some other term for query start.


glyn commented 2 years ago

@cabo I've submitted https://github.com/ietf-wg-jsonpath/draft-ietf-jsonpath-base/pull/264 to replace APPENDER with segment in PR 263. I've left the terms root identifier ($) and current node identifier (@) alone and we can replace these separately if we want to. If you are comfortable with this as a next step, please approve PR 264 and I'll merge that into PR 263 ready for your review. (The only other things that I know need fixing in PR 263 are the make warnings/errors.)

ietf-wg-jsonpath / draft-ietf-jsonpath-base

Update to syntax breakdown #258