ietf-wg-jsonpath / draft-ietf-jsonpath-base

Development of a JSONPath internet draft
https://ietf-wg-jsonpath.github.io/draft-ietf-jsonpath-base/

Ordering of descendant segment with multiple selectors #265

Closed glyn closed 1 year ago

glyn commented 1 year ago

The current draft does not agree with the implementation consensus when a descendant segment has multiple selectors. The current draft says that a corresponding child segment containing the whole list of selectors should be applied to a descendant nodelist. The implementation consensus is that each selector is applied separately to a descendant nodelist and the results are concatenated.

Let's take an example involving arrays (to avoid non-determinism confusing matters). The JSONPath $..[0, 1] applied to the value [1, 2, [3, 4]] according to the current draft gives:

1
2
3
4

but according to the implementation consensus gives:

1
3
2
4

One of the more influential implementations (Jayway) agrees with the current draft.

I think it is best to go with the current draft behaviour as this can be implemented in a single traversal of the descendants, but wanted to confirm this here.
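
To make the two orderings concrete, here is a minimal Python sketch. It is not taken from the draft or from any implementation: the helper names `descendants`, `select_index`, `draft_order`, and `consensus_order` are made up for illustration, and it assumes a depth-first traversal that visits each node before its children.

```python
def descendants(value):
    """Yield value and all its descendants, parents before children."""
    yield value
    if isinstance(value, list):
        for item in value:
            yield from descendants(item)
    elif isinstance(value, dict):
        for item in value.values():
            yield from descendants(item)


def select_index(node, index):
    """Apply one index selector to one node; non-arrays select nothing."""
    if isinstance(node, list) and -len(node) <= index < len(node):
        return [node[index]]
    return []


def draft_order(value, indexes):
    # Current draft: apply the whole selector list at each visited node.
    return [hit
            for node in descendants(value)
            for index in indexes
            for hit in select_index(node, index)]


def consensus_order(value, indexes):
    # Implementation consensus: one full pass per selector, then concatenate.
    return [hit
            for index in indexes
            for node in descendants(value)
            for hit in select_index(node, index)]


doc = [1, 2, [3, 4]]
print(draft_order(doc, [0, 1]))      # [1, 2, 3, 4]
print(consensus_order(doc, [0, 1]))  # [1, 3, 2, 4]
```

The only difference between the two functions is which loop is outermost: the nodes (draft) or the selectors (consensus).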

glyn commented 1 year ago

Another advantage of the current draft behaviour is that it is more deterministic than the implementation consensus. For example, the JSONPath $..['a', 'b'] applied to the value {"x" : {"a" : 1, "b" : 2}, "y" : {"a" : 3, "b" : 4}} according to the current draft gives:

1
2
3
4

or

3
4
1
2

but according to the implementation consensus (with the non-determinism of https://github.com/ietf-wg-jsonpath/draft-ietf-jsonpath-base/issues/260) gives:

1
3
2
4

or

1
3
4
2

or

3
1
2
4

or

3
1
4
2
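
The counts above (two possible results versus four) can be checked by brute force. The sketch below is only an illustration under stated assumptions: the names `visit_orders`, `draft_results`, and `consensus_results` are hypothetical, the traversal is assumed to be depth-first, each pass under the consensus behaviour is assumed to pick its object member order independently, and the selected values are assumed to be hashable primitives.

```python
from itertools import permutations, product


def visit_orders(value):
    """Yield every allowed depth-first visiting order of value and its descendants.

    Array elements keep array order; object members may come in any order.
    """
    if isinstance(value, dict):
        child_orders = permutations(value.values())
    elif isinstance(value, list):
        child_orders = [tuple(value)]
    else:
        child_orders = [()]
    for children in child_orders:
        for tails in product(*(list(visit_orders(c)) for c in children)):
            yield [value] + [node for tail in tails for node in tail]


def apply_names(node, names):
    """Apply a list of name selectors to one node, in selector order."""
    return [node[n] for n in names if isinstance(node, dict) and n in node]


def draft_results(value, names):
    # Current draft: all selectors are applied at each node of one traversal.
    return {tuple(hit for node in order for hit in apply_names(node, names))
            for order in visit_orders(value)}


def consensus_results(value, names):
    # Consensus: one independent traversal per selector, results concatenated.
    per_selector = [sorted(draft_results(value, [n])) for n in names]
    return {sum(parts, ()) for parts in product(*per_selector)}


doc = {"x": {"a": 1, "b": 2}, "y": {"a": 3, "b": 4}}
print(sorted(draft_results(doc, ["a", "b"])))
# [(1, 2, 3, 4), (3, 4, 1, 2)]
print(sorted(consensus_results(doc, ["a", "b"])))
# [(1, 3, 2, 4), (1, 3, 4, 2), (3, 1, 2, 4), (3, 1, 4, 2)]
```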

cabo commented 1 year ago

I am very much in favor of cleaning this up.

(If the odd behaviors are very widespread, we could combine the cleaned-up semantics with an interoperability warning that applications that rely on the exact sequence will be less interoperable than ones that don't. That is, of course, true of most of what the standard is providing...)

gregsdennis commented 1 year ago

I agree that the current draft is better.

gregsdennis commented 1 year ago

Reading the spec again,

A descendant segment produces zero or more descendants of the input value.

A nodelist enumerating the descendants is known as a descendant nodelist when:

  • nodes of any array appear in array order,
  • nodes appear immediately before all their descendants.

This definition does not stipulate the order in which the children of an object appear, since JSON objects are unordered.

The resultant nodelist of a descendant segment of the form ..[<selectors>] is the result of applying the child segment [<selectors>] to a descendant nodelist.

While it definitely specifies how to order the results of each selector, I don't think it specifies the ordering one way or the other for the multiple-selector case.

Using the previous example

$..['a', 'b'] applied to the value {"x" : {"a" : 1, "b" : 2}, "y" : {"a" : 3, "b" : 4}}

there are two options:

Selector prioritization

Each selector is analyzed separately and the results are concatenated.

This is equivalent to (forgive the syntax) concat( $..['a'], $..['b'] ).

This would (in theory) require multiple data traversals: one for matching a and one for matching b.

Traversal prioritization

All selectors are analyzed together for each node.

This is more akin to (again, forgive the syntax) $..['a' or 'b'].

This would (in theory) process both selectors over a single traversal: as each node is visited, both selectors are applied.


It sounds like we want the latter, traversal prioritization.
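
As a rough illustration of the traversal cost, the following Python sketch counts node visits under each option. The function names `walk`, `selector_prioritization`, and `traversal_prioritization` are invented here, and the sketch relies on Python's dict insertion order standing in for one particular object member order; a real implementation could of course optimise either option differently.

```python
def walk(value, visit):
    """Call visit on value and every descendant, parents first."""
    visit(value)
    if isinstance(value, dict):
        children = list(value.values())
    elif isinstance(value, list):
        children = value
    else:
        children = []
    for child in children:
        walk(child, visit)


def selector_prioritization(value, names):
    results, visits = [], 0
    for name in names:  # one traversal per selector
        def visit(node, name=name):
            nonlocal visits
            visits += 1
            if isinstance(node, dict) and name in node:
                results.append(node[name])
        walk(value, visit)
    return results, visits


def traversal_prioritization(value, names):
    results, visits = [], 0

    def visit(node):  # every selector is tried at each visited node
        nonlocal visits
        visits += 1
        for name in names:
            if isinstance(node, dict) and name in node:
                results.append(node[name])

    walk(value, visit)
    return results, visits


doc = {"x": {"a": 1, "b": 2}, "y": {"a": 3, "b": 4}}
print(selector_prioritization(doc, ["a", "b"]))   # ([1, 3, 2, 4], 14)
print(traversal_prioritization(doc, ["a", "b"]))  # ([1, 2, 3, 4], 7)
```

The example value has seven nodes, so the per-selector approach visits 14 nodes for two selectors while the single traversal visits 7.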

glyn commented 1 year ago

@gregsdennis You quoted an old version of the spec. The current spec says:

A descendant selector of the form ..[<selectors>] visits each node of the input value and its descendants in such an order that:

  • nodes of any array are visited in array order, and
  • nodes are visited before all their descendants.

It applies the child segment [<selectors>] to each node and concatenates the resultant nodelists together in the order in which the nodes were visited.

I think this makes the multi-selector ordering clear.
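
One way to read that wording is sketched below in Python. This is only my interpretation written as code, not normative text: `child_segment`, `visited`, and `descendant_segment` are hypothetical names, only name and index selectors are handled, and object members are visited in Python's dict insertion order. The function returns the per-node nodelists as well as their concatenation, to show that the concatenation happens per visited node rather than per selector.

```python
def child_segment(node, selectors):
    """Apply each selector in turn to one node and concatenate the hits."""
    hits = []
    for sel in selectors:
        if isinstance(sel, str):
            if isinstance(node, dict) and sel in node:
                hits.append(node[sel])
        elif isinstance(node, list) and -len(node) <= sel < len(node):
            hits.append(node[sel])
    return hits


def visited(value):
    """Yield nodes so that arrays keep their order and parents come first."""
    yield value
    if isinstance(value, dict):
        children = list(value.values())
    elif isinstance(value, list):
        children = value
    else:
        children = []
    for child in children:
        yield from visited(child)


def descendant_segment(value, selectors):
    # One nodelist per visited node, concatenated in visiting order.
    per_node = [child_segment(node, selectors) for node in visited(value)]
    flat = [hit for nodelist in per_node for hit in nodelist]
    return per_node, flat


doc = {"x": {"a": 1, "b": 2}, "y": {"a": 3, "b": 4}}
per_node, flat = descendant_segment(doc, ["a", "b"])
print([nodelist for nodelist in per_node if nodelist])  # [[1, 2], [3, 4]]
print(flat)                                             # [1, 2, 3, 4]
```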

gregsdennis commented 1 year ago

I still think it's vague on the distinction I mentioned (the two options). Between the two options, I think this text leans more towards the selector prioritization, suggesting that the selector results in their entirety are concatenated.

glyn commented 1 year ago

See https://github.com/ietf-wg-jsonpath/draft-ietf-jsonpath-base/pull/269#issuecomment-1283836643.