glyn closed 1 year ago
Another advantage of the current draft behaviour is that it is more deterministic than the implementation consensus. For example, the JSONPath $..['a', 'b']
applied to the value {"x" : {"a" : 1, "b" : 2}, "y" : {"a" : 3, "b" : 4}}
according to the current draft gives:
1
2
3
4
or
3
4
1
2
but according to the implementation consensus (with the non-determinism of https://github.com/ietf-wg-jsonpath/draft-ietf-jsonpath-base/issues/260) gives:
1
3
2
4
or
1
3
4
2
or
3
1
2
4
or
3
1
4
2
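The two sets of orderings above can be reproduced with a small sketch. It is not taken from any implementation: the function names are hypothetical, only name selectors on objects are handled, and the non-determinism is modelled by trying every order of the top-level members (for the consensus case, each per-selector traversal is allowed to pick its own order).

```python
from itertools import permutations, product

DOC = {"x": {"a": 1, "b": 2}, "y": {"a": 3, "b": 4}}
SELECTORS = ["a", "b"]


def visit(value, top_order=None):
    """Yield value, then its descendants, parents first.

    `top_order` fixes the member order of the outermost object only;
    nested objects use insertion order (it does not affect this example).
    """
    yield value
    if isinstance(value, dict):
        keys = top_order if top_order is not None else list(value)
        for key in keys:
            yield from visit(value[key])


def select(node, name):
    """Name selector: the member value if `node` is an object with that member."""
    return [node[name]] if isinstance(node, dict) and name in node else []


def draft_results(top_order):
    """Current draft: one traversal, all selectors applied at each visited node."""
    return tuple(v for node in visit(DOC, top_order)
                 for name in SELECTORS for v in select(node, name))


def consensus_results(orders):
    """Implementation consensus: one traversal per selector, results concatenated."""
    return tuple(v for name, top_order in zip(SELECTORS, orders)
                 for node in visit(DOC, top_order) for v in select(node, name))


top_orders = list(permutations(DOC))  # ('x', 'y') and ('y', 'x')
draft = sorted({draft_results(o) for o in top_orders})
consensus = sorted({consensus_results(o) for o in product(top_orders, repeat=2)})

print(draft)      # [(1, 2, 3, 4), (3, 4, 1, 2)]
print(consensus)  # [(1, 3, 2, 4), (1, 3, 4, 2), (3, 1, 2, 4), (3, 1, 4, 2)]
```

Under these assumptions the draft behaviour admits two orderings and the consensus behaviour admits four.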
I am very much in favor of cleaning this up.
(If the odd behaviors are very widespread, we could combine the cleaned-up semantics with an interoperability warning that applications that rely on the exact sequence will be less interoperable than ones that don't. That is, of course, true of most of what the standard is providing...)
I agree that the current draft is better.
Reading the spec again,
A descendant segment produces zero or more descendants of the input value.
A nodelist enumerating the descendants is known as a descendant nodelist when:
- nodes of any array appear in array order,
- nodes appear immediately before all their descendants.
This definition does not stipulate the order in which the children of an object appear, since JSON objects are unordered.
The resultant nodelist of a descendant segment of the form
..[<selectors>]
is the result of applying the child segment [<selectors>]
to a descendant nodelist.
While it definitely gives instructions for ordering the results of each selector, I don't think it specifies the ordering one way or another for the multiple-selector case.
Using the previous example
$..['a', 'b']
applied to the value {"x" : {"a" : 1, "b" : 2}, "y" : {"a" : 3, "b" : 4}}
there are two options:
- Each selector is analyzed separately and the results are concatenated.
  This is equivalent to (forgive the syntax) concat( $..['a'], $..['b'] ).
  This would (in theory) require multiple data traversals: one for matching a and one for matching b.
- All selectors are analyzed together for each node.
  This is more akin to (again, forgive the syntax) $..['a' or 'b'].
  This would (in theory) process both selectors over a single traversal: as each node is visited, both selectors are applied.
It sounds like we want the latter, traversal prioritization.
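To make the difference between the two options concrete, here is a rough Python sketch. The names `by_selector` and `by_node` are made up for illustration, only name selectors are handled, and object members are walked in insertion order, which is just one of the orders the spec would allow.

```python
DOC = {"x": {"a": 1, "b": 2}, "y": {"a": 3, "b": 4}}


def descendants(value):
    """Yield value, then its descendants, parents first."""
    yield value
    if isinstance(value, dict):
        children = value.values()  # insertion order: an arbitrary choice
    elif isinstance(value, list):
        children = value
    else:
        children = []
    for child in children:
        yield from descendants(child)


def walk(doc, counter):
    """Start one full traversal and record that it happened."""
    counter["traversals"] += 1
    return descendants(doc)


def by_selector(doc, names, counter):
    """Option 1: concat($..['a'], $..['b']), one walk per selector."""
    out = []
    for name in names:
        for node in walk(doc, counter):
            if isinstance(node, dict) and name in node:
                out.append(node[name])
    return out


def by_node(doc, names, counter):
    """Option 2: one walk; every selector is applied at each visited node."""
    out = []
    for node in walk(doc, counter):
        for name in names:
            if isinstance(node, dict) and name in node:
                out.append(node[name])
    return out


c1, c2 = {"traversals": 0}, {"traversals": 0}
result_selector = by_selector(DOC, ["a", "b"], c1)
result_node = by_node(DOC, ["a", "b"], c2)
print(result_selector, c1["traversals"])  # [1, 3, 2, 4] 2
print(result_node, c2["traversals"])      # [1, 2, 3, 4] 1
```

An implementation of the first option could avoid the second walk by caching the descendant nodelist, so the traversal count here only reflects the naive form.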
@gregsdennis You quoted an old version of the spec. The current spec says:
A descendant selector of the form
..[<selectors>]
visits each node of the input value and its descendants in such an order that:
- nodes of any array are visited in array order, and
- nodes are visited before all their descendants.
It applies the child segment
[<selectors>]
to each node and concatenates the resultant nodelists together in the order in which the nodes were visited.
I think this makes the multi-selector ordering clear.
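One way to read that wording, as a minimal sketch rather than a reference implementation: visit nodes parent-first, apply the whole child segment at each node, and append the results in visit order. The function names and the path strings are illustrative assumptions, object members are visited in insertion order, and only names and non-negative indices are supported.

```python
def visit(value, path="$"):
    """Yield (path, node) for value and its descendants.

    Parents come before their descendants and array elements are in index order.
    """
    yield path, value
    if isinstance(value, list):
        for i, child in enumerate(value):
            yield from visit(child, f"{path}[{i}]")
    elif isinstance(value, dict):
        for key, child in value.items():  # member order: an arbitrary choice
            yield from visit(child, f"{path}['{key}']")


def child_segment(path, node, selectors):
    """Apply each selector, in order, to one node; return [(path, value), ...]."""
    out = []
    for sel in selectors:
        if isinstance(sel, str) and isinstance(node, dict) and sel in node:
            out.append((f"{path}['{sel}']", node[sel]))
        elif isinstance(sel, int) and isinstance(node, list) and 0 <= sel < len(node):
            out.append((f"{path}[{sel}]", node[sel]))
    return out


def descendant_segment(root, selectors):
    """Concatenate the child-segment results in the order nodes were visited."""
    out = []
    for path, node in visit(root):
        out.extend(child_segment(path, node, selectors))
    return out


doc = {"x": {"a": 1, "b": 2}, "y": {"a": 3, "b": 4}}
for path, value in descendant_segment(doc, ["a", "b"]):
    print(path, value)
# $['x']['a'] 1
# $['x']['b'] 2
# $['y']['a'] 3
# $['y']['b'] 4
```

Read this way, both selectors are exhausted at one node before the next node is considered.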
I still think it's vague on the distinction I mentioned (the two options). Between the two options, I think this text leans more towards the selector prioritization, suggesting that the selector results in their entirety are concatenated.
The current draft does not agree with the implementation consensus when a descendant segment has multiple selectors. The current draft says that a corresponding child segment containing the whole list of selectors should be applied to a descendant nodelist. The implementation consensus is that each selector is applied to a descendant nodelist and the results are concatenated together.
Let's take an example involving arrays (to avoid non-determinism confusing matters). The JSONPath
$..[0, 1]
applied to the value [1, 2, [3, 4]]
according to the current draft gives:
1
2
3
4
but according to the implementation consensus gives:
1
3
2
4
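The array example can be checked with a few lines of Python. This is only a sketch: `descendants` and `index` are hypothetical helpers, and negative indices are not handled. The two behaviours differ only in which loop is outermost.

```python
def descendants(value):
    """Yield value, then its descendants: parents first, arrays in index order."""
    yield value
    if isinstance(value, list):
        for item in value:
            yield from descendants(item)


def index(node, i):
    """Index selector: the i-th element if `node` is an array that is long enough."""
    return [node[i]] if isinstance(node, list) and 0 <= i < len(node) else []


doc = [1, 2, [3, 4]]
indices = [0, 1]

# Current draft: for each visited node, apply every selector.
draft = [v for node in descendants(doc) for i in indices for v in index(node, i)]

# Implementation consensus: for each selector, visit every node.
consensus = [v for i in indices for node in descendants(doc) for v in index(node, i)]

print(draft)      # [1, 2, 3, 4]
print(consensus)  # [1, 3, 2, 4]
```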
One of the more influential implementations (Jayway) agrees with the current draft.
I think it is best to go with the current draft behaviour, as this can be implemented in a single traversal of the descendants, but I wanted to confirm this here.