ietf-wg-jsonpath / draft-ietf-jsonpath-base

Development of a JSONPath internet draft
https://ietf-wg-jsonpath.github.io/draft-ietf-jsonpath-base/

Insufficient non-determinism #260

Closed: glyn closed this issue 1 year ago

glyn commented 1 year ago

Currently, the JSONPath spec is insufficiently non-deterministic.

Descendant nodelist ordering

According to the current draft, in a descendant nodelist, nodes appear immediately before all their descendants. In the example given in the spec, the selector $..[*] applied to the JSON value:

{
  "o": {"j": 1, "k": 2},
  "a": [5, 3, [{"j": 4}]]
}

cannot produce the result shown in the table:

{"j": 1, "k" : 2}
[5, 3, [{"j": 4}]]
1
2
5
3
[{"j": 4}]
{"j": 4}
4

since {"j": 1, "k" : 2} does not appear immediately before 1 and 2 (there is an intervening [5, 3, [{"j": 4}]]).

Solution: delete the word "immediately".
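The ordering in the table is what falls out of a straightforward evaluation of $..[*]. Here is a minimal Python sketch of that evaluation. It assumes objects are modelled as dicts, iterated in insertion order purely to reproduce the table. The helper names children, descendants_or_self and descendant_wildcard are hypothetical and do not come from the draft.

```python
def children(node):
    """Return the children of a JSON value (empty for primitives)."""
    if isinstance(node, dict):
        return list(node.values())
    if isinstance(node, list):
        return list(node)
    return []


def descendants_or_self(node):
    """Yield the node, then each child subtree in turn (pre-order)."""
    yield node
    for child in children(node):
        yield from descendants_or_self(child)


def descendant_wildcard(root):
    """Sketch of $..[*]: apply [*] to the root and to every descendant."""
    result = []
    for node in descendants_or_self(root):
        result.extend(children(node))
    return result


value = {
    "o": {"j": 1, "k": 2},
    "a": [5, 3, [{"j": 4}]],
}
result = descendant_wildcard(value)
```

In this sketch result[0] is the object {"j": 1, "k": 2}, but its children 1 and 2 land at positions 2 and 3, after the array at position 1. Each node precedes its descendants without immediately preceding them.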

List selector ordering

The iteration order of objects is not guaranteed to be the same from one iteration to the next. (For instance, in Go, a JSON object is usually implemented as a map and, according to the Go language spec, "The iteration order over maps is not specified and is not guaranteed to be the same from one iteration to the next".)

So a list selector with more than one entry, applied to an object, can produce a nodelist whose sublists are in different orders. This happens when each list entry is used in turn to select a nodelist from the object, since each entry may iterate over the object in a different order.

For example, the JSONPath $[?1==1, ?1==1] applied to the value:

{ "a" : 1, "b" : 2}

could produce the nodelist:

`1`
`2`
`2`
`1`
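To make the effect concrete, here is a rough Python simulation. Python dicts iterate in insertion order, so the sketch stands in for an unordered map by shuffling the member values on every iteration. That shuffle, the seeded random generator, and the function names are assumptions made for illustration, not anything the draft specifies. Each always-true filter entry is treated as selecting every member of the object.

```python
import random


def iterate_object(obj, rng):
    """Return the member values in an arbitrary order.

    The shuffle stands in for a map type with unspecified iteration order.
    """
    values = list(obj.values())
    rng.shuffle(values)
    return values


def two_entry_list_selector(obj, rng):
    """Sketch of a list with two select-everything entries.

    Each entry iterates over the object independently, so the two
    sublists may come out in different orders.
    """
    return iterate_object(obj, rng) + iterate_object(obj, rng)


rng = random.Random(0)  # seeded only so the sketch is repeatable
outcomes = {
    tuple(two_entry_list_selector({"a": 1, "b": 2}, rng))
    for _ in range(200)
}
```

Every element of outcomes is made of two halves that are each a permutation of 1 and 2, and nothing forces the two halves to agree.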

Descendant selector ordering

The descendant selectors can produce similar effects, both in the ordering of the descendant nodelists and in the list form with examples similar to that above. For example, the JSONPath $[0, 0]..[*] applied to the value:

[{"a": 1, "b": 2}]

can produce the nodelist:

`1`
`2`
`2`
`1`

Non-determinism of repeated runs

Repeated applications of a JSONPath to a value can result in nodelists which are ordered differently. For example, the JSONPath $[*] applied to the value:

{"a": 1, "b": 2}

could produce the nodelist:

`1`
`2`

on one occasion and the nodelist:

`2`
`1`

on a subsequent occasion.

Wildcard and filter selector ordering

Currently, there is no text to define the order of the nodelist resulting from a dot wildcard, index wildcard, or filter selector. The order should preserve array order, but be non-deterministic for objects.
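One way to read that rule is as a check on a candidate nodelist: for an array the nodelist must match the elements in order, and for an object it only has to contain the same member values in any order. A small Python sketch of that check follows. The function names are hypothetical, and comparing values through their canonical JSON text is just one convenient assumption for making them hashable.

```python
import json
from collections import Counter


def canonical(value):
    """Canonical JSON text, so nested values can be compared and counted."""
    return json.dumps(value, sort_keys=True)


def is_permitted_wildcard_result(node, nodelist):
    """Check a candidate nodelist for a wildcard applied to `node`."""
    actual = [canonical(v) for v in nodelist]
    if isinstance(node, list):
        # Arrays: order must be preserved exactly.
        return actual == [canonical(v) for v in node]
    if isinstance(node, dict):
        # Objects: any order is fine, but the same values must be present.
        return Counter(actual) == Counter(canonical(v) for v in node.values())
    # Primitives have no children.
    return actual == []


array_in_order = is_permitted_wildcard_result([5, 3], [5, 3])
array_reordered = is_permitted_wildcard_result([5, 3], [3, 5])
object_reordered = is_permitted_wildcard_result({"a": 1, "b": 2}, [2, 1])
```

The sketch covers only the wildcard case. A filter selector would apply the same ordering rule to whichever children pass the filter.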

gregsdennis commented 1 year ago

> Currently, the JSONPath spec is insufficiently non-deterministic

Clarification: Do you mean that the spec should be more non-deterministic?

I expect that you mean you want more determinism.

gregsdennis commented 1 year ago

> Repeated applications of a JSONPath to a value can result in nodelists which are ordered differently. For example, the JSONPath $[*] applied to the value:
>
> {"a": 1, "b": 2}
>
> could produce the nodelist:
>
> `1`
> `2`
> `2`
> `1`

How would you get 4 results from this? I'd expect the output to be just 1 and 2 (in any order).

gregsdennis commented 1 year ago

I agree with the sentiment here. The unorderedness of objects in general is likely going to be an issue for testing.

glyn commented 1 year ago

> > Currently, the JSONPath spec is insufficiently non-deterministic
>
> Clarification: Do you mean that the spec should be more non-deterministic?
>
> I expect that you mean you want more determinism.

No. I want less determinism to allow sufficient freedom for implementations.

glyn commented 1 year ago

> How would you get 4 results from this? I'd expect the output to be just 1 and 2 (in any order).

Oops. Of course. Fixed above.

gregsdennis commented 1 year ago

> I want less determinism to allow sufficient freedom for implementations.

Should we avoid specifying the order of results at all then? Or, more directly, explicitly state that the order of elements in a nodelist may be determined by the implementation and, because of non-determinism among various JSON implementations, is not subject to consistency?

glyn commented 1 year ago

> > I want less determinism to allow sufficient freedom for implementations.
>
> Should we avoid specifying the order of results at all then?

No, that would be going much too far. For example, it's important that arrays still produce array-ordered nodelists.

> Or, more directly, explicitly state that the order of elements in a nodelist may be determined by the implementation and, because of non-determinism among various JSON implementations, is not subject to consistency?

I agree with that statement, but I wouldn't want to litter the spec with that in multiple places. I'd prefer to use similar language to the current draft, e.g.:

This definition does not stipulate the order in which the children of an object appear, since JSON objects are unordered.

glyn commented 1 year ago

As part of addressing this issue, if PR #258 is merged, we should also define the ordering of resultant nodelists from child and descendant selectors.

timbray commented 1 year ago

I think the spec should be specific and say that when the nodelist is constructed from members of objects, its ordering is undefined, because the ordering is not defined for objects.

Let's take this issue up tomorrow.

glyn commented 1 year ago

At the 2022-09-27 interim, there was general agreement to address this issue. There is a need for more extensive examples, especially showing multi-layered cases. There is a concern about how to make a test suite tractable given the level of non-determinism which is being allowed by the spec.
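As a possible starting point for a multi-layered example, the Python sketch below enumerates every nodelist that $[*][*] could produce under one assumed evaluation model: each wildcard step picks an arbitrary order for an object's members and keeps array order. The model, the sample value, and the names child_orders and wildcard_results are illustrative assumptions, not text from the draft. A test could then accept any nodelist in the enumerated set.

```python
from itertools import permutations, product


def child_orders(node):
    """All child orderings this model permits for a single node."""
    if isinstance(node, dict):
        # Objects: any permutation of the member values.
        return [list(p) for p in permutations(node.values())]
    if isinstance(node, list):
        # Arrays: array order only.
        return [list(node)]
    return [[]]  # primitives have no children


def wildcard_results(nodelists):
    """Apply [*] to each candidate input nodelist; collect all outputs."""
    results = []
    for nodelist in nodelists:
        # Choose an ordering independently for every node in the nodelist.
        for choice in product(*(child_orders(n) for n in nodelist)):
            flat = [c for kids in choice for c in kids]
            if flat not in results:
                results.append(flat)
    return results


value = {"a": [1, 2], "b": {"c": 3}}
layer1 = wildcard_results([[value]])  # permitted results of $[*]
layer2 = wildcard_results(layer1)     # permitted results of $[*][*]
```

For this value the enumeration gives two permitted results for $[*][*], [1, 2, 3] and [3, 1, 2]. The array elements 1 and 2 stay in array order in both, while their position relative to 3 depends on how the outer object was iterated. The number of permitted results grows quickly with larger objects, which is the tractability concern raised at the interim.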

The plan is to attack this issue after #258 lands.