ietf-wg-jsonpath / draft-ietf-jsonpath-base

Development of a JSONPath internet draft

https://ietf-wg-jsonpath.github.io/draft-ietf-jsonpath-base/

`@` as a parameter may not be immediately intuitive for `count()` #470

Closed: gregsdennis closed this issue 1 year ago

gregsdennis commented 1 year ago

None of the examples use count(@), but I wonder if we should add a note that this will always return 1. The recommended usage should be count(@.*).

I ran into this when writing some tests. I expect others will, too.

glyn commented 1 year ago

From the title of this issue, in what way does count(@) not work as expected? Is it simply when there is confusion between count() and length()?

gregsdennis commented 1 year ago

I expected count(@) to give me the number of children of the value at @ the way that length(@) gives the length of the string at @.

So for example, with the document [{"a": "ab", "b": "bc"}], I expected to get 2 but got 1 (because {"a": "ab", "b": "bc"} is itself a single value). Thinking about it, I worked it out, but it was confusing at first. Just thought path authors may benefit from a note.
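To make the distinction concrete, here is a minimal Python sketch. It uses no real JSONPath library: `current`, `children`, `count`, and `length` are hypothetical helpers that model a nodelist as a plain Python list of values, on the assumption that `count()` returns the number of nodes in a nodelist and `length()` returns the size of a string, array, or object value.

```python
# Hypothetical model (not a JSONPath implementation):
# a nodelist is represented as a Python list of JSON values.

def current(value):
    """Model of `@`: a nodelist holding exactly the current node."""
    return [value]

def children(value):
    """Model of `@.*`: a nodelist of the children of an object or array."""
    if isinstance(value, dict):
        return list(value.values())
    if isinstance(value, list):
        return list(value)
    return []  # primitive values have no children

def count(nodelist):
    """Model of count(): the number of nodes in the nodelist."""
    return len(nodelist)

def length(value):
    """Model of length(): the size of a string, array, or object value."""
    if isinstance(value, (str, list, dict)):
        return len(value)
    return None  # assumption: stands in for "no result" on other value types

document = [{"a": "ab", "b": "bc"}]
element = document[0]  # the value `@` refers to when filtering the children of `$`

print(count(current(element)))   # 1: `@` is a single node
print(count(children(element)))  # 2: `@.*` selects both members
print(length(element))           # 2: the object value has two members
```

Under this model, `count(@)` is 1 no matter what value `@` holds, while `count(@.*)` and `length(@)` both reflect the two members of the object.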

glyn commented 1 year ago

I'm not entirely convinced that a clarification is needed, but I've floated one in PR #471.

glyn commented 1 year ago

We failed to reach consensus on a fix in PR #471:

- There is nothing special about @. When count() is applied to any singular query, the result is 1. This can be unexpected if the semantics of count() have been misunderstood. So adding an example of count(@) seems odd unless some explanation is given.
- Attempts to clarify the semantics of count() resulted in redundant text.
- count(@) is of no practical use.
- The suggestion of renaming count() to remove the superficial similarity of count() and length() was not accepted.

Unless we can find another approach, this issue is going to be hard to fix.

@gregsdennis observed that someone might think that count(@) would return the number of children of @ if they have confused nodelists with arrays. The definition of nodelist says that a nodelist can be represented as a JSON array which might cause someone to think that nodelists and JSON arrays are interchangeable. If that is the root cause of this issue, perhaps some clarification of the definition is needed.

gregsdennis commented 1 year ago

> Attempts to clarify the semantics of count() resulted in redundant text.

Neither of my proposals had redundant text.

I'm beside myself that no one is taking this seriously, especially considering that I, one of the authors of this text, made this mistake. If I can do it, others definitely will.

I simply wanted a clarifying note that count(@) always produces one and that subqueries on @ were needed to get the desired result.

glyn commented 1 year ago

Editorially, it's quite awkward to add an unmotivated example which is likely to raise further questions in the minds of readers, such as "What's so special about @?".

I'm rather tied up at the moment, so perhaps someone else would care to propose a solution.

glyn commented 1 year ago

I'd also like to point out that if nodelists are represented as JSON arrays, then a nodelist consisting of a single node whose value is an array x would be represented as an array of length one containing the array x. I guess that's obvious from the spec, but could be a point of misunderstanding.
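A small Python sketch of that representation, assuming (for illustration only) that a nodelist is serialized as a JSON array of its node values. The names `x` and `nodelist` are hypothetical.

```python
import json

x = [1, 2, 3]    # an array value somewhere in the document
nodelist = [x]   # a nodelist with one node, whose value is the array x

print(json.dumps(nodelist))  # [[1, 2, 3]]
print(len(nodelist))         # 1: the nodelist contains one node
print(len(x))                # 3: the node's value has three elements
```

The outer array is the nodelist and the inner array is the node's value, so counting the nodelist and measuring the value give different answers.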

danielaparker commented 1 year ago

> There is nothing special about @. When count() is applied to any singular query, the result is 1.

Rather, when count() is applied to a query that returns exactly one value, regardless of whether or not it is a "singular query", the result is 1. A "singular query" as defined in the draft can return zero or one values, so when count() is applied to a singular query, the result could be 0. (And of course there are arbitrarily many other queries that share this property of "singular queries": they can return at most one value.)
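A rough Python sketch of the zero-or-one behaviour. `singular_query` is a hypothetical toy evaluator that handles only name and index segments and returns a nodelist modeled as a Python list; it is not taken from the draft or from any implementation.

```python
def singular_query(value, segments):
    """Toy evaluator: follow name (str) and index (int) segments from `value`.

    Returns a nodelist modeled as a list holding zero or one values.
    """
    for seg in segments:
        if isinstance(seg, str) and isinstance(value, dict) and seg in value:
            value = value[seg]
        elif (isinstance(seg, int) and not isinstance(seg, bool)
              and isinstance(value, list)
              and -len(value) <= seg < len(value)):
            value = value[seg]
        else:
            return []  # the segment selects nothing: empty nodelist
    return [value]

node = {"a": "ab", "b": "bc"}

print(len(singular_query(node, [])))     # 1: models count(@)
print(len(singular_query(node, ["a"])))  # 1: models count(@.a)
print(len(singular_query(node, ["c"])))  # 0: models count(@.c), no such member
```

With no segments the query is just `@`, which always yields one node; any further name or index segment can fail to match and produce an empty nodelist.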

> This can be unexpected if the semantics of count() have been misunderstood.

I think the main thing to note is that in all prior implementations of JSONPath, @ (and related) has been understood as a value, and no prior implementation has provided functions to retrieve information about @ (and related) understood as a "nodelist". Perhaps that could be highlighted.

Best regards, Daniel

cabo commented 1 year ago

There is currently no text that relates this specification to specific implementations. I think we want to keep it this way.

(There is also sufficient text that explains and uses "node list", so I think we are already covered anyway.)

danielaparker commented 1 year ago

> There is currently no text that relates this specification to specific implementations. I think we want to keep it this way.

Not so much specific implementations, but all implementations, without exception. Perhaps that could be noted. Or not. Up to the committee.

cabo commented 1 year ago

I don't know all implementations, and I don't know who does, so I don't think we should be making statements of this kind.

danielaparker commented 1 year ago

Nonetheless, the statement is true, with a cutoff of 2022 to rule out new implementations that may be following the draft :-) It's demonstrably true for all the ones in the Comparisons, up to that date.

Best regards, Daniel