cabo commented 1 year ago

https://github.com/ietf-wg-jsonpath/draft-ietf-jsonpath-base/pull/403#issuecomment-1439581061

Followup issue from #403

cabo commented 1 year ago

If we want to know whether a function will return a singular nodelist or potentially a full one, we need to add to the declared types.

We could define explicit conversion functions from nodelists and/or the new singular nodelist type to JSON values.

So @..a == @..b would become value(@..a) == value(@..b), making it explicit that we are not comparing the nodelists.

gregsdennis commented 1 year ago

I think #405 (type conversions) needs to be explored (and probably settled) before we can rightfully address this.

gregsdennis commented 1 year ago

> We could define explicit conversion functions from nodelists and/or the new singular nodelist type to JSON values.
>
> So @..a == @..b would become value(@..a) == value(@..b), making it explicit that we are not comparing the nodelists.

I had considered this as well, but I think the explicit conversions just muddy up the syntax.

ohler55 commented 1 year ago

To me @..a == @..b could be taken a number of different ways. Either the node lists are the same or each element has an a and a b that must be equal. It could also mean some a exists that equals some b. Wouldn't it be better to use a nested selector (sorry if my terminology is incorrect) such as @..[?(@.a == @.b)]?
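To make the three readings concrete, here is a minimal Python sketch over plain dicts and lists. The helpers `descendant_values` and `objects` and the sample document are hypothetical stand-ins for `@..a` style queries, not anything taken from the draft, and the traversal order is not meant to match the spec.

```python
def descendant_values(node, name):
    """Values of every member called `name` at any depth (rough stand-in for @..name)."""
    found = []
    if isinstance(node, dict):
        for key, child in node.items():
            if key == name:
                found.append(child)
            found.extend(descendant_values(child, name))
    elif isinstance(node, list):
        for child in node:
            found.extend(descendant_values(child, name))
    return found


def objects(node):
    """Yield every object (dict) in the document, including the root."""
    if isinstance(node, dict):
        yield node
        for child in node.values():
            yield from objects(child)
    elif isinstance(node, list):
        for child in node:
            yield from objects(child)


doc = {"x": {"a": 1, "b": 1}, "y": {"a": 2, "b": 3}}  # made-up sample data
a_values = descendant_values(doc, "a")
b_values = descendant_values(doc, "b")

# Reading 1: the two lists are the same.
same_lists = a_values == b_values
# Reading 2: every object that has both an a and a b has them equal.
pairwise = all(o["a"] == o["b"] for o in objects(doc) if "a" in o and "b" in o)
# Reading 3: some a equals some b.
some_match = any(x == y for x in a_values for y in b_values)
```

For this sample document only the third reading holds, which shows how far apart the interpretations can land.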

gregsdennis commented 1 year ago

I just noticed the double-dot.

I'm not sure what you would mean by @..a == @..b. This isn't valid (enforced by ABNF), and I don't think anyone is arguing that it should be.

For @.a == @.b, I think the type conversions are obvious.

For something like distinct(@.a) == distinct(@.b), I'm not sure what to do. With the current conversions, you have three cases:

(please forgive the incorrect syntax)

both independently give no nodes or multiple nodes, implying (Nothing) == (Nothing) returning true
both give one node, meaning (Value) == (Value) returning whatever the result of that is (could be true or false)
one gives either no nodes or multiple nodes, and the other gives one node, meaning (Nothing) == (Value) returning false

I think the one we're hung up on here is the first case because (no nodes) != (several nodes) and also in most cases (several nodes) != (several nodes). Moreover, we don't have a mechanism to compare multiple-node nodelists at all.
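The three cases can be sketched in Python as follows. This is only an illustration under stated assumptions: nodelists are modelled as plain lists of JSON values, `NOTHING`, `to_value` and `compare_eq` are hypothetical names, and Python's `==` stands in for JSON value equality (it differs in corner cases such as `1 == True`).

```python
NOTHING = object()  # sentinel for "no value"; hypothetical, not from the draft


def to_value(nodelist):
    """Implicit conversion: exactly one node gives its value, anything else gives NOTHING."""
    return nodelist[0] if len(nodelist) == 1 else NOTHING


def compare_eq(left, right):
    """Compare two nodelists after converting each side to a value or NOTHING."""
    lhs, rhs = to_value(left), to_value(right)
    if lhs is NOTHING or rhs is NOTHING:
        # (Nothing) == (Nothing) is true, (Nothing) == (Value) is false.
        return lhs is NOTHING and rhs is NOTHING
    return lhs == rhs


case_one = compare_eq([], [1, 2])         # no nodes vs several nodes
case_one_again = compare_eq([1, 2], [3, 4])  # several vs several, different content
case_two = compare_eq([5], [5])           # one node each
case_three = compare_eq([1, 2], [1])      # several nodes vs one node
```

Under this model both first-case comparisons come out true even though the nodelists plainly differ, which is the surprising behavior being discussed.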

ohler55 commented 1 year ago

I see your line of thinking is to compare the nodelists for a boolean result. There are cases where a comparison is clear, namely for lists of one node or fewer. If comparison of longer lists were allowed, rather than just returning false in all cases, then I think the lists could only consistently be compared by their contained nodes in any order. The descendant (..) results could come back in any order if one of the traversed nodes was an object, which is generally considered to be unordered.
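An order-insensitive comparison of that kind might look like the sketch below. It is an assumption-laden illustration, not a proposal from the draft: nodelists are plain Python lists of JSON values, the function name `same_nodes_any_order` is made up, and values are canonicalised with `json.dumps(..., sort_keys=True)` so that objects with differently ordered members compare equal (this also treats `1` and `1.0` as different, which real JSON equality may not).

```python
import json
from collections import Counter


def same_nodes_any_order(left, right):
    """True if both nodelists hold the same values with the same multiplicities."""
    def canon(value):
        # Serialise with sorted keys so member order inside objects is ignored.
        return json.dumps(value, sort_keys=True)

    return Counter(map(canon, left)) == Counter(map(canon, right))


reordered = same_nodes_any_order([1, {"a": 2, "b": 3}], [{"b": 3, "a": 2}, 1])
different_counts = same_nodes_any_order([1, 1], [1])
```

Here `reordered` is true because only the order differs, while `different_counts` is false because duplicates are counted.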

glyn commented 1 year ago

> I think #405 (type conversions) needs to be explored (and probably settled) before we can rightfully address this.

#405 is merely editorial, so it shouldn't block this issue.

glyn commented 1 year ago

PR #420 is my preferred solution to this issue. PR 420 encapsulates the surprising behavior, which is the subject of this issue, in the value() function. The use of the value() function in queries becomes an explicit and conscious choice.

The following alternative solutions have been suggested:

PR #410 proposes that all type conversions be explicit. This adds some precision to the spec and makes the well-typedness rules very compact, but at the cost of usability. The "Conversion example" section in PR 410 highlights the potential surprises remaining in the approach of PR 410.
PR #414 builds on PR 410 by adding a SingleNodeType. This further impacts usability, as highlighted by the example query $[?value(single(fn(@.*.price))) >= 7.99]. PR 414 provides a type equivalent of a singular nodelist (the result of evaluating a Singular Path). SingleNodeType is used either as a variant of NodesType in test expressions or as a way of obtaining a value in a comparison. Unlike Singular Paths, which are a syntactic device for avoiding multi-nodelist comparisons, SingleNodeType does not appear to pull its weight in terms of specification precision vs the extra complexity/verbosity involved.
PR #412 also adds the SingleNodeType, but without the usability downsides of the type conversion functions of PRs 410/414. The above critique of the SingleNodeType applies to PR 412 too. PR 412 co-opts some implicit conversion rules in the rest of the spec into the function type system, making well-typedness more complex to define and understand.

Confusion by having functions returning nodelists in comparisons #404