ietf-wg-jsonpath / draft-ietf-jsonpath-base

Development of a JSONPath internet draft
https://ietf-wg-jsonpath.github.io/draft-ietf-jsonpath-base/
Other
58 stars 20 forks source link

Apply the same rules to function expressions as to non-function expressions #412

Closed glyn closed 1 year ago

glyn commented 1 year ago

By consistently applying the same rules for non-function expressions to function expressions, implicit type conversion (or at least such language) of function expressions and type conversion functions are not needed.

The rules are:

This change increases referential transparency in that certain kinds of expressions with the same result may be used interchangeably.

Fixes https://github.com/ietf-wg-jsonpath/draft-ietf-jsonpath-base/issues/404 Fixes https://github.com/ietf-wg-jsonpath/draft-ietf-jsonpath-base/issues/405

gregsdennis commented 1 year ago

The rules are:

  • Non-emptiness of nodelists constitutes an existence test in a test expression.
  • A single-node nodelist represents the value of its node when used in comparisons.

I would say these rules define conversions. I don't see á difference except that you're not using the word "conversion."

gregsdennis commented 1 year ago

Also, kudos for the branch name.

glyn commented 1 year ago

The rules are:

  • Non-emptiness of nodelists constitutes an existence test in a test expression.
  • A single-node nodelist represents the value of its node when used in comparisons.

I would say these rules define conversions. I don't see á difference except that you're not using the word "conversion."

Ok, I think implementation considerations may be confusing things, but let's look at the following text, for example:

When a path or function expression resulting in an empty nodelist appears on either side of a comparison:

  • a comparison using the operator == yields true if and only if the comparison is between two expressions, each of which is a path or a function expression, and each of which result in an empty nodelist.

This defines the behaviour of a comparison involving at least one empty nodelist directly. There is no conversion necessarily involved, although an implementation could perform a conversion if that produces the same behaviour.

Essentially, it says that comparisons of the form <empty nodelist> == ... or ... == <empty nodelist> yield true if and only if both sides are <empty nodelist>.

glyn commented 1 year ago

The rules are:

  • Non-emptiness of nodelists constitutes an existence test in a test expression.
  • A single-node nodelist represents the value of its node when used in comparisons.

I would say these rules define conversions. I don't see á difference except that you're not using the word "conversion."

Please see my other comment on this.

cabo commented 1 year ago

Essentially, it says that comparisons of the form <empty nodelist> == ... or ... == <empty nodelist> yield true if and only if both sides are <empty nodelist>.

You never get to compare nodelists. You always compare comparables, which are ValueType. These have "Nothing", not "empty nodelist".

gregsdennis commented 1 year ago

You never get to compare nodelists. You always compare comparables, which are ValueType. These have "Nothing", not "empty nodelist".

The simple solution is to make the empty nodelist and the single nodelist comparable...

cabo commented 1 year ago

On 1. Mar 2023, at 09:35, Glyn Normington @.***> wrote:

We don't need to define exist() and value() now because the pressing use cases for these functions have been dispensed with by this PR.

The problem is that not having these defined upfront makes any design of a function extension likely to contort around not having them.

Grüße, Carsten

glyn commented 1 year ago

You never get to compare nodelists. You always compare comparables, which are ValueType. These have "Nothing", not "empty nodelist".

The simple solution is to make the empty nodelist and the single nodelist comparable...

I think that confuses syntax and semantics. We don't have nodelist syntax.

gregsdennis commented 1 year ago

The simple solution is to make the empty nodelist and the single nodelist comparable...

I think that confuses syntax and semantics. We don't have nodelist syntax.

Okay. Said a different way, we make the singular path comparable.

gregsdennis commented 1 year ago

ugh... I didn't want to push to this branch!

glyn commented 1 year ago

@gregsdennis Now that PR #418 is merged here, would you like to approve this PR?

gregsdennis commented 1 year ago

I'll have a look in the morning, but I think it'd be a good idea for @cabo to do the same and ensure his questions have been addressed.

glyn commented 1 year ago

I'll have a look in the morning, but I think it'd be a good idea for @cabo to do the same and ensure his questions have been addressed.

I think @cabo may prefer https://github.com/ietf-wg-jsonpath/draft-ietf-jsonpath-base/pull/420 to this PR, so please could you review that one instead. Meanwhile, I'm going to set this PR to draft until we've decided the outcome of #420.

glyn commented 1 year ago

I think it is now clear that adding value() is superior to adding SingleNodeType. (Value is not a "conversion function", but one extraction function out of several that I could come up with to make

@..color == "red"

useful. So it solved that longstanding problem, and it provides at least one way to complete the graph of the type system.)

I agree. The extra complexity introduced by SingleNodeType isn't justified since we can address issue #404 straightforwardly using the value() function. I've also realised that there isn't a usecase for obtaining a single node where we don't immediately need the value of that node, so SingleNodeType isn't particularly useful on its own.

gregsdennis commented 1 year ago

I think it is now clear that adding value() is superior to adding SingleNodeType.

It is not clear. @..color == "red" is not useful or meaningful. The left produces multiple values, and the right is one value. They are not comparable, and it should be invalid.

If you want to separately carry on and define value() to give meaning to value(@..color) == "red", sure, but it should be unrelated to the issue that this is trying to solve.

gregsdennis commented 1 year ago

I've also realised that there isn't a usecase for obtaining a single node where we don't immediately need the value of that node, so SingleNodeType isn't particularly useful on its own. - @glyn

SingleNodeType is the only way to allow a function to be used both in a test expression and a comparison. You were correct to introduce it. If we make match() and search() return SingleNodeType, then both of these can be valid and work as expected:

$[?match(@.timezone, 'Europe/.*')]
$[?match(@.timezone, 'Europe/.*')==@.a]

If the match is found, it returns a single nodelist containing the value; otherwise it returns an empty nodelist.

I highly favor this, and this is the thing we were originally trying to solve anyway.

--

value() only requires the user to be explicit about the conversion. The problems with having multiple-nodelists in comparisons still exist, and I shouldn't have put that in the text in the first place. Doing so doesn't make sense.

If you still want to have a value() function, that's fine, but it needs to be separate from and not a replacement for SingleNodeType.

glyn commented 1 year ago

I've also realised that there isn't a usecase for obtaining a single node where we don't immediately need the value of that node, so SingleNodeType isn't particularly useful on its own. - @glyn

SingleNodeType is the only way to allow a function to be used both in a test expression and a comparison.

Allowing a function to be used in both a test expression and a comparison isn't a requirement.

You were correct to introduce it. If we make match() and search() return SingleNodeType, then both of these can be valid and work as expected:

$[?match(@.timezone, 'Europe/.*')]
$[?match(@.timezone, 'Europe/.*')==@.a]

If the match is found, it returns a single nodelist containing the value; otherwise it returns an empty nodelist.

* In a test expression, a found match evaluates to true, and no match evaluates to false.

* In a comparison, a found match evaluates to the matched value, and no match evaluates to `Nothing`.

I highly favor this, and this is the thing we were originally trying to solve anyway.

What we were originally trying to solve was that match() and search() could not be used in test expressions, which is the natural way to use them. Once we decided that these should return LogicalType there was no need for these functions to be used in comparisons. (Any desired comparison involving one of these functions has an equivalent form in which the function is used as a test expression.)

--

value() only requires the user to be explicit about the conversion. The problems with having multiple-nodelists in comparisons still exist, and I shouldn't have put that in the text in the first place. Doing so doesn't make sense.

If you still want to have a value() function, that's fine, but it needs to be separate from and not a replacement for SingleNodeType.

The spec of PR #420 is cleaner and more concise than that of PR #410, #412, or #414. SingularNodeType isn't needed and the value() function nicely encapsulates, and makes explicit the usage of, the potentially surprising behaviour that is the subject of issue #404. In my opinion PR #420 is the best solution to that issue.

gregsdennis commented 1 year ago

Once we decided that these should return LogicalType there was no need for these functions to be used in comparisons.

Any desired comparison involving one of these functions has an equivalent form in which the function is used as a test expression.

I argue that $[?match(@.timezone, 'Europe/.*')==@.a] is a legitimate use case that we had to leave behind with my other changes, and it can't be represented in a test expression. Adding SingleNodeType re-enables this. It's a good change.

SingularNodeType isn't needed and the value() function nicely encapsulates, and makes explicit the usage of, the potentially surprising behaviour that is the subject of issue https://github.com/ietf-wg-jsonpath/draft-ietf-jsonpath-base/issues/404.

Without SingleNodeType, one would have to wrap any nodelist function, even those which will only ever return a single node in value().

I'm arguing that we still need SingleNodeType, even if we adopt value().

gregsdennis commented 1 year ago

Allowing a function to be used in both a test expression and a comparison isn't a requirement.

Having functions at all isn't a requirement. What's your point?

glyn commented 1 year ago

Closing. See https://github.com/ietf-wg-jsonpath/draft-ietf-jsonpath-base/issues/404#issuecomment-1465175176 for the rationale behind this.