ietf-wg-jsonpath / draft-ietf-jsonpath-base

Development of a JSONPath internet draft
https://ietf-wg-jsonpath.github.io/draft-ietf-jsonpath-base/

"Index-selection query" in bracket notation #93

Open gregsdennis opened 3 years ago

gregsdennis commented 3 years ago

This discussion branches off of #88 primarily but the idea has also been discussed elsewhere (#17 & #57).

This issue covers the parenthetical bracket notation without the ?, e.g. $[(<expr>)]. Specifically, what is <expr> and what does this syntax do?

There have been many references to confusion and questions of usefulness regarding this syntax. But I think this is a syntax that is underutilized due to lack of understanding.

Explanation by way of JSON Schema

In JSON Schema, there have been many discussions around the idea of comparing instance data against other instance data. You can perform a search for "$data" on the schema spec site to read up on this, but I'll summarize.

Suppose we have these instances:

// valid
{ "value": 5, "lower-bound": 3 }

// invalid
{ "value": 5, "lower-bound": 10 }

With Schema in its current state, we can't validate these. That's what the proposed $data keyword was supposed to solve. It would provide a mechanism to reference other data for use in other keywords. In this example, it could use the value in lower-bound for the minimum keyword.

It was never adopted, however, because no one could figure out how to make it work within the existing mechanisms of JSON Schema. (Beside the point.)

(It's also similar to how discriminator works in the OpenAPI 3.1 specification.)

Back to JSON Path

I think this operation is quite useful in that it allows the path author to use data within the Argument.

Let's look at an example path: $.x[($.id)].

This, to me, says: select from $.x using the value at $.id as the index. So long as that value is a valid index, this should evaluate. We have different cases.

(Sorry, I can't make a pretty table as I'm on mobile.)

Maybe even this could be supported, though we run into "interpretation" issues of converting a string into something more meaningful to us.

I don't think that it makes sense for $.id to contain a union, only a single index.
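To make the $.x[($.id)] idea concrete, here is a minimal Python sketch of one possible reading. The function name and its parameters are hypothetical, and the handling of out-of-range or non-index values (returning an empty list) is an assumption on my part, not something any draft defines.

```python
from typing import Any, List


def select_by_referenced_index(root: Any, container_key: str, index_key: str) -> List[Any]:
    """Hypothetical evaluation of $.<container_key>[($.<index_key>)]."""
    if not isinstance(root, dict):
        return []
    container = root.get(container_key)
    index = root.get(index_key)
    # A number can only index an array. bool is excluded because it is
    # a subclass of int in Python but is not a JSON number.
    if isinstance(container, list) and isinstance(index, int) and not isinstance(index, bool):
        # Assumption: only non-negative, in-range indices select anything.
        if 0 <= index < len(container):
            return [container[index]]
        return []
    # A string can only name a member of an object.
    if isinstance(container, dict) and isinstance(index, str):
        return [container[index]] if index in container else []
    # Any other combination selects nothing.
    return []
```

With {"id": 1, "x": ["a", "b", "c"]} this selects "b"; with {"id": "k", "x": {"k": 7}} it selects 7; a mismatch such as a string id against an array selects nothing.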

I think that if people understood this functionality better, it might get more usage. Path authors could devise some fairly complex queries that couldn't otherwise be built. I think this syntax with this function is worth including.

Note that I'm also setting aside the more common $[(@.length-1)] because we don't have a clear direction on "inherent" properties or functions. As I've defined it here, this expression would look for a length key and expect a number as its value then subtract one from that to get the index to select.


You may have noticed that I didn't put a @ in that path above. This is because it results in a narrower set of cases. I'd like to explore that now to prove my point.

$.x[@.id] would look for an id key in the value at $.x, which means that it would expect $.x to be an object. It doesn't make sense for the value at $.x.id to be a number because objects can't index this way.

However if $.x.id is a string, then we could select the value of that key under $.x without any problem.
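The @ variant can be sketched the same way. This is only an illustration under my own assumptions: the function name is made up, and "select nothing" for the cases that don't make sense is my reading rather than settled behavior.

```python
from typing import Any, List


def select_by_own_member(node: Any, ref_key: str) -> List[Any]:
    """Hypothetical evaluation of [(@.<ref_key>)] against `node`."""
    # @.<ref_key> only resolves when the current node is an object.
    if not isinstance(node, dict):
        return []
    ref = node.get(ref_key)
    # Objects are keyed by strings, so a numeric reference cannot select.
    if isinstance(ref, str) and ref in node:
        return [node[ref]]
    return []
```

For the value {"id": "name", "name": "widget"} at $.x, this selects "widget"; if id held a number, nothing would be selected.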

goessner commented 3 years ago

https://github.com/ietf-wg-jsonpath/draft-ietf-jsonpath-base/issues/92#issuecomment-809983852 is meant to be an answer here also ...

gregsdennis commented 3 years ago

$[($..key[($..key[($..key[])])])] - @goessner #92

I'm not sure that this really is a concern for the spec. We are deciding function, and this syntax adds functionality that can't otherwise be achieved.

As noted in #25, I think some concerns are worth mentioning, but ultimately it's up to the implementation to decide what's best for the environment in which it runs.

gregsdennis commented 3 years ago

https://github.com/ietf-wg-jsonpath/draft-ietf-jsonpath-base/issues/92#issuecomment-809983852 is meant to be an answer here also ...

Your answer doesn't address the functionality I laid out in the opening comment.

goessner commented 3 years ago

@gregsdennis: After reading your comment above again, I cannot see another striking use case besides "internal referencing", which I also commented on in https://github.com/ietf-wg-jsonpath/draft-ietf-jsonpath-base/issues/92#issuecomment-809983852.

Regarding ...

// invalid
{ "value": 5, "lower-bound": 10 }

... JSONPath is meant for selection, not for validation and plausibility checks.

If we once allow the use of $ and @ with [()] we will have everything, which will be allowed in filters then ...

... but ...

and from my point of view for the single use case of "internal referencing". Long-time JSONPath users should decide now about the significance of this or possibly other use cases.

gregsdennis commented 3 years ago

JSONPath is meant for selection, not for validation and plausibility checks.

I wasn't saying that Path is for validation. I'm using an example from the JSON Schema spec work (where I've been very active for the past seven years) to illustrate the usefulness of a similar feature here, where we select data. I'm fully aware of what JSON Path does.

only number or string indices must result in order to make sense.

Yes. That's okay. If the value isn't a number or a string, then the result is an empty nodelist. Authors will understand the limitations here.

@ will have another meaning compared to filters. This will confuse users.

No, it won't confuse users.

This is okay, too, since it has a different context. The meaning is similar enough between the two contexts that it makes sense to reuse the operator.

In the context of [?()], which iterates over the children of the current node, the @ represents each child.

In the context of [()], there is no iteration of the current node, so the @ can only mean the current node itself.
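The two bindings of @ can be contrasted in a small Python sketch. The helper names are hypothetical, and callables stand in for the expression text; this shows the distinction as I'm describing it, not how any implementation does it.

```python
from typing import Any, Callable, List


def children(node: Any) -> List[Any]:
    """Child values of an object or array; scalars have none."""
    if isinstance(node, dict):
        return list(node.values())
    if isinstance(node, list):
        return list(node)
    return []


def filter_select(node: Any, predicate: Callable[[Any], bool]) -> List[Any]:
    """[?(...)]: the expression runs once per child, with @ bound to that child."""
    return [child for child in children(node) if predicate(child)]


def index_select(node: Any, expr: Callable[[Any], Any]) -> List[Any]:
    """[(...)]: the expression runs once, with @ bound to the current node itself."""
    key = expr(node)
    if isinstance(node, dict) and isinstance(key, str) and key in node:
        return [node[key]]
    if (isinstance(node, list) and isinstance(key, int)
            and not isinstance(key, bool) and 0 <= key < len(node)):
        return [node[key]]
    # Assumption: a result that is not a usable key or index selects nothing.
    return []
```

Given node = {"id": "b", "a": {"n": 1}, "b": {"n": 2}}, a filter testing @.n == 2 visits each child in turn, while an index expression @.id is evaluated once against node and yields the key "b". Both end up selecting {"n": 2}, but by different routes.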

I cannot see another striking use case beside "internal referencing"

That's it. The entire feature. There's nothing more to it. But it opens up selection mechanisms that can't otherwise be achieved.

If we once allow the use of $ and @ with [()] we will have everything

Yes. And we want everything. I want an author to be able to say "select the node before the index indicated by the value in $.foo." That would be $[($.foo-1)]. If $.foo doesn't contain a number, then the expression doesn't make sense and no nodes are returned.


We agree that $[(@.length-1)] is useless and redundant.

This proposal redefines and expands the [()] syntax.

If it makes you feel better, we can have some other symbol to signify this, like #, e.g. $[#()]. But I think this is an important capability that authors would love to have.

glyn commented 3 years ago

I don't think that it makes sense for $.id to contain a union, only a single index.

Why not? I agree it might not be very useful, but to disallow it seems a bit arbitrary.

goessner commented 3 years ago

We agree that $[(@.length-1)] is useless and redundant.

Possibly it isn't. Users may request to get the array index in the middle, say by ...

arr[(floor(length(@)/2))]

Index arithmetic can get quite complex rapidly and may include string concatenation to dynamically define object member names.
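As an illustration only, the middle-element expression above corresponds to something like the following Python sketch. The function name is hypothetical, and returning nothing for an empty or non-array value is an assumption.

```python
from typing import Any, List


def select_middle(arr: Any) -> List[Any]:
    """Hypothetical evaluation of arr[(floor(length(@)/2))]."""
    if not isinstance(arr, list) or not arr:
        return []
    # Integer division gives floor(length / 2) for non-negative lengths.
    return [arr[len(arr) // 2]]
```

For [10, 20, 30] the index is 1, and for [1, 2, 3, 4] the index is 2, so an even-length array yields the upper of its two middle elements.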

Why not start simple (from the users' point of view) and add this complex stuff later in JSONPath 2.0, when such use cases come up?

gregsdennis commented 3 years ago

I don't think that it makes sense for $.id to contain a union, only a single index.

Why not? I agree it might not be very useful, but to disallow it seems a bit arbitrary.

The idea is to allow this format as an index which can be further combined with other indices in a union, e.g. $[(@.id),'foo'].

It doesn't make sense to me to have an index be part of a union when that index can itself be a union.

goessner commented 3 years ago

As a summary, the following can go into a bracket selector

Note that @ within a bracket selector always addresses the current JSON value.

It is still to be defined what types of bracket selector expression results (expr($,@)) are acceptable. Possible types are

All valid bracket selectors can be used in unions.

glyn commented 3 years ago

I don't think that it makes sense for $.id to contain a union, only a single index.

Why not? I agree it might not be very useful, but to disallow it seems a bit arbitrary.

The idea is to allow this format as an index which can be further combined with other indices in a union, e.g. $[(@.id),'foo'].

It doesn't make sense to me to have an index be part of a union when that index can itself be a union.

If we replaced the index expression with its value, then unions could sensibly be part of another union. E.g. in $[(@.id),'foo'], if (@.id) was the union 'bar','baz', then the larger union would be $['bar','baz','foo'].
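The substitution described above could be sketched in Python roughly as follows. This is an assumption-laden illustration: the function name is made up, a callable stands in for an index expression such as (@.id), and a union-valued result is represented as a list.

```python
from typing import Any, List


def expand_union(selectors: List[Any], current: Any) -> List[Any]:
    """Replace each index expression by its value, splicing unions in place."""
    expanded: List[Any] = []
    for selector in selectors:
        # A callable models an index expression evaluated against @.
        value = selector(current) if callable(selector) else selector
        if isinstance(value, (list, tuple)):
            # A union-valued expression contributes all of its members.
            expanded.extend(value)
        else:
            expanded.append(value)
    return expanded
```

With the current value {"id": ["bar", "baz"]}, expanding [(@.id), 'foo'] gives ['bar', 'baz', 'foo'], matching the larger union in the example.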

glyn commented 3 years ago

As a summary, the following can go into a bracket selector

What about filters?

goessner commented 3 years ago

Filters are special – more iterators than accessors / selectors.

Somewhere I read a proposal to possibly use other delimiters to make that visually clearer. I would like to discuss filters in a separate issue.

gregsdennis commented 3 years ago

Just listing this here as yet another example of a JSON technology using external data within its format. This bolsters the argument for us to have similar functionality, which I'm proposing for [()].

JSON Logic supports a var operator that can pull values from the input data.
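For readers unfamiliar with it, here is a simplified Python approximation of what a var lookup does. It is not the JSON Logic library's code; the function name is hypothetical, only dotted string paths are handled, and default values are left out.

```python
from typing import Any


def resolve_var(rule: dict, data: Any) -> Any:
    """Simplified stand-in for evaluating {"var": "a.b"} against `data`."""
    current = data
    for part in str(rule["var"]).split("."):
        if isinstance(current, dict) and part in current:
            current = current[part]
        elif isinstance(current, list) and part.isdigit() and int(part) < len(current):
            # Numeric path segments index into arrays.
            current = current[int(part)]
        else:
            # Assumption in this sketch: a missing path resolves to None.
            return None
    return current
```

For example, {"var": "champ.name"} applied to {"champ": {"name": "Fezzik"}} resolves to "Fezzik". That is the same kind of internal referencing I'm proposing for [()].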

cabo commented 2 years ago

We don't have expressions in indexes in -base any more. Added a revisit-after-based label so we think about this more once -base is done.