The terms "instance" and "abstract"

timbray commented 1 year ago

I am working my way through 2.6.1 and my progress is being hampered by the use of the undefined terms "instance" and "abstract".

I'm trying to understand this: "A type is a set of instances." The difficulty is increased when I run into undefined notations such as Value(v).

I think you're trying to say something like "A type is a construct that constrains what can appear as an argument to a function and as the result of evaluating a function."

I'm not sure Table 13 is helpful at all; the bullet list seems to contain all the useful information without relying on mysterious notations. And I'm not sure you need to use the words "abstract" or "abstraction" at all. For example, is the following clearer?

"ValueType" means either a JSON value or Nothing.

"LogicalType" means the result of a logical-expr. Its possible values, LogicalTrue and LogicalFalse, are not related to the JSON literals true and false and have no direct syntactical representation in JSONPath.

"NodesType" means the results of evaluating a filter-query (which appears in a test expression or as a function argument). Members of NodesType have no direct syntactical representation in JSONPath.

glyn commented 1 year ago

As you point out, this is tricky language for readers unfamiliar with abstract data types. It would be good to find wording which presumes less background knowledge.

I have three concerns over the rewording you propose: the loss of abstraction, whether the resultant types are well defined, and the impact on the spec.

Loss of abstraction

The types are intended to be abstract: each JSONPath implementation is free to choose its own concrete representation of each type.

Let's take ValueType as an example. As you observe, it's meant to cover two possibilities. Either it corresponds to a JSON value or to the abstract value Nothing. The value Nothing doesn't correspond to any JSON value.

So an implementation in C could represent a ValueType as a pointer to a JSON value (possibly a struct) with the null pointer representing Nothing. On the other hand, an implementation in Haskell could use a data declaration:

data ValueType = Value JSONValue | Nothing

Note that in both these implementations, ValueType is distinct from JSON values: a JSON value is not valid as a ValueType.
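
To make the point about distinctness concrete, here is a minimal sketch of a third possible representation, this time in Python. It is illustrative only and not taken from the spec: the names `NothingType`, `NOTHING`, `Value`, and `lookup` are hypothetical, and JSON null is assumed to be decoded as Python `None`.

```python
from dataclasses import dataclass
from typing import Any, Union


class NothingType:
    """Hypothetical sentinel for the abstract value Nothing (no JSON value at all)."""

    _instance = None

    def __new__(cls):
        # Keep one shared instance so identity checks (`is NOTHING`) are reliable.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __repr__(self) -> str:
        return "Nothing"


NOTHING = NothingType()


@dataclass(frozen=True)
class Value:
    """Wraps a decoded JSON value; JSON null is assumed to be Python None."""
    json: Any


# A ValueType is either a wrapped JSON value or Nothing, never a bare JSON value.
ValueType = Union[Value, NothingType]


def lookup(obj: dict, key: str) -> ValueType:
    """Return the member value wrapped in Value, or NOTHING if the key is absent."""
    if key in obj:
        return Value(obj[key])
    return NOTHING
```

With this sketch, `lookup({"a": None}, "a")` yields `Value(None)` (a JSON null), while `lookup({"a": None}, "b")` yields `NOTHING`, so the two cases stay distinguishable.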

If we removed the "abstract" language, we'd need to ensure that these kinds of implementations, and others, are still acceptable.

Well definedness of types

The spec doesn't define a type system for all JSONPath subexpressions. In particular, logical-expr does not have a type, so it's not clear how we define LogicalType as a type in terms of the results of a logical-expr.

Similarly filter-query doesn't have a type, so the same concern applies to defining NodesType as the result of a filter-query.

Impact on the spec

If we were to define LogicalType as the result of a logical-expr, then we'd presumably need to allow any logical-expr as an argument to a parameter of type LogicalType - something the current well-typedness rules disallow. We may decide this is a desirable spec change, but it is a change.

gregsdennis commented 1 year ago

then we'd presumably need to allow any logical-expr as an argument to a parameter of type LogicalType

For example, currently myfunc(@.a==4) isn't allowed.

timbray commented 1 year ago

As you point out, this is tricky language for readers unfamiliar with abstract data types. It would be good to find wording which presumes less background knowledge.

I think this is not just good, but essential. I really don't think it's OK to introduce this terminology without defining what it means.

Let's take ValueType as an example. As you observe, it's meant to cover two possibilities. Either it corresponds to a JSON value or to the abstract value Nothing. The value Nothing doesn't correspond to any JSON value.

So an implementation in C could represent a ValueType as a pointer to a JSON value (possibly a struct) with the null pointer representing Nothing. On the other hand, an implementation in Haskell could use a data declaration:

In Table 14, you have an excellent explanation of what Nothing means. I think it'd be valuable to hoist that language up near the top of the Type System section, and then when you say "JSON value or Nothing" I think that's all an implementer needs to be told. I can think, in Java or C or Go, of a variety of ways I might represent Nothing, and I can't imagine that spec language getting in my way. Actually, I went and looked again and I'm increasingly convinced that your useful definition of Nothing needs to appear in that section, before you start using it.

If we removed the "abstract" language, we'd need to ensure that these kinds of implementations, and others, are still acceptable.

Well, their acceptability is surely to be judged by whether their behaviour conforms to the spec. So the spec's job is to define behaviour with sufficient clarity to establish that. It's not obvious to me that the (unspecified) formalisms you're using are required to do that.

Well definedness of types

The spec doesn't define a type system for all JSONPath subexpressions. In particular, logical-expr does not have a type, so it's not clear how we define LogicalType as a type in terms of the results of a logical-expr.

Fair enough; I suggest that all you need to say is that LogicalType can be either LogicalTrue or LogicalFalse, which are interpreted in the usual way for booleans but are distinct from the JSON true / false values. What is "abstract" about this?
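
As a rough illustration of that suggestion (a sketch under assumptions, not spec text), the two values could be modelled in Python as an enumeration. The names `LogicalType`, `from_comparison`, `logical_and`, and `logical_not` are hypothetical.

```python
from enum import Enum


class LogicalType(Enum):
    """Hypothetical representation of the two LogicalType values."""
    LOGICAL_TRUE = "LogicalTrue"
    LOGICAL_FALSE = "LogicalFalse"


def from_comparison(result: bool) -> LogicalType:
    """Convert the outcome of a host-language comparison into a LogicalType."""
    return LogicalType.LOGICAL_TRUE if result else LogicalType.LOGICAL_FALSE


def logical_and(a: LogicalType, b: LogicalType) -> LogicalType:
    """Usual boolean conjunction, expressed over LogicalType values."""
    both_true = a is LogicalType.LOGICAL_TRUE and b is LogicalType.LOGICAL_TRUE
    return from_comparison(both_true)


def logical_not(a: LogicalType) -> LogicalType:
    """Usual boolean negation, expressed over LogicalType values."""
    return from_comparison(a is LogicalType.LOGICAL_FALSE)
```

In this sketch `LogicalType.LOGICAL_TRUE == True` is false, which mirrors the intended separation from the JSON literals. Note that plain `Enum` members are all truthy in Python, so the helpers compare with `is` rather than relying on truthiness.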

Similarly filter-query doesn't have a type, so the same concern applies to defining NodesType as the result of a filter-query.

Draft currently says "NodesType is an abstraction of a filter-query". I'm sorry, I don't understand that. What does "abstraction" mean? My guess was that you meant "what a filter-query returns" and that made sense when I ran some examples in my head but I guess I got it wrong. Hmm, evaluating a filter-query generates a NodeList, no? Is the constraint that the argument/result must be a NodeList (something that a developer who's read this far into the spec is certainly going to know how to deal with)?
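
To show what I have in my head (and I may well be wrong about the draft's intent), here is a small Python sketch of the "argument must be a NodeList" reading. Everything in it is an assumption for illustration: a node is modelled as a (location, value) pair, `children` is a crude stand-in for evaluating a query such as `@.*`, and `count` is a hypothetical function that takes a nodelist argument.

```python
from typing import Any, List, Tuple

# Hypothetical shapes: a node is a (location, value) pair; a nodelist is a list of nodes.
Node = Tuple[str, Any]
NodeList = List[Node]


def children(location: str, value: Any) -> NodeList:
    """Crude stand-in for evaluating a query like `@.*` against a single node."""
    if isinstance(value, dict):
        return [(f"{location}['{k}']", v) for k, v in value.items()]
    if isinstance(value, list):
        return [(f"{location}[{i}]", v) for i, v in enumerate(value)]
    return []  # primitives have no children, so the nodelist is empty


def count(nodes: NodeList) -> int:
    """A function whose parameter accepts a nodelist: returns the number of nodes."""
    return len(nodes)
```

Under this reading, `count(children("$", [10, 20, 30]))` is 3, and an empty nodelist is still a perfectly good argument.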

glyn commented 1 year ago

@timbray I agree that it's essential to find wording which presumes less background knowledge, particularly of Abstract Data Types (you may be interested in following this link if you'd like to understand my usage of the term abstraction). I'll have a shot at a PR tomorrow.

gregsdennis commented 1 year ago

FWIW, in JSON Schema, "instance" is the term we use for the subject data (the data being validated). For us, that's the data being searched.

glyn commented 1 year ago

FWIW, in JSON Schema, "instance" is the term we use for the subject data (the data being validated). For us, that's the data being searched.

It's hard to avoid overlap with other specs!

(I'd prefer to talk about "values", rather than instances, of types, but that's a loaded term in the JSONPath spec.)

gregsdennis commented 1 year ago

I'm not worried about overlap. I was just offering another usage of the term.

cabo commented 1 year ago

(I'd prefer to talk about "values", rather than instances, of types, but that's a loaded term in the JSONPath spec.)

Yep, the JSON vocabulary often gets in the way. We used some of this experience when defining the vocabulary for CBOR; e.g. "data item" for what is "value" in JSON, and "map" for what is "object" in JSON.

ietf-wg-jsonpath / draft-ietf-jsonpath-base