Require function names to be registered

glyn commented 1 year ago

Without this change, it is not very clear what happens when an unregistered function name is used.

gregsdennis commented 1 year ago

Registration is orthogonal to well-typedness.

With this change implementations are disallowed from supporting custom functions. This may be a crucial feature for users who don't care about interoperability.

glyn commented 1 year ago

> Registration is orthogonal to well-typedness.
>
> With this change implementations are disallowed from supporting custom functions. This may be a crucial feature for users who don't care about interoperability.

If the spec allows implementations to support custom functions, then it's the spec that doesn't care about interoperability.

I think it's better if the spec is strict/interoperable. Implementers who want custom functions and don't care about interoperability are free to drop the requirement that function names be registered (but they mustn't claim spec compliance in doing so).

cabo commented 1 year ago

I'm not sure we are fully understanding our evolution concept here. Whatever we say, there will be a period when an implementation for a new name is out there, while the name is still being registered. Reserving a space for such non-registered names has its problems (RFC 6648). So how is an implementer supposed to do this?

glyn commented 1 year ago

I'm still trying to understand what we mean by interoperation and I'm happy to be corrected, but let me set out my current thinking.

An implementer wishing to register a function extension can use it before registration. While doing so, they are not conforming to the spec and their implementation is not interoperable. If and when the function extension is registered, the implementation can conform to the spec and be interoperable with other implementations that implement the function extension. (Admittedly, this requires implementers to take a risk when attempting to introduce a new function extension.)

Without this, wouldn't we be allowing multiple non-interoperable implementations to arise? Implementers might not bother getting round to registering their custom function extensions. This could also encourage the use of unregistered, but commonly implemented function extensions.

On Thu, 20 Apr 2023, 09:03 cabo wrote:

> I'm not sure we are fully understanding our evolution concept here. Whatever we say, there will be a period when an implementation for a new name is out there, while the name is still being registered. Reserving a space for such non-registered names has its problems (RFC 6648). So how is an implementer supposed to do this?

danielaparker commented 1 year ago

> Without this, wouldn't we be allowing multiple non-interoperable implementations to arise?

The only part of JSONPath that could ever be interoperable is what's defined in a particular version of the core spec. And hopefully there aren't many versions. It takes years for a new version to be widely supported, even for popular specs. JSONSchema still has spotty support for 2019-09 and 2020-12, most users are stuck on version 7.

> the implementation can ... be interoperable with other implementations that implement the function extension

And not interoperable with all other implementations that do not implement the function extension, which presumably would be the vast majority of them. CBOR has extensions, but there is fallback behaviour if an extension tag is not recognized. It's hard to imagine fallback behaviour if a function name is not recognized.

Daniel

glyn commented 1 year ago

> > Without this, wouldn't we be allowing multiple non-interoperable implementations to arise?
>
> The only part of JSONPath that could ever be interoperable is what's defined in a particular version of the core spec. And hopefully there aren't many versions. It takes years for a new version to be widely supported, even for popular specs. JSONSchema still has spotty support for 2019-09 and 2020-12, most users are stuck on version 7.
>
> > the implementation can ... be interoperable with other implementations that implement the function extension
>
> And not interoperable with all other implementations that do not implement the function extension, which presumably would be the vast majority of them. CBOR has extensions, but there is fallback behaviour if an extension tag is not recognized. It's hard to imagine fallback behaviour if a function name is not recognized.
>
> Daniel

I don't follow the logic. Suppose there are N (>=2) implementations of a given version of a JSONPath RFC, that an extra function extension (not in the RFC) is defined in the subregistry, and all N implementations happen to implement that function. (If you find this hard to believe, set N=2 and suppose the same person wrote both implementations.) Why would the RFC plus that function extension then not be interoperable?

danielaparker commented 1 year ago

> I don't follow the logic. Suppose there are N (>=2) implementations of a given version of a JSONPath RFC, that an extra function extension (not in the RFC) is defined in the subregistry, and all N implementations happen to implement that function. (If you find this hard to believe, set N=2 and suppose the same person wrote both implementations.) Why would the RFC plus that function extension then not be interoperable?

I don't think that's how "interoperability" of implementations is usually understood. I think you're talking about consistency of named extensions across implementations, which is a different concern.

Daniel

cabo commented 1 year ago

I think we need to have a slightly sharper terminology to discuss this.

First of all, we have to accept that standards evolve. Extension points are an approach to make this evolution less painful, but they cannot take the pain away completely.

Note that we don’t have implementations that are intended to interoperate among each other, but we have implementations that operate in an interoperable way on some data.

One objective of the extension point is to achieve backward compatibility, i.e., making sure that evolved systems can interoperably operate on data that was designed at an earlier point of the evolution. I think we have achieved that.

The other objective of an extension point is forward compatibility: older systems providing some form of successful, interoperable operation when confronted with new data. Forward compatibility will always be limited: We wouldn’t evolve if we could achieve the new objectives with the older systems as well.

With evolution, there is no way that implementations that are on different levels of evolution can be fully interoperable with all instances.

So what is the limitation here?

What we still could have:

(1) Older systems can successfully detect the presence of an extension they do not support. So we fully prevent false interoperability.
(2) Extensions have names (the function name), so we can have accurate diagnostics.
(3) In several architectures, extensions will have a way to be added dynamically to existing systems, based on the function name.

These statements require collision-free naming of extensions. A registry provides the service of collision-free naming. Other such services can be defined, usually based on hierarchical naming (ASN.1 Object IDs, various forms of URIs, etc.). Function names are not designed to interface with such naming systems. RFC 6648 tells us why it is a bad idea to embed a temporary component into what is likely to become a permanent name as deployment of the temporary component spreads.
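As a rough illustration of points (1) to (3), here is a minimal Python sketch of a per-implementation function table. The names `FunctionTable` and `UnsupportedFunctionError` are hypothetical and not taken from the draft or from any existing implementation, and `len` is only a simplified stand-in for a real `length` function extension.

```python
class UnsupportedFunctionError(Exception):
    """Raised when a query names a function extension this system lacks."""

    def __init__(self, name):
        super().__init__(f"unsupported function extension: {name!r}")
        self.name = name  # kept so diagnostics can name the extension


class FunctionTable:
    """Hypothetical table mapping function names to implementations."""

    def __init__(self):
        self._functions = {}

    def add(self, name, func):
        # Collision-free naming: one name maps to exactly one definition.
        if name in self._functions:
            raise ValueError(f"function name already taken: {name!r}")
        self._functions[name] = func

    def resolve(self, name):
        # An unknown name is detected and reported, never silently ignored.
        try:
            return self._functions[name]
        except KeyError:
            raise UnsupportedFunctionError(name) from None


table = FunctionTable()
table.add("length", len)  # simplified stand-in, added dynamically by name
```

Resolving a name that was never added raises an error that carries the function name, which is the "accurate diagnostics" part. None of this works unless every system agrees on what a given name means, which is the service the registry provides.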

So what can we improve?

Anyone performing a limited deployment of an extension in order to validate it and complete its detailed definition undergoes a gamble with the name.
Registering the name early reduces the stakes, but also discloses information (usually, function names should be descriptive). Registering the name late means there is greater potential for collisions, a risk similar to others that developers already need to manage.

I believe the best contribution we can make is having a good registry policy in place. Section 3.2 goes a long way, but we probably should have another look. Obviously, the choice of the designated experts who implement these policies will also influence the success; I believe it is a strength of the IETF registry model that it recognizes that not all decisions can be predetermined in a static ruleset.

Regards, Carsten

glyn commented 1 year ago

Thanks for those helpful clarifications @danielaparker and @cabo. So, in this PR, I am focussing on consistency of named extensions across implementations. The objective I would like to achieve is that any two implementations should provide consistent behaviour for a function extension with a given name in the sense that:

1. If an implementation does not (yet) support the function extension, any query using the function extension will be rejected with an error.
2. If an implementation supports the function extension, any query using the function extension will behave according to the registered definition of the function extension.

Implementations should also be free to provide a (non-default) mode of operation where unregistered function extensions (perhaps on the way to being registered) are also supported. Users choosing to use such a non-default mode are then running a conscious risk of inconsistent behaviour between implementations.

What should the spec say about such matters? It seems to me that the current spec can be interpreted such that unregistered function extensions can be offered freely to users who could then encounter inconsistent behaviour between implementations. I think the current text of this PR is a step in the direction of making the behaviour more consistent, at least for implementations running in "spec compliant mode".
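A minimal sketch of the behaviour described in this comment, in Python. Everything here is an assumption made for illustration: the `Engine` class, the `allow_unregistered` flag and the `REGISTERED` snapshot are hypothetical names, and the set of registered names is only an example, not a statement of what the registry contains.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


class InvalidQueryError(Exception):
    """Raised when a query uses a function extension that is not supported."""


# Assumption: an illustrative snapshot of registered function names.
REGISTERED = frozenset({"length", "count", "match", "search", "value"})


@dataclass
class Engine:
    # Default is the strict, "spec compliant" mode.
    allow_unregistered: bool = False
    functions: Dict[str, Callable] = field(default_factory=dict)

    def add_function(self, name: str, func: Callable) -> None:
        # Unregistered names are accepted only in the non-default mode.
        if name not in REGISTERED and not self.allow_unregistered:
            raise ValueError(f"{name!r} is not a registered function name")
        self.functions[name] = func

    def lookup(self, name: str) -> Callable:
        # Point 1: an unsupported function makes the query an error.
        func = self.functions.get(name)
        if func is None:
            raise InvalidQueryError(f"function extension {name!r} is not supported")
        return func
```

With `Engine()`, adding `my_func` is refused and a query using it is rejected. With `Engine(allow_unregistered=True)` it is accepted, and the user has knowingly opted into behaviour that other implementations may not share.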

timbray commented 1 year ago

I generally agree with Glyn's points 1. and 2.

gregsdennis commented 1 year ago

> Implementations should also be free to provide a (non-default) mode of operation where unregistered function extensions (perhaps on the way to being registered) are also supported. Users choosing to use such a non-default mode are then running a conscious risk of inconsistent behaviour between implementations.

This is my primary concern. Thank you for stating it.

However I feel that this change says that such unregistered functions result in a not-well-typed expression. That's simply not the case. My user can create a custom function for my implementation, and the expression will remain well-typed because they have to choose a type for their custom function when they create it.

I suggest an alternate change that states (separately from any well-typedness language):

> Implementations MAY support custom functions, but such support MUST be behind a configuration option which is defaulted off. Users who make use of such custom functions acknowledge that the resulting behavior is expected to be unsupported by other implementations.

glyn commented 1 year ago

> > Implementations should also be free to provide a (non-default) mode of operation where unregistered function extensions (perhaps on the way to being registered) are also supported. Users choosing to use such a non-default mode are then running a conscious risk of inconsistent behaviour between implementations.
>
> This is my primary concern. Thank you for stating it.
>
> However I feel that this change says that such unregistered functions result in a not-well-typed expression. That's simply not the case. My user can create a custom function for my implementation, and the expression will remain well-typed because they have to choose a type for their custom function when they create it.
>
> I suggest an alternate change that states (separately from any well-typedness language):
>
> > Implementations MAY support custom functions, but such support MUST be behind a configuration option which is defaulted off. Users who make use of such custom functions acknowledge that the resulting behavior is expected to be unsupported by other implementations.

Some language along those lines may turn out to be necessary, but talking about configuration options feels a bit too prescriptive. Implementers have total freedom once they diverge from the spec, so we can simply leave non-default behaviour unspecified. Therefore, I'd like to focus on the default behaviour in this PR.

I'd like the use of custom functions to fail with an error due to an invalid query. The rationale for this PR is that since a custom function is by default undefined, its result type is unknown and any function expression involving the custom function will then not be well typed.

gregsdennis commented 1 year ago

> Implementers have total freedom once they diverge from the spec, so we can simply leave non-default behaviour unspecified.

This is my point. I'm not saying that we specify non-spec behavior. I'm saying that implementations have to hide such behavior behind an option, default off.

cabo commented 1 year ago

Our definition of well-typed requires a source of knowledge as to the return and parameter types of a function extension.

I think it is within our remit to specify that we only accept the registry as a source of that knowledge. (I think we could make this more explicit in this PR.)
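To make the "source of knowledge" point concrete, here is a small Python sketch in which a table of signatures is the only place the checker looks. The type names and the three signatures are my reading of the draft's predefined function extensions and should be treated as assumptions, and `result_type` is a hypothetical helper. The sketch also requires exact type matches and ignores the type conversions the draft allows.

```python
from enum import Enum


class T(Enum):
    VALUE = "ValueType"
    LOGICAL = "LogicalType"
    NODES = "NodesType"


# Assumed signatures: name -> (parameter types, result type).
SIGNATURES = {
    "length": ((T.VALUE,), T.VALUE),
    "count": ((T.NODES,), T.VALUE),
    "match": ((T.VALUE, T.VALUE), T.LOGICAL),
}


def result_type(name, arg_types, signatures=SIGNATURES):
    """Return the declared result type, or None if the call is not well-typed."""
    sig = signatures.get(name)
    if sig is None:
        # No source of type knowledge for this name: not well-typed.
        return None
    params, result = sig
    if tuple(arg_types) != params:
        # Wrong number or kind of arguments (conversions are not modelled).
        return None
    return result
```

Under this model `result_type("my_func", [T.VALUE])` is `None` simply because the table has no entry for `my_func`. An implementation that lets users add entries to the table would make the same call well-typed, which is the disagreement in this thread.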

I don't think we can require implementations to instantaneously update with registry updates, and much less to implement each newly registered function, so, more generally, we will already have to deal with interoperability failures due to function extensions that are not implemented (the predefined function extensions are privileged here).

How much can we put implementations on a leash with respect to accepting their own extensions (inside or outside of defined extension points)? Generally, we can't stop innovation. But we have (limited) authority to stop people from calling a query a (standard) JSONPath query when it is not interoperable due to lack of JSONPath compliance. And we can strongly recommend implementing and defaulting to a strict mode (as a quality of implementation point), which prevents interoperability failures caused by a query author believing their query is interoperable when it is not.

glyn commented 1 year ago

Closing this PR. There are too many problems with it and I'm not even sure that the original goal -- to define what happens when an implementation encounters an unregistered function extension in a query -- is achievable without getting into implementation detail. Clearly implementations must fail if they encounter function extensions they don't support.

ietf-wg-jsonpath / draft-ietf-jsonpath-base

Require function names to be registered #467