form semantics - Githubissues

handrews commented 7 years ago

I'm opening this to continue the discussions around the "form" proposal from #280 and #290, which have been closed in order to split everything else from those discussions out into separate PRs (#292 and #293). This issue is for tracking @jdesrosiers's concerns that are not addressed by those PRs. Also paging @dlax, @Relequestual, and @awwright.

Everything below the following line is from @jdesrosiers in https://github.com/json-schema-org/json-schema-spec/pull/290#issuecomment-290609581 (the quote to which he is responding is from @awwright)

Can you elaborate some on how clients, as you would like to see, might act differently for different values of "method" or "form"?

Honestly, I'm a little perplexed by this question. First of all, method doesn't exist in this context. "method": "post" was replaced with "form": true and "method": "get" was removed in favor of hrefSchema. So there are only two possible values. Understanding the difference is no harder than understanding HTML. How does a client act differently when it encounters an <a> as opposed to encountering a <form>?

"form": false (default) The LDO is analogous to an <a>. Really, it's closer to the Link header, but that's not really the point. A client can do anything with that link that it has enough information to do. A web browser doesn't have enough information to do anything but retrieve it. An LDO with "form": false has more information than an <a> allowing a client to do more things with it based on target hints, but it is basically the same thing.
"form": true The LDO is analogous to a <form>. A <form> is a construct for sending data to a resource. A client can proceed taking that assumption into account. For example a client knows it is going to need user input and should prompt the user for that data. This is different from a Link where you would have to first select which method you want to use, then the client determines if it needs to prompt the user for input.

It's pretty obvious how a web browser behaves differently when it encounters an <a> versus when it encounters a <form>. It should be no less obvious for a hyper-schema client encountering "form": false versus "form": true.
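The distinction described above can be sketched in Python. This is only an illustration under stated assumptions: the `plan_interaction` helper, the returned dictionary shape, and the example LDOs are hypothetical and not part of any draft; only the "form" and "submissionSchema" keywords come from the proposal under discussion.

```python
def plan_interaction(ldo):
    """Return a rough plan for how a client might treat an LDO.

    Assumes the proposed semantics, where "form" defaults to False.
    The plan format is hypothetical and exists only for illustration.
    """
    if ldo.get("form", False):
        # Analogous to <form>: the client knows up front that it has to
        # collect input (described by submissionSchema) and send it.
        return {
            "action": "submit",
            "prompt_user_first": True,
            "input_schema": ldo.get("submissionSchema", {}),
        }
    # Analogous to <a> (or a Link header): nothing says input is needed
    # up front; the client picks an operation first and decides later.
    return {"action": "follow", "prompt_user_first": False, "input_schema": None}


# Hypothetical LDOs used only to exercise the sketch.
link_ldo = {"rel": "author", "href": "/users/42"}
form_ldo = {
    "rel": "comments",
    "href": "/posts/1/comments",
    "form": True,
    "submissionSchema": {"type": "object", "required": ["body"]},
}

link_plan = plan_interaction(link_ldo)
form_plan = plan_interaction(form_ldo)
```

The point of the sketch is that the branch is taken before any method is chosen, which is the behavior the comment attributes to "form": true.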

handrews commented 7 years ago

I've been thinking more about this and how to explain my view that "form" is unnecessary, and that the presence or absence of "submissionSchema" conveys all the necessary information.

HTML is designed to allow browsers to present unambiguous interfaces to users based on hypermedia.

JSON Hyper-Schema (JHS) is designed to inform an automated agent about the possible uses of a hypermedia representation.

The key concepts for my philosophical view here are "unambiguous" and "possibilities". These words capture the very different design imperatives behind HTML and JHS.

As a human using an HTML document via a web browser, there are only two things that you can do:

Navigate from the current document to another statically identified document
Submit data for processing

Behind the scenes, HTML has two ways to handle submitted data (dynamically compute the URI to which to navigate, or submit the data to a statically identified resource for processing, which may involve navigating to a new resulting document or just resetting the current view). The only way that the distinction is visible to the user is that the browser will not allow re-submitting data to a statically identified resource for processing without an explicit confirmation from the user.

Within the scope of HTML, you cannot do anything else: there are exactly two user interface possibilities, where one has two underlying mechanisms which are largely indistinguishable to the end user. While HTML's script and link tags are also hypermedia, they are invisible to the end user and therefore not relevant to the comparisons in @jdesrosiers's comment above.

It doesn't make sense for HTML to have a way to indicate multiple possible uses of a target resource, because there is no universally sensible UI for such a thing, and the entire purpose of HTML is to construct a UI out of hypermedia. So in the relatively rare case where you want the same linked resource to be used in multiple ways, you connect to it multiple times, in each way that will produce the correct UI for that use.

You also do not have generic link relation types for anchors and forms. They mean what they mean, and any additional meaning is provided by natural language text that is significant to the end user but the contents of which are not examined or used by the browser (just displayed).

Web browsers are both generic hypermedia agents and end-user applications, and HTML's design reflects that combined use case.

On the other hand, there are no inherent limits to how an automated agent may interact with a machine-comprehensible resource, such as one represented by a JSON Hyper-Schema + a JSON instance. There is only one link serialization syntax, which is the LDO. The set of possible semantics is infinite, and conveyed by the link relation type. No generic agent can possibly know all relation types in advance.

This makes for a two-layer approach to consuming Hyper-Schema: A generic agent (which is concerned with recognizing what a link could do, but does not have the knowledge of infinitely many relation types to handle each specifically), and an application (which knows the link relation types relevant to its purpose, and can decide based on that what operations are actually available, and knows whether and how to present those available operations to the end user, if any end user even exists).
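One way to picture the two layers is the following Python sketch. Everything in it is an assumption made for illustration: the function names, the labels for the possible uses, the rule that every link can at least be followed, and the `KNOWN_RELS` table with its "create-comment" relation are all hypothetical. Only "rel", "hrefSchema", and "submissionSchema" are keywords from the discussion.

```python
def possible_uses(ldo):
    """Generic agent layer: list what the LDO could be used for.

    It knows nothing about specific relation types; it only looks at
    which keywords are present. The labels are hypothetical.
    """
    uses = ["follow"]  # assumption: any link can at least be followed
    if "hrefSchema" in ldo:
        uses.append("follow-with-uri-input")
    if "submissionSchema" in ldo:
        uses.append("submit-data")
    return uses


# Application layer: a hypothetical table of the relation types this
# particular application understands, and which uses it accepts for each.
KNOWN_RELS = {
    "self": {"follow"},
    "create-comment": {"submit-data"},
}


def available_operations(ldo):
    """Application layer: keep only the uses its known relations allow."""
    allowed = KNOWN_RELS.get(ldo.get("rel"), set())
    return [use for use in possible_uses(ldo) if use in allowed]


ldo = {
    "rel": "create-comment",
    "href": "/posts/1/comments",
    "submissionSchema": {"type": "object"},
}
generic_view = possible_uses(ldo)
app_view = available_operations(ldo)
unknown_view = available_operations({"rel": "x-unknown", "href": "/x"})
```

In this sketch the generic layer reports two possibilities for the LDO, while the application narrows them to the single operation its relation type permits, and offers nothing for a relation it does not recognize.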

The JSON Hyper-Schema specification is concerned with the generic agent part. It needs to be clear on everything the generic agent needs to know, including indicating validation rules for various kinds of input data.

However, JSON Hyper-Schema is not concerned with the application part. Designing a set of hypermedia resources in a way that an application can consume them is about designing link relation types, and possibly a further constrained media type (either imposing additional semantics on the instance through a custom application/foo+json media type, or constraining Hyper-Schema to a subset of its full features by declaring a more restrictive meta-schema).

Fundamentally, in my view the form vs anchor distinction is not relevant to the core JSON Hyper-Schema specification. Anyone could easily define a subset of JSON Hyper-Schema that only supports HTML browser presentation behavior. Or they could define link relations equivalent to HTML's anchor and form elements. But this is not a restriction that should be imposed on Hyper-Schema in general.

There are endless possible application use cases that do not map onto the anchor/form paradigm, and JSON Hyper-Schema is responsible for enabling the whole range of possibilities.

Anthropic commented 7 years ago

@handrews I don't like the name being form; to me that implies UI. post: true/false seems more appropriate to me. Ignoring your submissionSchema suggestion, if the original suggestion were considered, I'd rather see a different name proposed than form.

dlax commented 7 years ago

@Anthropic I also agree that the name form is not appropriate. Also considering @jdesrosiers's explanation https://github.com/json-schema-org/json-schema-spec/pull/290#issuecomment-290883189:

Try to view this from a different perspective. You have an LDO with a submissionSchema. The question isn't how do you know you can use it to send data to a resource (POST), the question is how do you know that you can't retrieve it (GET). @awwright wrote up a list of 4 or 5 ways an LDO can be used. That makes it sound versatile, but it is also ambiguous. Say I have an LDO that only supports sending data to an executable resource. I don't want to signal to a user-agent that there are 4 or 5 things you can do with this LDO when there is really only one. Using form allows me to signal that this LDO has one use only: sending data to an executable resource.

In my understanding this kind of restriction would be handled by an Allow: POST header in HTTP. Do we need a similar indication about this in LDO? Can't this be carried by relation name rel semantics?

handrews commented 7 years ago

Can't this be carried by relation name rel semantics?

In my opinion, yes, it can and should, but in cases where it's insufficient...

In my understanding this kind of restriction would be handled by an Allow: POST header in HTTP. Do we need a similar indication about this in LDO?

That's issue #73, which is my preferred solution because it's flexible unlike importing some irrelevant convention from HTML.

handrews commented 7 years ago

@handrews I don't like the name being form,

@Anthropic: It's not my proposal.

handrews commented 7 years ago

@jdesrosiers (from #290):

The web is the reference implementation for REST.

The web as a whole? Yes. HTML? No. HTML is a hypermedia format designed for interactive human use, with hypermedia control semantics that are only useful in a human context. HTML is also limited to a small subset of HTTP that is completely inadequate for machine-oriented hypermedia, so attempting to apply its semantics in a more fully-functional system doesn't make any sense.

it would be silly to ignore its lessons or assume they don't apply to our situation.

I'm not assuming anything. The use cases of HTML and JSON Hyper-Schema are objectively extremely different.

If you want to draw analogies to the web and call that the basis of how REST should work, fine. But you have to accept the whole web, not just HTML. And that means the JavaScript that you dismiss as "hacks".

jdesrosiers commented 7 years ago

[We] could define link relations equivalent to HTML's anchor and form elements.

@handrews, I disagree that there is a significant difference between designing for user interaction vs machine interaction. But, I'm going to focus my response on the above statement because I believe this has the potential to move us forward. Defining link relations to give more semantic meaning to links seems like a reasonable way forward. We could define a "rel": "form" instead of "form": true and it would accomplish the same goal. HTML went the route of defining different structures for different things, but technically <script src="foo.js"> could have been defined as <link rel="script" href="foo.js">.
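To make the equivalence concrete, here is a minimal Python sketch. The `signals_form` helper and the example LDOs are hypothetical, and "form" is not a registered link relation; the sketch only assumes that a client would treat the two spellings as carrying the same signal.

```python
def signals_form(ldo):
    """True if the LDO marks itself as form-like under either encoding.

    Accepts the proposed boolean keyword ("form": true) or a
    hypothetical "form" link relation ("rel": "form").
    """
    rels = ldo.get("rel", [])
    if isinstance(rels, str):
        rels = [rels]  # normalize a single relation to a list
    return bool(ldo.get("form", False)) or "form" in rels


# Hypothetical LDOs: the same link expressed both ways, plus a plain link.
keyword_style = {"rel": "comments", "href": "/comments", "form": True}
relation_style = {"rel": "form", "href": "/comments"}
plain_link = {"rel": "comments", "href": "/comments"}

results = [signals_form(keyword_style), signals_form(relation_style), signals_form(plain_link)]
```

Under this reading, moving the signal from a keyword into the relation type changes where a client looks, not what it learns.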

I don't like the name being form

@Anthropic I never thought it was a good name either. It was clear that it was necessary to have something other than method so I proposed the best name I could think of off the top of my head. I disagree that the word form implies UI, but I certainly see how people can see it that way. It's confusing at best. I am certainly open to suggestions for a better name, but you'll have to forgive me for continuing to use it until we have a better alternative.

In my understanding this kind of restriction would be handled by an Allow: POST header in HTTP.

@dlax That is correct, but it's not a good solution for several reasons. First of all, an additional uncacheable OPTIONS request is necessary to determine the server's capabilities. This is at best impractical in the real world. We can address that problem by adding to Hyper-Schema an allow keyword that allows us to skip that extra call, but that is less than ideal as well. If we have to fall back on knowing what methods the server supports, it indicates that something is missing in our hypermedia system. That leads us to ...

Can't this be carried by relation name rel semantics?

Yes. But, again, there's a problem. Very few registered relations are actually usable in a generic way, so people end up having to build up their own application specific relations. This means that people have to write their own client that knows how to interpret those application specific relations. Imagine that every time you wanted to use a new web page, your browser would have to be extended to understand that application's custom relations.

That is essentially the bigger picture that I am trying to get to starting with form. One of the reasons the web has been so successful is because it has a simple and generic set of hypermedia constructs (such as <a>, <form>, etc) that any application can use to unambiguously describe itself. That way a generic web browser can run any application. Which brings us to ...

this is not a restriction that should be imposed on Hyper-Schema in general.

If these generic hypermedia constructs can be achieved solely by using a small set of standard relations, then I half agree with you ( @handrews ). That would mean that it is possible to use JSON Hyper-Schema as defined as a building block for a complete HATEOAS system. The part of me that runs around the internet screaming "DECOUPLE ALL THE THINGS!!!" thinks this is a great idea. However, I don't think JSON Hyper-Schema can truly achieve HATEOAS without these standard hypermedia constructs. Those constructs can be in a different document, but it is absolutely essential for them to exist.

handrews commented 7 years ago

We could define a "rel": "form" instead of "form": true and it would accomplish the same goal. HTML went the route of defining different structures for different things, but technically <script src="foo.js"> could have been defined as <link rel="script" href="foo.js">.

json-schema-org / json-schema-spec

form semantics #294