json-schema-org / json-schema-spec

The JSON Schema specification
http://json-schema.org/

Does anyone actually use JSON Hyper-schema #48

Closed awwright closed 7 years ago

awwright commented 7 years ago

There are a lot of broken features defined in JSON Hyper-Schema, and I want to ask implementors how much I can be allowed to "break" (i.e. make compliant with normative references).

Mostly things like quirks about how it defines URI templates, uses "rel", and uses "method".

Anyone?

handrews commented 7 years ago

From my point of view, I'd be happy to reboot the whole hyper-schema approach while preserving some specific elements. In my past work, we picked bits and pieces from JSON Hyper-Schema to assemble into an alternate approach that got broad buy-in where Hyper-Schema as defined did not.

That has a bit more to do with what Hyper-Schema should look like in v6, but it does indicate (for me, at least) that the v5 cleanup can be pretty aggressive since we should expect significant changes for v6 anyway.

Anthropic commented 7 years ago

@awwright I use it for defining a link with rel of "options" to then pull in the options for a select.

Totally agree that it doesn't seem to be the best spec of the bunch and more than happy to re-work if there are changes. But just wanted to submit a use case for consideration.

jdesrosiers commented 7 years ago

I use Hyper-Schema for all of my JSON based REST APIs (although I haven't had the opportunity in a while). In fact, I would have very little interest in JSON Schema if it weren't for JSON Hyper-Schema. As far as I have been able to find, it is the only standard capable of doing hypermedia APIs in JSON properly (1).

In REST, a resource should be self-descriptive, giving you everything you need to know about what it is and what you can do with it. On the web this takes the form of links and forms. Many JSON standards out there can describe links, but only JSON Hyper-Schema can do something like a form. If you are familiar with the Richardson Maturity Model (2), all the other options are designed for Level 2 REST APIs. JSON Hyper-Schema is the only Level 3 option around.

I created a little library a while back that allows you to create fully functional CRUD APIs by writing only Hyper-Schemas (3). These APIs can be used without any out-of-band knowledge using Jsonary's (4) generic Hyper-Schema browser. This library allowed me to walk through the workflow while designing APIs. Nothing else is capable of that.

This was a much more long winded response than I planned on, but my ultimate purpose is to say that I am strongly against changing the Hyper-Schema approach. There are certainly things that can be cleaned up, refactored, or extended, but I'll be keeping an eye on any Hyper-Schema proposals to ensure that the spirit of Hyper-Schema is not lost.

I'm all for the kind of changes it sounds like @awwright is talking about, but I am concerned about what I think I'm hearing from @handrews. I look forward to reading your proposals.

(1) JSON-LD has Hydra and XML has XHTML, but JSON has only JSON Hyper-Schema
(2) http://martinfowler.com/articles/richardsonMaturityModel.html
(3) https://github.com/jdesrosiers/resourceful
(4) https://github.com/jsonary-js/jsonary

slurmulon commented 7 years ago

We use JSON Hyper-Schema extensively in both our platform API and client (really just JSON Schema in the back-end for validation purposes). I have worked on another enterprise API that was based on the Siren specification, but JSON Hyper-Schema is far superior due to its insane flexibility, its ability to keep things DRY and support for meta schemas :sparkles:. I completely agree with @jdesrosiers that Hyper-Schema is the only JSON-based hypermedia API specification that is truly Restful. My favorite aspect is how it doesn't require you to modify your API responses, it's entirely non-invasive and complementary.

I'm not sure what issues you are having resolving your refs, but we have found Ajv to be very consistent and robust in resolving them. Many libraries, such as is-my-json-valid, expect the user to provide a self-managed object mapping $ref to schema.

The only "limitation" our team has encountered (really just an initial misunderstanding) has to do with entity instances that are required/used for resolving URI Templates into absolute URLs - the issue is that some APIs want to use deeply nested resources that require multiple entities, like so:

/api/v1/a/{uuid}/b/{uuid}

The questions we ran into were "Do we use the same entity instance? Another? If it's from another, how do we specify where the instance should come from?"

Our complications were mostly due to the fact that we designed our schemas to map nearly 1:1 to our API's domain model entities, such that each schema correlated to a single high-level entity (and I believe this is the general idea, but I'm sure there are other opinions on that). We still referenced other schemas with $refs and had normalized schemas, but the general idea was that the links defined in the schema only needed to know about the entity correlating to, well, that schema.

Because of this, it wasn't initially clear that support for working with multiple entity instances when resolving LDO (Link Description Object) URI templates is possible with JSON Pointer references - otherwise the URI template slugs correlate with the de facto entity instance provided (http://json-schema.org/latest/json-schema-hypermedia.html#anchor8). The question you still have to answer in your client, though, is what entity should be used when the reference is, say, #/foo but the user has multiple Foos to work with in the client - should it be the first, the last, the one "selected" by the user? I think it's fine to allow the application to handle this in whatever way it wants, it's just not explicitly talked about in many JSON Hyper-Schema resources/tutorials.

Once our client-side API resolves the URI Template into a URL based on the provided entity instance, we then just construct an HTTP/REST resource caller that automatically follows whatever method is defined in the LDO.
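A rough sketch of that flow in Python, under stated assumptions: it is not taken from any particular client or library, it expands only simple `{name}` variables (none of the RFC 6570 operators), it reads values from a single entity instance, and it falls back to GET when the LDO has no `method`. The name `resolve_ldo` and the sample link and instance are made up for illustration.

```python
import re
from urllib.parse import quote

# Matches a simple {name} template variable (no RFC 6570 operators).
_VAR = re.compile(r"\{([^{}]+)\}")


def resolve_ldo(ldo, instance):
    """Return (method, url) for one link description object and one instance."""

    def fill(match):
        name = match.group(1).strip()
        if name not in instance:
            raise KeyError(f"instance has no value for template variable {name!r}")
        # Percent-encode the value so it is safe inside a path segment.
        return quote(str(instance[name]), safe="")

    url = _VAR.sub(fill, ldo["href"])
    # Assumption: a link without "method" is followed with GET.
    return ldo.get("method", "GET"), url


# Hypothetical link and entity instance.
link = {"rel": "self", "href": "/api/v1/user/{uuid}"}
user = {"uuid": "1234-abcd", "name": "Ada"}

print(resolve_ldo(link, user))  # ('GET', '/api/v1/user/1234-abcd')
```

A caller would then hand the returned method and URL to whatever HTTP client it already uses; which instance to pass in remains the application's decision.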

Edit: couple typos

awwright commented 7 years ago

@jdesrosiers Great feedback, thanks! Just as REST has the code-on-demand constraint (which I take as including scripts and stylesheets), I sort of view JSON Schema as similar to a stylesheet: JSON Schema tells you how to render or how to work with the data you've been given.

One of my outstanding concerns with JSON Hyper-schema is it has a "method" property combined with "href" and "rel". A link is supposed to describe relationships between resources, not necessarily listing things to do. Retrieving information about an associated resource is GET, HEAD, or OPTIONS, changing a resource is PUT or PATCH, executing some remote function is POST.

Can you provide some more examples of how you'd like to approach linking?

I'm re-reviewing the Richardson Maturity Model, which is so great for describing what RESTful really means. I'm also reviewing RESTful Web APIs by Leonard Richardson & Mike Amundsen (the legends of hypermedia protocols).

handrews commented 7 years ago

@jdesrosiers while I would happily reboot hyper-schema, a more incremental approach based on adding rather than changing is entirely possible.

Honestly, I've been saying provocative things about hyper-schema mostly to motivate people to actually speak up. "Resourceful" looks really interesting, I'll have to look into it more deeply. Part of my interests here have been around seeing if anyone has developed best practices for hyper-schema that mitigate the problems we found. If so, I am absolutely not dead-set on changing it.

I agree that JSON Hyper-Schema is the only thing that comes close to RMM Level 3. It is actually what helped me sell RMM Level 3 at Riverbed, but ultimately the team found hyper-schema too confusing to accept.

At a high-level there were three problems, ranging from annoyances to deal-breakers. I was going to wait and file proposed solutions (um... after I figured them out), but since we're seeing some life here, let's just look at the problems. Perhaps someone else figured out a way around these, or can explain what we missed along the way.

This is, of course, my personal opinion and recollection several years down the road, and not the official opinion of my employer at the time (or current employer).

Implicitly defined resources

Hyper-Schema implicitly defines resources by defining links to them. Every schema that has a self link is a resource (relatively explicit), and every link URI that is not otherwise accounted for in a self link is also a resource (it can be confusing to line these all up).

This means there is no straightforward listing of resources anywhere obvious, unless every resource is accessible from the root, which is not the case in many non-trivial APIs. It also means that with link URI templates defined all over the place, it is easy to produce conflicts, which also plays into the next problem.

This is the problem that only some people considered a deal-breaker. I was not one of them, but there were very strong opinions about this. Our solution was to separate the declaration of resources and links from the schemas. Schemas could be defined inline within the resources, but usually were $ref'd in.

I would like to sort out best practices for making the resources more clear, but am not at all attached to the separate enumeration approach. I can see "fixing" this just by implementing the documentation generator in a way that pulls the resources out and enumerates them clearly in the docs.

Repetition in link definitions

There's a lot of repetition in link definitions, or else functionality left out that should be doable in one step. Simply put, if I can do all the usual CRUD stuff on books and people, and a book has a list of authors, all of the links I define on authors are redundant. They're already defined for people. But I have to re-define them with a slightly different URI template of "/people/{authorId}" instead of "/people/{id}", and re-define the schema non-authoritatively in targetSchema (which can at least be done with $ref). I either need to re-define every person link, or force the caller to first GET the author in order to discover that it can be DELETEd. However, many operations, including DELETE, should not require a GET first, so that's back to duplication.

What makes more sense when both resources are described in the schema is to be able to define a relationship between books and people with a rel of "author" and rules for how to translate the book instance data into a form that can fill out the self link for "person", e.g. map "author_id" in book to "id" in person (assuming person's self link is something like "/people/{id}").

This approach doesn't technically make anything impossible, but it was considered excessively burdensome. The only way we would have been able to use JSON Hyper-Schema would have been to write a pre-processor that duplicated all of the links out to the proper places.

Our solution to this was just to add a "relation" field that referenced the other resource (it helped to have resources enumerated separately here, but is by no means required), and provided a map from the source schema's fields to the destination schema's self link URI template. This was done by the vars approach seen in issue #52 (extended href templating). Using the same vars mechanism in multiple places made things a lot easier- it is always the way to translate URI parameters when they don't line up perfectly with the URI template syntax.

[edit- removed the bit about href duplication from self as Hyper-Schema actually does that part fine and I just forgot]

Another benefit of this solution was that it removed the need for most targetSchema definitions. Instead, you just looked at the target resource's self link. The relevant security considerations are already described for self links. targetSchema is still available for when you want to link to something outside of JSON Hyper-Schema.

I like this solution very much, but would be happy with anything that satisfied the same requirements of eliminating the duplication and more clearly showing resource-to-resource relations.
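To make the book/author case concrete, here is a minimal Python sketch of the idea. It is not the format we actually used: the `resources` layout, the `relations` key, and the shape of `vars` (target template variable mapped to a field name in the source instance, rather than a JSON Pointer) are all assumptions made for illustration.

```python
import re


def expand(template, values):
    """Fill simple {name} variables; RFC 6570 operators are not handled."""
    return re.sub(r"\{(\w+)\}", lambda m: str(values[m.group(1)]), template)


# Hypothetical resource declarations: each resource has one self link template.
resources = {
    "person": {"self": "/people/{id}"},
    "book": {
        "self": "/books/{id}",
        # Hypothetical "relations" entry: rel name -> target resource, plus a
        # map from the target template's variables to fields of this instance.
        "relations": {
            "author": {"resource": "person", "vars": {"id": "author_id"}},
        },
    },
}


def follow(resource_name, rel, instance):
    """Build the URI of a related resource from the source instance."""
    relation = resources[resource_name]["relations"][rel]
    target = resources[relation["resource"]]
    values = {var: instance[field] for var, field in relation["vars"].items()}
    return expand(target["self"], values)


book = {"id": 7, "title": "Example", "author_id": 42}
print(follow("book", "author", book))  # /people/42
```

The point of the sketch is that the person links are declared once: the book only says how its own data fills in the person's self link, so nothing about people has to be re-defined on books.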

Can't describe both URI parameters and a message body

This makes JSON Hyper-Schema unusable for me. There is only one schema in the LDO, which applies to the URI parameters for GET and the body for POST (and presumably for PUT and PATCH although they are not mentioned?). Unless I am totally misreading things (which I could be), this makes it impossible to perform operations with a message body on URIs with query parameters. This means no such operations on filtered collections (a powerful alternative to messy batch processors), or partial representations (selecting a lighter-weight format), or anything else involving non-hierarchical identifiers.

Our solution is the one described in issue #52 (extended href templating). We used URI templates for all query string parameters. URI template variables either mapped to points in the instance schema (by the default action or through vars), indicating that they used the same validation rules, or they were given a schema directly if they were parameters that must come from an external source.

The extended href templating proposal was already in existence, we just adapted it and standardized that the request schema never applies to the URI, it only applies to the request body. Again, without a solution for this, JSON Hyper-Schema cannot adequately describe my APIs.

handrews commented 7 years ago

@awwright : your comment here

One of my outstanding concerns with JSON Hyper-schema is it has a "method" property combined with "href" and "rel". A link is supposed to describe relationships between resources, not necessarily listing things to do. Retrieving information about an associated resource is GET, HEAD, or OPTIONS, changing a resource is PUT or PATCH, executing some remote function is POST.

captures my key concern in 500% less words :-)

Basically, resources are not a first-class concept in JSON Hyper-Schema. They kinda just happen, and a lot of weirdness exists around that.

awwright commented 7 years ago

@slurmulon I'm looking at "rel"... the current draft wants to add four values to the IANA registry and I'm not sure all those are necessary.

As for nested resources, unfortunately I'm not really sure how to handle the more complex URI cases.

Consider, though, if you're listing an absolute URI in a URI Template, then you're sandboxing yourself inside that namespace. It means if I want to store instances of resources on my server, I have to modify the schema so it points to my server instead of yours.

So increasingly, I think the best design pattern is that JSON documents should provide pre-computed absolute URIs or URI references, and minimize computation. The only time I would use an absolute URI in a URI Template would be if it's globally assignable anyways, like "urn:uuid:{uuid}", "ni:///sha-256;{hash_sha_256}" and the like.

The other alternative is we provide some way to let a script compute a value on a JSON instance, including provide a list of link relations. This is certainly within the scope of REST, but it seems excessive.

handrews commented 7 years ago

@awwright : We set a base URI for all of our schemas and then if the template began with $, the dollar sign was replaced with the base URI. This is similar to issue #46 (LDO baseUri) but at the scope of an entire API (a concept which JSON schema also lacks, although making a schema that is all definitions plus a schema for the entry point resource can more or less serve that purpose).

Anyway, this removed all need for absolute URIs for us except when going outside of the API entirely, in which case it was fine. We needed to be able to define URIs relative to the API base without having to repeat the API base (e.g. "/some/api/base/path" in "https://api.example.com/some/api/base/path") all over everything.
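A minimal Python sketch of that substitution, assuming a single API-wide base; the names `API_BASE` and `apply_base` are made up for illustration, and the template itself is left unexpanded.

```python
# Assumed API-wide base URI, configured once for all schemas.
API_BASE = "https://api.example.com/some/api/base/path"


def apply_base(template, base=API_BASE):
    """Replace a leading "$" with the API base; leave other templates alone."""
    if template.startswith("$"):
        return base.rstrip("/") + template[1:]
    return template


print(apply_base("$/users/{id}"))
# https://api.example.com/some/api/base/path/users/{id}
print(apply_base("https://other.example.org/things/{id}"))
# https://other.example.org/things/{id}
```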

slurmulon commented 7 years ago

@awwright Regarding nested resources and how they are currently handled, I stumbled across something like this (yes, with encoding):

/api/v1/user/{ root.json%23%2Fdefinitions%2Fuser }/quotes/{ root.json%23%2Fdefinitions%2Fquote }

I'm unable to find the resource for this, I ran into it months ago and it's very possible that it isn't officially supported (not seeing anything mentioned here: https://github.com/json-schema/json-schema/wiki/href or in the spec). Anyways, it's the closest thing I could find that would allow us to support multiple entity instances (our API team was annoyed with this fact, but I don't think it's a big deal at all). We had to write our own LDO href resolver and we just didn't add support for referring to other entities, particularly because of the complexities around "which one is the right entity?" The schemas define what an entity can be, not which one it is - and in my opinion this is a very good thing, it's just something for the client to consider in its state management/design.
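For readers puzzling over the encoded slugs, this small Python sketch only decodes them; it assumes nothing about whether the syntax is officially supported, and the helper name `slug_targets` is made up. Each slug turns out to be a percent-encoded schema reference with a JSON Pointer fragment.

```python
from urllib.parse import unquote


def slug_targets(template):
    """Return the percent-decoded contents of each {...} slug in a template."""
    targets = []
    for part in template.split("{")[1:]:
        raw = part.split("}", 1)[0].strip()
        targets.append(unquote(raw))
    return targets


template = (
    "/api/v1/user/{ root.json%23%2Fdefinitions%2Fuser }"
    "/quotes/{ root.json%23%2Fdefinitions%2Fquote }"
)
print(slug_targets(template))
# ['root.json#/definitions/user', 'root.json#/definitions/quote']
```

Decoding tells you which schema each slug refers to; it still does not tell you which instance of that schema to take the value from.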

I agree that you should minimize computation with your links, it will generally make everybody's life easier. I think though that it will still be necessary, even for common cases such as /api/v1/user/{uuid} - this prevents the need to modify the API response in any way (:heart:) - other hypermedia specs like Siren do not have this luxury. This strict separation also ensures that the API never needs to concern itself with any sort of state (specifically to answer "which entities should be used to make this link?"), satisfying HTTP and REST.

Even with globally unique identifiers, you still need to ask the question of "which instance do I use for the other entity/entities?" in the case of LDOs with more than one entity reference. With one entity it's trivial, because you can always just associate the schema/schema validator with whatever entity you want. But when there is a nested entity, it must either also be explicitly provided by the dev (increasing domain coupling, nipping away at a clean design) or implicitly provided by some state management system, and that (to me) is the hairy part. But again, I think this is primarily a concern of layers above JSON Hyper-Schema.

Edit: clarity, grammar

slurmulon commented 7 years ago

@handrews

Basically, resources are not a first-class concept in JSON Hyper-Schema. They kinda just happen, and a lot of weirdness exists around that.

I've noticed this as well, what do you think can be done to help alleviate/fix this? What if you say that an LDO method correlates with the resource's OPTIONS when the value is an Array (please excuse the rel here, clearly not sufficient):

[
  {
    "rel": "self",
    "method": "GET",
    "href": "/v1/api/user"
  },
  {
    "rel": "updates",
    "method": ["POST", "DELETE"],
    "href": "/v1/api/user/{uuid}"
  }
]

Otherwise, if it's a single value, method should be interpreted as the method that should be used (in other words there is only one "usable" method on the resource, so there's no point to provide a collection).

It seems this could help to eliminate redundancy in the LDOs as well since there's no need to duplicate links for additional actions that can be made on a resource. Perhaps the example I provided could be collapsed even further with URI Templates, such that GET, POST and DELETE can exist in the same LDO.

awwright commented 7 years ago

@slurmulon We could have a field similar to HTTP "Allow" that specifies which methods the resource is capable of handling. So if "PUT" is missing, then it's a read-only resource; if POST exists, then it's an executable resource; etc.
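A tiny Python sketch of how a client might read such a field, following the two rules above literally. The field is only a proposal at this point, so the input is just a list of method names and the function name `capabilities` is made up.

```python
def capabilities(allow):
    """Summarize what a list of allowed HTTP methods implies for a resource."""
    methods = {m.upper() for m in allow}
    return {
        # No PUT means the client cannot replace the resource.
        "read_only": "PUT" not in methods,
        # POST means the resource can be "executed".
        "executable": "POST" in methods,
    }


print(capabilities(["GET", "HEAD"]))
# {'read_only': True, 'executable': False}
print(capabilities(["GET", "PUT", "POST"]))
# {'read_only': False, 'executable': True}
```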

@handrews See how many of those concerns have issues filed against them? I should create a milestone for this maybe.

slurmulon commented 7 years ago

@awwright yeah love it, that way method doesn't have two different meanings depending on its value

awwright commented 7 years ago

Unlike jsonschema-core and jsonschema-validation, I'm at something of a loss for how to save jsonschema-hyperschema.

My best idea right now is that I'll rewrite major sections of the spec, and won't be afraid to break backward compatibility with existing implementations. These are just Internet-Drafts, after all, and I think this is warranted because HTML, Atom, HTTP Link, HAL, etc. are far more consistent, and have probably seen much bigger adoption because of that consistency.

These are the following features I'm looking to gently change or preserve, everything else I'll tear out unless someone says they'd like it:

During my survey, I found or came up with these features that I think would be useful:

I'll create issues for these enhancements.

Please let me know all your opinions!

slurmulon commented 7 years ago

@awwright Wow, I did not know that Hyper-Schema was in such a dire place - I have experience with HAL, JSON-LD, Siren, and more, and JSON Hyper-Schema is by far the most extensible and non-invasive hypermedia spec I have ever encountered. Its ability to represent and work with highly-complex data structures is simply unparalleled, and the fact that it establishes a unified validation language between layers of the stack is absolutely incredible (and generally long desired!) as it completely prevents any sort of code duplication or parallel business logic across layers. Its integration with things like API Blueprint allows you to rapidly accelerate productivity with tools like Drakov (mock server) and Dredd (test your API Blueprint, and the JSON Schemas defined in it, against a real API). These are all amazingly helpful features that other specifications simply don't have, don't allow, or struggle with greatly.

I'm happy to do whatever I can to help keep the spirit alive (documentation, technical/design discussions, whatever). I believe a lot of this stems from a lack of documentation / examples that give people their "Aha!" moment. It took our team a minute to realize the magic of Hyper-Schema, but once our team did it was a no-brainer as to how much it could help, and it did.


As for the feature list you mentioned, I love the idea of a base URL/URI; this would help relieve a lot of the complications regarding $ref resolution and JSON Pointer.

As for readOnly, wouldn't you just implicitly know that the resource is read only when the only available accept / methods is ['GET']? Perhaps that's too indirect, but could save the need for an extra property.

Regarding title - is this intended to be used as a presentation label for the link? IME things kind of break down once you get too far down HATEOAS (i.e. Browser in a Browser) because the server is now concerned with UI presentation, and it also complicates i18n/l10n/partnerification if say the text translations are handled on the client side, which is relatively common. Perhaps in this case gettext tokens can just be sent in title and the client would be expected to replace them, not bad I guess (thinking it's probably best to make this optional..?)

I really love the idea of "forProperties, forItems, forAdditionalProperties, forAdditionalItems (descend into an array or object, but links still describe the current JSON instance)" - this would allow semantic iteration through specific areas of an instance object, which is great!

This wasn't mentioned, but I like the idea of adding $ref to LDOs (in addition to targetSchema, because to my understanding this only applies to the API response instead of the request) since it would allow the client to pre-validate an object before sending it off to the API, allowing the client to save an extra network call.

slurmulon commented 7 years ago

@awwright @handrews I would love to join the organization if you're accepting members. I use JSON Hyper-Schema in many of my projects (including enterprise ones), so I am very passionate about the role that it can play in the Hypermedia world and have a lot of incentive to keep the ball rolling.

Please feel free to email me at me@madhax.io if you'd like to discuss this further!

handrews commented 7 years ago

@slurmulon I don't have any official position here myself, I'm just talkative and opinionated :-)

awwright commented 7 years ago

@slurmulon Luckily I don't see the spirit of hyperschema changing at all, I think we're just realizing there's a small handful of things that don't mesh well with current hypermedia design best practices.

title is a standard link parameter, RFC5988 says:

The "title" parameter, when present, is used to label the destination of a link such that it can be used as a human-readable identifier (e.g., a menu entry) in the language indicated by the Content- Language header (if present). The "title" parameter MUST NOT appear more than once in a given link-value; occurrences after the first MUST be ignored by parsers.

readOnly is mostly for individual (sub-)instances, so you can say particular properties in an object are managed by the authority (the server). Things like serial numbers and computed values, where it only makes sense for the server to assign or update and not anyone else.

Regarding $ref in LDO, can you file a new issue that includes some examples and use cases?

handrews commented 7 years ago

@slurmulon and @awwright : I'm extremely confused by both of these things:

forProperties, forItems, forAdditionalProperties, forAdditionalItems (descend into an array or object, but links still describe the current JSON instance)

/api/v1/user/{ root.json%23%2Fdefinitions%2Fuser }/quotes/{ root.json%23%2Fdefinitions%2Fquote }

Are these attempts to use parts of the instance data other than the immediate properties for resolving the URI template variables? Or for using instance data from within elements of a list or properties of an object within an instance?

If so, those problems are solved very cleanly by the advanced templating proposed in issue #52, which we used extensively throughout a set of APIs involving about 200 or so resources. I'm honestly baffled at the lack of reaction to that proposal, as it's the single most essential proposal outstanding for hyperschema for me (among other things, it is necessary to solve the failure to handle both URI parameters and message bodies, and (with relative JSON pointers) it handles URI parameters far more flexibly and elegantly than anything else I've seen proposed - no awful %37elf mess).

slurmulon commented 7 years ago

@awwright yeah "spirit" probably wasn't the best word choice - I guess I meant "momentum" since the title of this issue is "Does anyone actually use JSON Hyper-Schema?" hehe

title is a standard link parameter, RFC5988 says:

Cool, that makes sense. I was making a wrongful assumption of how the value would/could be used.

readOnly sounds great as well - is this expected to be a simple boolean or say a collection of sub-paths (i.e. JSON Pointer) to mark as read-only?

More than happy to file an issue that includes some examples and use cases. Stepping out for a few hours right now but will get to it later tonight.

handrews commented 7 years ago

@slurmulon You put readOnly throughout your schema as a boolean on properties/items/whatever that need to be readOnly. We used this a lot at Riverbed, very useful (I can point you to public example schemas if you'd like to see how we used it).
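As a small illustration (an invented schema, not one of the Riverbed ones), here is a Python sketch of what a client could do with those booleans: drop the properties marked readOnly before sending an update. It only looks at top-level `properties`; the helper name `writable_payload` is made up.

```python
# Invented example schema: "uuid" is assigned by the server.
schema = {
    "type": "object",
    "properties": {
        "uuid": {"type": "string", "readOnly": True},
        "name": {"type": "string"},
    },
}


def writable_payload(instance, schema):
    """Drop top-level properties the schema marks readOnly before sending."""
    props = schema.get("properties", {})
    return {
        key: value
        for key, value in instance.items()
        if not props.get(key, {}).get("readOnly", False)
    }


print(writable_payload({"uuid": "1234-abcd", "name": "Ada"}, schema))
# {'name': 'Ada'}
```

A server could use the same flag the other way around, rejecting or ignoring client-supplied values for readOnly properties.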

slurmulon commented 7 years ago

@handrews the work you did for #52 looks extremely useful and elegant - speaking for my own example, yes, it's an attempt to use parts of the instance data to resolve URI Templates (and instance data that potentially correlates to another schema, which raises the difficult question of "which entity instance should I use?"). I'm happy to help create examples, justifications, whatever, if you think that might help get things moving.

I would like to see how you are using readOnly, we really hadn't made that consideration in our platform and our API sort of just ignores setting read-only values like UUID, so I'd love to see how we can stop doing that and instead leverage JSON Hyper-Schema

jdesrosiers commented 7 years ago

I was hesitant to comment on this issue because I don't have the time to adequately keep up with the discussion. So, I'm sorry, I won't be able to answer all of the questions or comment on all of the points made (at least not in a reasonable timeframe). For now, the following is how I think about Hyper-Schema in response to @awwright's inquiries. As for specific difficulties with Hyper-Schema, @handrews, I'll try to address that at a later time if I can.

When thinking about REST architectures, I find it useful to make analogies to its reference implementation, the web. However, it is important to remember that the web doesn't define REST, REST describes the web (for the most part). So, the analogy is not always perfect.

A JSON document is analogous to an HTML document. A Hyper-Schema can be linked in much the same way as a stylesheet, javascript, or image to completely describe the resource. This is where Hyper-Schema most deviates from the HTML analogy. It would be like if all the links and forms in your HTML were defined in a separate resource. It's different, but I've tentatively convinced myself that it doesn't violate any REST principles and it definitely has its advantages. To make the analogy more clear, it helps to think of the analogy to HTML as the JSON document + the Hyper-Schema.

The next part of the analogy is hyperlinks like the anchor tag <a>. This is the thing that most hypermedia offerings are designed to describe. @awwright, I think you are viewing Hyper-Schema links (LDOs) from this lens and thus are under the impression that they are doing too much. Although LDOs are capable of describing a hyperlink, they are not an analogy for hyperlinks.

LDOs are an analogy to an HTML <form>. All of the things that you recognize as not belonging as part of a hyperlink, do belong in a form. There is not a hyperlink analogy in Hyper-Schema because the functionality of a form subsumes the functionality of a hyperlink. Therefore, it was unnecessary to describe them separately. This form-like functionality is the killer feature of Hyper-Schema that no one else has. I have to be careful with this part of the analogy because you shouldn't take this to mean that you necessarily should be able to generate a pretty UI for an LDO. This is the machine-to-machine version of a form.

One last analogy is Web 1.0 vs Web 2.0. Web 1.0 describes a web that is mostly static. It is mostly driven by hyperlinks and there are very few opportunities for users to interact or contribute content. Web 2.0 describes a web that is dynamic and makes heavy use of forms to interact with users. Other hypermedia solutions are capable of describing Web 1.0 style APIs where resources are largely read-only. Hyper-Schema allows a Web 2.0 style API that is naturally interactive. By naturally, I mean that there is no out-of-band information necessary to use it. That means a Hyper-Schema driven API needs no documentation to use it in the same way a website needs no documentation. Everything you need to know to use the website is included in each request.

Ok, I lied, I do have one more analogy. targetSchema was mentioned a few times. If I have my way, this keyword would be removed entirely. I know it is defined to be non-authoritative, but I think its presence just enables and confuses people who still have not completely gotten away from the RPC mindset. When you click a link on a webpage, other than the URI, you don't get a guarantee or even a hint at what you are going to get. Whatever resource you end up with tells you what kind of resource it is and what you can do with it. You don't need to know ahead of time. This is one of the oldest concepts baked into the web and it is one of the things that has allowed it to scale the way it has.

One last thing, @handrews, I don't understand your position that Hyper-Schema is not resource-oriented. All interactions are done through interacting with a resource. The resource itself describes what you can do with it. What could possibly be more resource-oriented?

handrews commented 7 years ago

@slurmulon : For the examples, I'll send you info offlist. Anyone who also wants info, just comment or start a thread on the Google Group. If anything interesting comes out of offlist discussions I'll post a summary back here.

I am a bit confused still about "instance data that potentially correlates to another schema". Do you mean the schema for a different resource, or do you mean a parent or child schema of the one defining the link?

To give credit where credit is due, the extended templating proposal was originally from Geraint Luff. We built on that with the "relations between resources" concept I mentioned earlier, but it was mostly already proposed.

slurmulon commented 7 years ago

@handrews thanks for adding me to the Google Group, sounds good.

Regarding "instance data that potentially correlates to another schema", consider this URI template:

/api/v1/user/{uuid}/quote/{uuid}

If we only had /api/v1/user/{uuid}, it's completely trivial which object we want to pluck uuid from - a valid user object that's being provided to whatever method resolves our LDO URI templates.

When there is more than one identifier slug, such as with /api/v1/user/{uuid}/quote/{uuid}, it now becomes ambiguous which schema or entity instance object uuid is supposed to map to (#/user or... what?). The only time you don't have to ask this question is if the user response always includes a denormalized quote, because then you could just do something like this:

/v1/user/{uuid}/quote/{quote.uuid}/

But if you want to keep your schemas and API resources relatively normalized (which I generally do, not sure if most people feel this way), nested identifiers introduce a problem. This is why the syntax I mentioned above, using the encoded JSON Pointers, at least partially answers the question of which schema a URI Template slug should correlate to. It does not, however, answer the question of which entity instance to use. The developer definitely needs to answer that one, but the question becomes much harder when the client has two entities to juggle instead of one.
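
To make the ambiguity concrete, here is a minimal Python sketch. The `expand` helper is a hypothetical stand-in for a URI Template processor that only understands simple `{name}` expressions (it is not a full RFC 6570 implementation), and the identifier values and the `user_uuid` / `quote_uuid` variable names are made up for illustration.

```python
import re

def expand(template, variables):
    """Fill simple {name} expressions from a flat mapping.

    Illustrative subset of URI Templates: no operators, no percent-encoding.
    """
    return re.sub(
        r"\{([A-Za-z_][A-Za-z0-9_.]*)\}",
        lambda match: str(variables[match.group(1)]),
        template,
    )

# Hypothetical instance data for two separate (normalized) resources.
user = {"uuid": "u-123"}
quote = {"uuid": "q-456"}

# With one flat variable namespace, both {uuid} slots receive the same value.
ambiguous = expand("/api/v1/user/{uuid}/quote/{uuid}", {"uuid": user["uuid"]})

# Unique variable names let each slot be mapped to a different instance.
distinct = expand(
    "/api/v1/user/{user_uuid}/quote/{quote_uuid}",
    {"user_uuid": user["uuid"], "quote_uuid": quote["uuid"]},
)
```

Unique names only settle which slot is which. Choosing the `quote` instance that supplies the second value is still up to the client code.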

handrews commented 7 years ago

@jdesrosiers : Thanks for taking the time to put that together. I have several responses :-)

"This is the machine-to-machine version of a form."

Yes! This. Absolutely.

"I don't understand your position that Hyper-Schema is not resource-oriented. All interactions are done through interacting with a resource. The resource itself describes what you can do with it. What could possibly be more resource-oriented?"

I mean that resources are not given clear visibility in a JSON Hyper-Schema based API. If I want to pull out a list of resources for documentation and determine which of those resources are related to each other in some way, I have to:

  1. Go through and find all URI templates, keeping track of the relation name and the schema to which they were attached.
  2. Any URI template that is a "self" link defines a resource described by the schema in which it is listed as "self"
  3. Group duplicate URI templates, which may be tricky to determine as Hyper-Schema requires duplicating URI templates with slightly different variable names under certain circumstances
  4. Any non-"self" URI template that is the duplicate of a "self" URI template indicates that there is a relationship (or a set of relationships, if the LDO is defined on array items) between the resource with the non-"self" link and the resource with the matching "self" link.

Whew. That's a pain in the ass to even think about, much less code. I'm not even 100% sure that step 3 is entirely possible in every situation, and there may be other subtleties that I'm missing.

You definitely cannot eyeball a non-trivial schema and figure that out in your head.
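
For what it's worth, a rough Python sketch of those four steps might look like the following. It is illustrative only: it assumes every `href` is a plain string template, and the `normalize` helper is a deliberately naive guess at step 3 that treats two templates as duplicates when they differ only in variable names. All function names are hypothetical.

```python
import re

def walk_links(schema, path="#"):
    """Step 1: yield (schema_path, rel, href) for every LDO found in the schema."""
    if isinstance(schema, dict):
        links = schema.get("links")
        if isinstance(links, list):
            for ldo in links:
                if isinstance(ldo, dict) and isinstance(ldo.get("href"), str):
                    yield path, ldo.get("rel"), ldo["href"]
        for key, value in schema.items():
            if key != "links":
                yield from walk_links(value, f"{path}/{key}")
    elif isinstance(schema, list):
        for index, value in enumerate(schema):
            yield from walk_links(value, f"{path}/{index}")

def normalize(href):
    """Step 3 (naive): erase variable names so renamed duplicates compare equal."""
    return re.sub(r"\{[^}]*\}", "{}", href)

def find_resources_and_relations(schema):
    links = list(walk_links(schema))
    # Step 2: every "self" link defines a resource described by its schema.
    resources = {normalize(href): path for path, rel, href in links if rel == "self"}
    # Step 4: a non-"self" link whose template duplicates a "self" template
    # relates the declaring schema to that resource.
    relations = []
    for path, rel, href in links:
        target = resources.get(normalize(href))
        if rel != "self" and target is not None:
            relations.append((path, rel, target))
    return resources, relations

schema = {
    "definitions": {
        "network": {
            "links": [
                {"rel": "self", "href": "/networks/items/{id}"},
                {"rel": "parent", "href": "/networks/items/{parent_id}"},
            ]
        }
    }
}
resources, relations = find_resources_and_relations(schema)
```

Even in this toy form, the relation between a network and its parent is only discovered by comparing template strings, which is exactly the fragile part.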

What we did at Riverbed was make one JSON document per API. This document was not a schema (which is one reason why I'm not proposing that Hyper-Schema adopt Riverbed's exact solution). The schemas were collected under a top-level types keyword. There was also a resources keyword, and the resources were listed under that. links were only directly defined on a resource for the "self" link and links sharing "self"'s URI template, or if they pointed outside of the set of resources described by our APIs (handwaving a bit here- omitting subtleties).

All other links were instead defined as relations, which explicitly named other resources from this or another API's resources section. To figure out the actual follow-able links, you went to the resource identified by the relation and looked at what links it defined. The relation definition mapped the source resource's properties into the destination resource's self URI template using vars.

This made the set of resources and the relationships among them explicit.
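
As a contrast with the four-step procedure, here is a small Python sketch of how the same information can be read out when resources and relations are declared explicitly. The document layout follows the description above (`resources`, `relations`, `resource`), but the function names and the trimmed-down sample document are assumptions made for illustration.

```python
def find_relations(node):
    """Yield (rel, target) for every "relations" entry nested under a resource."""
    if isinstance(node, dict):
        declared = node.get("relations")
        if isinstance(declared, dict):
            for rel, relation in declared.items():
                if isinstance(relation, dict) and "resource" in relation:
                    yield rel, relation["resource"]
        for key, value in node.items():
            if key != "relations":
                yield from find_relations(value)
    elif isinstance(node, list):
        for value in node:
            yield from find_relations(value)

def describe(api):
    """Map each resource name to the sorted (rel, target) pairs it declares."""
    return {
        name: sorted(find_relations(resource))
        for name, resource in api["resources"].items()
    }

# Trimmed, hypothetical API document in the layout described above.
api = {
    "resources": {
        "networks": {
            "type": "array",
            "items": {"relations": {"full": {"resource": "#/resources/network"}}},
        },
        "network": {
            "type": "object",
            "relations": {
                "uplinks": {"resource": "#/resources/uplinks"},
                "parent": {"resource": "#/resources/network"},
            },
        },
    }
}
summary = describe(api)
```

No template comparison is needed here: the resource names are the keys, and each relation names its target directly.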

I could see just saying that "self" links define resources, and if you want to document the set of resources, just go through the schemas and find all of the "self" links. But the weird "figure out relations by figuring out if the URI templates resolve to the same URIs" part is where Hyper-Schema really falls down. The teams just totally balked at that.

"targetSchema was mentioned a few times. If I have my way, this keyword would be removed entirely."

I'm almost 100% with you on this. As mentioned above, for connections between resources within the APIs defined by schemas, I prefer explicitly documenting the relationship, and then using the links on the relationship target. That completely removes the need for targetSchema for all of those cases (targetSchema for "self" was always a $ref back to the resource... because "self").

There are situations where you need to connect to something outside of JSON Schema. targetSchema and similar non-authoritative things are useful then. But they should otherwise be discouraged, or better yet made unnecessary.

While I like using the RFC 5988 terms as much as possible, we should not feel obligated to use parts of that that are better handled by other aspects of JSON Hyper-Schema.

slurmulon commented 7 years ago

@awwright

Regarding $ref in LDO, can you file a new issue that includes some examples and use cases?

#75

Please let me know if you'd like any additional details or examples, happy to provide.

handrews commented 7 years ago

@slurmulon Could you provide an example schema that is defining that link with the ambiguous {uuid}? I feel like I can almost see the issue and explain how I would approach it, but I'm not quite sure. To me, the rules for how to find the thing to use to fill in the template are all part of the schema, although a developer can choose to fill them in with external data.

Some variables may require external data, for instance if you want to link directly from a root resource to a specific element in the collection- the developer needs to supply the search term or id directly as there is no instance in which to find it.

In the approach we used:

  1. URI template variable names must be unique
  2. No complex structure was handled anywhere within the template names, it was entirely offloaded to vars and Relative JSON Pointer (this is why I view Relative JSON Pointer as essential).
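
Here is a minimal Python sketch of how those two rules can fit together. It is an assumption-laden illustration, not the implementation we shipped: `resolve_relative` handles only the `<levels>/<json-pointer>` form of Relative JSON Pointer (not the `#` suffix), `fill` handles only simple `{name}` expressions without percent-encoding, and the sample data is invented.

```python
import re

def resolve_relative(root, location, pointer):
    """Resolve a Relative JSON Pointer such as '0/id' or '1/filters/name'.

    `location` is the list of keys/indices leading from `root` to the
    current value.
    """
    match = re.fullmatch(r"(\d+)((?:/.*)?)", pointer)
    if match is None:
        raise ValueError(f"unsupported relative pointer: {pointer!r}")
    levels, tail = int(match.group(1)), match.group(2)
    if levels > len(location):
        raise ValueError("pointer climbs above the document root")
    # Climb `levels` steps up, then descend along the JSON Pointer tokens.
    path = list(location[: len(location) - levels])
    for token in tail.split("/")[1:]:
        path.append(token.replace("~1", "/").replace("~0", "~"))
    value = root
    for token in path:
        value = value[int(token)] if isinstance(value, list) else value[token]
    return value

def fill(template, variables, root, location):
    """Resolve each uniquely named variable, then fill simple {name} slots."""
    values = {
        name: resolve_relative(root, location, pointer)
        for name, pointer in variables.items()
    }
    return re.sub(r"\{(\w+)\}", lambda m: str(values[m.group(1)]), template)

# Hypothetical collection instance; the link is evaluated at elements[0].
collection = {"elements": [{"id": "n1", "parent_id": "n0"}]}
location = ["elements", 0]

self_uri = fill("/networks/items/{id}", {"id": "0/id"}, collection, location)
parent_uri = fill("/networks/items/{id}", {"id": "0/parent_id"}, collection, location)
```

The template itself stays flat and simple. All knowledge of where a value lives in the instance sits in the variable mapping.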

And I guess I'll put an example here instead of offlist as it might be helpful for more people. This is adapted from a published API schema but I stripped out descriptions and irrelevant properties to focus it on the definitions of the resources, links, and relations. I also "fixed" things that were from earlier in the project and didn't match how we later sorted stuff out.

The links keyword is NOT a JSON Hyper-schema LDO. It is an object in which the keys are the LDO's rel value and the relevant properties are path (replaces href using the extended templating syntax, either a single string template or a template + vars object), method, request, and response (all pretty obvious- request is always the request body, while the URI parameters are handled by vars).

The relations keyword is also an object in which the keys are rel values. The properties are resource (a URI reference to the target resource declaration) and, optionally, vars. For links, vars maps properties of the instance described by the schema declaring the link into the link's own path template variables (or declares schemas for validating externally supplied values). For relations, vars instead maps instance properties from the schema declaring the relation (the source side of the relation) into the URI template for the self link of the target resource.

Note that one relation is declared from within the "networks" array items, not on the entire "networks" schema.

I was going to explain more, but probably better to stop here and just let folks ask questions.

The example is in YAML, because everyone refused to write JSON by hand- somewhere there's a translated-to-JSON version but I can't find it right now and refuse to translate it by hand myself. Most JSON-heavy projects I've been involved with have it set up so humans write YAML and the JSON is produced as a build step.

types:
    some_type: {}
    etc_pretend_all_the_types_referenced_are_actually_here: {}
resources:
    networks:
        type: array
        items:
            allOf:
              - $ref: '#/resources/network'
            relations:
                full:
                    resource: '#/resources/network'
        links:
            self:
                path:
                    template: '/networks{?name,parent_id,virtual,is_default}'
                    vars:
                        name: {$ref: '#/types/unrestricted_name'}
                        parent_id: { $ref: '#/types/identifier'}
                        virtual: {type: boolean}
                        is_default: {type: boolean}
            create:
                method: POST
                request: {$ref: '#/resources/network'}
                response: {$ref: '#/resources/network'}
            set:
                method: PUT
                request: {$ref: '#/resources/networks'}
            get:
                method: GET
                response: {$ref: '#/resources/networks'}
    network:
        type: object
        properties:
            id: { $ref: '#/types/identifier' }
            name: { $ref: '#/types/unrestricted_name' }
            parent_id:
                anyOf:
                    - $ref: '#/types/identifier'
                    - type: "null"
                readOnly: true
        required: [name]
        links:
            self: {path: '/networks/items/{id}'}
            get:
                method: GET
                response: {$ref: '#/resources/network'}
            set:
                method: PUT
                request: {$ref: '#/resources/network'}
                response: {$ref: '#/resources/network'}
            delete:
                method: DELETE
        relations:
            uplinks:
                resource: '#/resources/uplinks'
                vars: {network_id: '0/id'}
            parent:
                resource: '#/resources/network'
                vars: {id: '0/parent_id'}
            child:
                resource: '#/resources/networks'
                vars: {parent_id: '0/id'}
    uplinks:
        # Just showing the self link to show how the network resource's "uplinks" relation maps to it.
        links:
            self:
                path:
                    template: "/uplinks{?name,network_id,site_id,parent_id,virtual,is_default,is_ps_capable}"
                    vars:
                        name: {$ref: '#/types/unrestricted_name'}
                        network_id: {$ref: '#/types/identifier'}
                        site_id: {$ref: '#/types/identifier'}
                        parent_id: {$ref: '#/types/identifier'}
                        virtual: {type: boolean}
                        is_default: {type: boolean}
                        is_ps_capable: {type: boolean}

handrews commented 7 years ago

One thing I forgot is that in later API definitions we decided that you should always be able to fulfill the self link from an instance, so for things like collections we included all search terms used in a meta section of the instance. It looked something like this:

resources:
    networks:
        type: object
        properties:
            elements:
                type: array
                items:
                    allOf:
                      - $ref: '#/resources/network'
                    relations:
                        full:
                            resource: '#/resources/network'
            filters:
                type: object
                properties:
                    name: {$ref: '#/types/unrestricted_name'}
                    parent_id: {$ref: '#/types/identifier'}
                    virtual: {type: boolean}
                    is_default: {type: boolean}
        links:
            self:
                path:
                    template: '/networks/{?name,parent_id,virtual,is_default}'
                    vars:
                        name: '0/filters/name'
                        parent_id: '0/filters/parent_id'
                        virtual: '0/filters/virtual'
                        is_default: '0/filters/is_default'

Pagination was handled the same way as filtering- URI template parameters of the query string variety, with any pagination values included in the instance under a pagination property (or something like that).

awwright commented 7 years ago

@handrews Going back to an earlier post, you bring up three points: Implicitly defined resources, Repetition in link definitions, and Can't describe both URI parameters and a message body.

(1) I don't fully understand the point with "Implicitly defined resources". A resource is anything that can be given a URI. So... pretty much anything. A diary entry, an HTML document, a company, a car. A resource isn't "defined" when it is linked to, a link defines a relationship to an existing resource.

A list of resources can be expressed with the "collection" pattern. There's loads of standards that perform this task, like Atom and Collection+JSON.

(2) I'm not sure what the issue is with "Repetition in link definitions". Links contain a subject (anchor) as well as an object (target). If I have a Link: </awwright>; rel=author then I'm asserting this resource has an author; not merely that "there exists an author".

(3) The fact that HTML forms can either append data to the URI query part, or pass data in the request-body, depending on the method, is an unfortunate anachronism. URIs don't actually define what the query-part has to look like, the typical key=value&... format is one defined by HTML. I don't think JSON Schema should be adopting it.

Finally, why does your schema have separate get/set/delete "links"? I'm still not entirely sure what's wrong with an array of links (i.e. LDOs or link templates).

handrews commented 7 years ago

@awwright : I'll answer these out of order :-)

get/set/delete "links": I almost left those out, but I wanted to show where the HTTP methods actually happen. There is nothing inherently wrong with the array of links, and I've said repeatedly that I don't intend to propose adopting Riverbed's approach wholesale. While I do prefer an object where the keys are rels, just because it is so much easier to work with in code, I am not at all attached to this particular links approach, and I can live with either a list or an object.

(3) I don't know what relevance HTML forms have to my point. The problem is that Hyper-Schema mandates that the schema parameter be applied to URL template parameters in the key=value form, which is just wrong. In part, because the query string need not be key=value (I'm always vastly amused that a near-universal usage is not defined by any standard anywhere, it just exists by existing). The other reason that it's a problem is that it implies that since GET query parameters are handled by schema, and schema is used for other things for other verbs, that there is no way to specify query parameters for other verbs. This is wrong.

The correct approach is to observe that URI Templates cover every possibility, including both traditional name=value style (either starting from ? or adding more with &) or any other style. Including excellent rules for mapping objects and lists into query strings in several different flexible ways. You can also just directly map a value into the query string and not use name=value at all.

So all that needs to be done is to provide a mechanism for mapping information from anywhere in the instance into the flat URI template variable namespace. vars does this elegantly and flexibly, and even handles the situation where the value cannot come from the instance.
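
As a small illustration of what a URI Template processor does with that flat namespace, here is a Python sketch of form-style query expansion (`{?a,b}` in RFC 6570). It covers only that one operator with simple values. The `expand_query` name is hypothetical, and rendering booleans as `true` / `false` is my assumption, since URI Templates only define expansion for strings, lists, and associative arrays.

```python
import re
from urllib.parse import quote

def expand_query(template, variables):
    """Expand only form-style query expressions such as {?name,virtual}."""
    def replace(match):
        pairs = []
        for name in match.group(1).split(","):
            value = variables.get(name)
            if value is None:
                # Undefined variables are simply left out of the query string.
                continue
            if isinstance(value, bool):
                # Assumption: serialize booleans the way JSON spells them.
                value = "true" if value else "false"
            pairs.append(f"{name}={quote(str(value), safe='')}")
        return "?" + "&".join(pairs) if pairs else ""
    return re.sub(r"\{\?([^}]*)\}", replace, template)

template = "/networks{?name,parent_id,virtual,is_default}"
filtered = expand_query(template, {"name": "lab net", "virtual": True})
unfiltered = expand_query(template, {})
```

Nothing in the expansion step cares which HTTP method will be used afterwards, so the same mechanism works for query parameters on any verb.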

(1) This is way, way, way, waaaaaaaay far off from what I was talking about. I wasn't talking about a list of resources like a collection. For now let me just try to phrase it this way: If I point you at one or more JSON Hyper-Schema documents defining a large API with, say, 100 resources in it, how do you, as a human, spot those 100 resources? How do you, as a human, determine which of those resources relate to each other? Take a look at my 4-step algorithm in response to @jdesrosiers for how I think about this.

(2) This isn't even in the same universe as the point I was trying to make :-) I'm going to have to go off and think about how to present this, as I obviously totally failed to articulate anything useful at all here.

handrews commented 7 years ago

@awwright one of the reasons for the separate get/set/delete links came from the Hyper-Schema document giving an example of a link rel of "edit" which would use a method of PUT (the use of "set" rather than "edit" in our actual documents is unimportant).

awwright commented 7 years ago

@handrews wrt HTML forms, I bring it up because it appears the functionality was copied from HTML, even though it's a poor design and not really suitable for hypermedia in general.

Forms in HTML serve the purpose of both URI generators and remote execution. If you want to tell a person how to jump to page n in a paginated list, you use an HTML form with method="GET". On the other hand, if you want to tell a person how to create a new resource in a collection, you use an HTML form with method="POST". They're two totally different things, and they probably should have been entirely different HTML tags.

I think JSON Schema is sort of making the same mistake. There's no need to specify "method".

It seems you're being misled by "edit", which is a link relation defined for use in Atom. If you want to change a document, you PATCH or PUT to the URI of the document you want to edit. There's no special link relation involved to edit things (or to get things, or to patch things). (There can be an "edit-form" link relation, which shows an HTML form to edit the current resource.)

handrews commented 7 years ago

@awwright I do think there's a place for "method" as not all REST APIs use HTTP, not all link relations indicate a method (HTTP or otherwise) and many things can be done multiple ways.

I need hyper-schema to indicate whether updating this resource is done with PUT or PATCH, and if it is PATCH then what media type should be used for the request? application/patch+json, application/merge-patch+json, or something else? I need to be told whether the response from an update is expected to include an updated representation, no body, or something else. Can I create with PUT? Or do I need to create indirectly through a collection or other resource with POST? I can probably go on.
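
To show why the media type matters, here is a short Python sketch in which the same one-field update needs two differently shaped request bodies. The `merge_patch` function follows the algorithm given in the JSON Merge Patch RFC, the JSON Patch body is shown only as data, and the sample resource is invented.

```python
def merge_patch(target, patch):
    """Apply a JSON Merge Patch document (application/merge-patch+json)."""
    if not isinstance(patch, dict):
        # A non-object patch replaces the target wholesale.
        return patch
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            # null means "remove this member".
            result.pop(key, None)
        else:
            result[key] = merge_patch(result.get(key), value)
    return result

# Hypothetical resource representation.
network = {"id": "n1", "name": "old", "parent_id": "n0"}

# Body for application/merge-patch+json: a partial object.
merge_body = {"name": "lab"}

# Body for application/json-patch+json: a list of operations.
json_patch_body = [{"op": "replace", "path": "/name", "value": "lab"}]

updated = merge_patch(network, merge_body)
```

A client that is not told which of these the server expects has to guess or probe.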

As for the exact put/set/delete stuff in the example, I honestly don't remember how we came up with it, or who exactly came up with it, and in any event I don't care about that part of the approach in the slightest. No one is advocating for using that format for links, so don't worry about it. It was just there to show how our system worked, not to advocate for replacing Hyper-Schema with that system (which, as proud as I am of that work, I would not advise anyway).

awwright commented 7 years ago

Updating a resource is something that can be done with either PUT or PATCH depending on what the client thinks is most suitable. The server and the link relation can advertise its support or nonsupport for a method, but it's the server that should allow multiple ways to do things, it need not lay them all out in the link or schema.

There's numerous standards like Allow, Accept, Accept-Patch, and so on, to define the capabilities of the server.

Can you post some examples of what you think works better, please? I do like the object of links like how HAL does, I forgot to add that to the feature list.

handrews commented 7 years ago

I'm not concerned about advertising which methods are supported, I'm concerned with two things:

  1. JSON Hyper-schema should provide an abstraction layer between API users and HTTP

A client should not be deciding what HTTP method is suitable. A client shouldn't be messing with HTTP methods at all- it should be reading link descriptions and doing what they say. The use of HTTP is determined by the scheme of the href value, and should not just be assumed anyway.

The possibility of non-HTTP links shows up in several places already- wording such as "In an HTTP environment, this might be..." in the section on method, and the entirety of section 5.4 "JSON Schema and other protocols" in the core spec: "JSON Schema does not define any semantics for the client-server interface for any other protocols than HTTP. These semantics are application dependent, or subject to agreement between the parties involved in the use of JSON Schema for their own needs."

Even when using HTTP, I really don't want to offload any discovery to OPTIONS because of this line from RFC 7231: "Responses to the OPTIONS method are not cacheable." JSON Schemas are indefinitely cacheable and (as we have discussed elsewhere), often pre-loaded in a client.

So if I'm building an API with JSON documents that does not run over HTTP, I may or may not have alternative standards to figure out what method(s) to use, assuming "method" is a translatable concept. I have definitely put in email link URIs to give a non-HTTP example (although method is not relevant to those).

  2. Defining the resource's capabilities doesn't tell me how to use them

Yes, HTTP (assuming we're using HTTP) attaches semantics to some methods. But some of those semantics leave a lot of wiggle room. For something like PATCH, how should I use Accept-Patch? By making a non-cacheable OPTIONS call before every patch, or by ignoring the HTTP RFC and just doing it once?

And more importantly, how do I line up non-standard links that need to use POST with POST? If I don't recognize the relation name, do I do an OPTIONS on the URI and see if it allows POST and if so, just try that?

How do I figure out what to put in the body? The schema field? What about if I want to advertise validation specifics for PUT or PATCH bodies or responses? Is this where I use multiple links with the same URI? But then how do I figure out which is which?


OPTIONS is fundamentally broken which is why it's so poorly supported and rarely used (at least that I've encountered so far). And there are many things that OPTIONS cannot do, which is part of why we need Hyper-schema in the first place.

It's not burdensome for JSON Hyper-Schema to handle this stuff. The schema is the server's way of advertising what it supports, beyond the basics of HTTP. REST APIs are applications built on top of protocols and content types, whether that's HTTP and JSON or something else. Hyper-Schema should facilitate defining those applications, and not just assume that plain old HTTP handles everything that is needed.


I agree with you that Hyper-schema as it stands fundamentally muddles up the separate concepts of connecting two resources with a link and defining the operations available on the resource and how to execute them. But I believe that both are equally important for JSON Schema to specify.

handrews commented 7 years ago

@awwright OK here is my attempt to explain the duplication thing. This is also an indictment of how Hyper-schema manages HTTP methods. I very much think they should be removed from the specification of relationships between resources. But I also think it needs to be possible to specify them.

So if we instead changed method to methods and made it an object mapping methods to request and response schemas (and maybe some other stuff), that would be the minimal change needed to solve the duplication problem. It would not solve the explicit identification of all resources problem, but I'm going to have to spend more time coming up with an example for that.
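
A quick Python sketch of that idea: it folds LDOs that share a rel and href into a single entry with a `methods` object. The `request` / `response` key names inside `methods` and the `collapse_methods` function are assumptions for illustration, not proposed spec wording, and treating a missing `method` as GET is also an assumption.

```python
import json

def collapse_methods(links):
    """Group LDOs by (rel, href) and gather their methods into one object."""
    grouped = {}
    for ldo in links:
        # href may be a string or a template+vars object, so serialize it
        # to get a hashable grouping key.
        key = (ldo["rel"], json.dumps(ldo["href"], sort_keys=True))
        entry = grouped.setdefault(
            key, {"rel": ldo["rel"], "href": ldo["href"], "methods": {}}
        )
        entry["methods"][ldo.get("method", "GET")] = {
            "request": ldo.get("schema"),
            "response": ldo.get("targetSchema"),
        }
    return list(grouped.values())

network_ref = {"$ref": "#/definitions/network"}
parent_href = "/networks/items/{parent_id}"
links = [
    {"rel": "parent", "href": parent_href, "method": "GET",
     "targetSchema": network_ref},
    {"rel": "parent", "href": parent_href, "method": "PUT",
     "schema": network_ref, "targetSchema": network_ref},
    {"rel": "parent", "href": parent_href, "method": "DELETE",
     "schema": False, "targetSchema": False},
]
collapsed = collapse_methods(links)
```

Three near-identical link entries become one, and the relationship ("parent") is stated once, separately from the operations available on its target.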

Of course I could opt not to duplicate all of the target's methods for every relation, but that's also weird, and sometimes forces performing an unnecessary GET.

Anyway, here's the above example reworked into Hyper-schema (with extended templating).

Note: Please pretend the non-standard rels are proper URIs. It just made it really unreadable to make up full URIs for them.

definitions:
    networks:
        type: object
        properties:
            elements:
                type: array
                items:
                    $ref: '#/definitions/network'
            filters:
                type: object
                properties:
                    name: {$ref: '#/types/unrestricted_name'}
                    parent_id: {$ref: '#/types/identifier'}
                    virtual: {type: boolean}
                    is_default: {type: boolean}
        links:
          - rel: self
            href:
                template: '/networks/{?name,parent_id,virtual,is_default}'
                vars:
                    name: '0/filters/name'
                    parent_id: '0/filters/parent_id'
                    virtual: '0/filters/virtual'
                    is_default: '0/filters/is_default'
            method: GET
            targetSchema: {$ref: '#/definitions/networks'}
          - rel: self
            method: PUT
            schema: {$ref: '#/definitions/networks'}
            targetSchema: false
    network:
        type: object
        properties:
            id: { $ref: '#/types/identifier' }
            name: { $ref: '#/types/unrestricted_name' }
            parent_id:
                anyOf:
                    - $ref: '#/types/identifier'
                    - type: "null"
                readOnly: true
        required: [name]
        links:
          - rel: self
            href: '/networks/items/{id}'
            method: GET
            targetSchema: {$ref: '#/definitions/network'}
          - rel: self
            method: PUT
            schema: {$ref: '#/definitions/network'}
            targetSchema: {$ref: '#/definitions/network'}
          - rel: self
            method: DELETE
            schema: false
            targetSchema: false

          - rel: create
            href: '/networks'
            method: POST
            schema: {$ref: '#/definitions/network'}
            targetSchema: {$ref: '#/definitions/network'}

          - rel: instances
            href:
                template: '/networks/{?name,parent_id,virtual,is_default}'
                vars:
                    name: {$ref: '#/types/unrestricted_name'}
                    parent_id: { $ref: '#/types/identifier'}
                    virtual: {type: boolean}
                    is_default: {type: boolean}
            method: GET
            targetSchema: {$ref: '#/definitions/networks'}
          - rel: instances
            href:
                template: '/networks/{?name,parent_id,virtual,is_default}'
                vars:
                    name: {$ref: '#/types/unrestricted_name'}
                    parent_id: { $ref: '#/types/identifier'}
                    virtual: {type: boolean}
                    is_default: {type: boolean}
            method: PUT
            schema: {$ref: '#/definitions/networks'}
            targetSchema: false

          - rel: parent
            href: '/networks/items/{parent_id}'
            method: GET
            targetSchema: {$ref: '#/definitions/network'}
          - rel: parent
            href: '/networks/items/{parent_id}'
            method: PUT
            schema: {$ref: '#/definitions/network'}
            targetSchema: {$ref: '#/definitions/network'}
          - rel: parent
            href: '/networks/items/{parent_id}'
            method: DELETE
            schema: false
            targetSchema: false

          - rel: children
            href: 
                template: '/networks/{?name,parent_id,virtual,is_default}'
                vars:
                    name: {$ref: '#/types/unrestricted_name'}
                    parent_id: '0/parent_id'
                    virtual: {type: boolean}
                    is_default: {type: boolean}
            method: GET
            targetSchema: {$ref: '#/definitions/networks'}
          - rel: children
            href: 
                template: '/networks/{?name,parent_id,virtual,is_default}'
                vars:
                    name: {$ref: '#/types/unrestricted_name'}
                    parent_id: '0/parent_id'
                    virtual: {type: boolean}
                    is_default: {type: boolean}
            method: PUT
            schema: {$ref: '#/definitions/networks'}
            targetSchema: false
          - rel: children
            href: 
                template: '/networks/{?name,parent_id,virtual,is_default}'
                vars:
                    name: {$ref: '#/types/unrestricted_name'}
                    parent_id: '0/parent_id'
                    virtual: {type: boolean}
                    is_default: {type: boolean}
            method: DELETE
            schema: false
            targetSchema: false

          - rel: uplinks
            href:
                template: "/uplinks{?name,network_id,site_id,parent_id,virtual,is_default,is_ps_capable}"
                vars:
                    name: {$ref: '#/types/unrestricted_name'}
                    network_id: {$ref: '#/types/identifier'}
                    site_id: {$ref: '#/types/identifier'}
                    parent_id: {$ref: '#/types/identifier'}
                    virtual: {type: boolean}
                    is_default: {type: boolean}
                    is_ps_capable: {type: boolean}
            method: GET
            targetSchema: {$ref: '#/definitions/uplinks'}
          - rel: uplinks
            href:
                template: "/uplinks{?name,network_id,site_id,parent_id,virtual,is_default,is_ps_capable}"
                vars:
                    name: {$ref: '#/types/unrestricted_name'}
                    network_id: {$ref: '#/types/identifier'}
                    site_id: {$ref: '#/types/identifier'}
                    parent_id: {$ref: '#/types/identifier'}
                    virtual: {type: boolean}
                    is_default: {type: boolean}
                    is_ps_capable: {type: boolean}
            method: PUT
            schema: {$ref: '#/definitions/uplinks'}

            child:
                resource: '#/definitions/networks'
                vars: {parent_id: '0/id'}
    uplinks:
        # Just showing the GET self link to show how the
        # network resource's "uplinks" relation maps to it.
        links:
          - rel: self
            href:
                template: "/uplinks{?name,network_id,site_id,parent_id,virtual,is_default,is_ps_capable}"
                vars:
                    name: '0/filters/name'
                    network_id: '0/filters/network_id'
                    site_id: '0/filters/site_id'
                    parent_id: '0/filters/parent_id'
                    virtual: '0/filters/virtual'
                    is_default: '0/filters/is_default'
                    is_ps_capable: '0/filters/is_ps_capable'
awwright commented 7 years ago

@handrews The problems you're describing aren't unique to JSON Schema, and for the most part, this all has known solutions.

The link itself is able to describe the features a server supports (including methods, media types, locales, and so on). But these attributes are advisory. In general, servers should be able to support a wide range of operations on resources, and clients opportunistically perform the operation they want to perform. If you want to edit a resource, you use PUT. If you want to make a small change, maybe PATCH is a better option for a client. HTTP does not need abstraction; HTTP is the abstraction by which you manipulate remote resources, and JSON Hyper-schema provides the means by which you discover new resources.

HTTP servers can in fact accept queries for and manipulate any resource with a URI. It's entirely legal, and an explicit feature, to send requests like:

GET urn:uuid:58db4c55-7611-41c5-9fff-aa3e58f57358 HTTP/1.1
GET ni:///sha-256;UyaQV-Ev4rdLoHyJJWCi11OHfrYv9E1aGQAlMO2X_-Q HTTP/1.1
PUT ftp://example.net/dir/file HTTP/1.1

Responses to these requests may be non-authoritative, but behavior for non-authoritative requests in HTTP is well defined.


When implementing the uniform interface, I don't think JSON Hyper-schema is doing a very good job of differentiating links from URI templates from remote execution.

This remote execution ability is closer to the sort of concept you seem to be talking about: The server instructs the client exactly how to format a request-body (with enctype=), and which resource to upload it to (action=).

This pattern is not one followed in general. You don't need to tell a client how to GET a resource or how to PATCH a resource. It's something both client and server already understand because of the uniform interface.

There are link relationships that, when made, imply certain things about the link target. The "hub" link relation effectively means "Target is a resource that manages notification subscriptions for the current document. When POSTed to, it registers a notification endpoint..." This concept isn't terribly different than saying "if { <A> author <B> . }, then A must be a document and B must be an author".

Collections, likewise, are typically defined so that POSTing to them creates a new instance in the collection. No explicit link relation is needed, other than a link relation (or media type) that asserts the resource is a collection.

Take a look at http://amundsen.com/hypermedia/hfactor/ which documents the different kinds of links used in hypermedia.


The example I'm most interested in is what you think is the best way to define a Hyper-schema.

Pare down the example you gave to just the link relationships. You have three "self" and "parent" links that point to the same resource; there should only be one. Remove the "create" link relationship since it makes no sense to have an assertion { <A> create <B> . }. Replace "children" and "instances" with "item" if that's what you intend. Maybe organize the links into a key-value map like HAL does.
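
A rough Python sketch of that last suggestion, assuming the links are plain dicts with "rel" and "href" keys. The function name `links_by_rel` and the sample data are illustrative, and the result is only HAL-like in shape (one entry per relation), not an implementation of HAL.

```python
from collections import defaultdict


def links_by_rel(links):
    """Group link objects by relation type, dropping duplicate targets."""
    grouped = defaultdict(list)
    for link in links:
        # "method" is ignored here (an assumption of this sketch): links that
        # differ only by method point at the same target resource.
        target = {k: v for k, v in link.items() if k not in ("rel", "method")}
        if target not in grouped[link["rel"]]:
            grouped[link["rel"]].append(target)
    return dict(grouped)


# Hypothetical input: three "self" links that differ only by method.
links = [
    {"rel": "self", "href": "/networks/items/{id}", "method": "GET"},
    {"rel": "self", "href": "/networks/items/{id}", "method": "PUT"},
    {"rel": "self", "href": "/networks/items/{id}", "method": "DELETE"},
    {"rel": "item", "href": "/networks/{?parent_id}", "method": "GET"},
]

print(links_by_rel(links))
```

With this input the three "self" entries collapse into one, leaving a map with a single target under "self" and a single target under "item".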

handrews commented 7 years ago

@awwright : The multiple links with the same rel are not something I think is good; they're my attempt to illustrate the problems that the current approach to method and href forces on us.

Independent of whether you like my specific rel value choices, do you agree that JSON Hyper-schema, as it stands today in Draft 04, encourages this duplication? It's very important that we get clear on this, as if we do not agree on the starting point it is impossible to meaningfully discuss change. I feel like you are focusing so much on the solution of removing method that we're not getting clear on the current system's structural problems first.

@jdesrosiers @slurmulon how does all of this line up with your usage of Hyper-schema? Do you see the forced repetition problem that I believe is present? Or did you solve that another way, or have you just not encountered a situation that would highlight it? I really want to understand if I am the only person who sees this concern with the system as it is now in Draft 04.

Let's get clear on Draft 04's problems before discussing solutions or critiquing my specific choices of rel. "instances", "create", "parent", and "children" all come directly from the hyper-schema specification, either in its examples or as defined by the spec (the example uses "up" instead of "parent" but it's close enough). I chose them in an attempt to avoid bikeshedding over the API design so we could focus on the structural problems of hyper-schema. I've already commented elsewhere that the spec ignoring the IANA "collection" and "item" relations is problematic, so getting lectured on using "item" instead of the examples that are in the spec is frustrating.

I'll address other stuff in separate comments. I really, really, really want to get agreement on the current problems with hyper-schema Draft 04 (and not API design problems with the example, especially not when I lifted the "bad designs" directly from the spec!) before proceeding.

handrews commented 7 years ago

@awwright : FWIW, you and I are in total agreement on this:

When implementing the uniform interface, I don't think JSON Hyper-schema is doing a very good job of differentiating links from URI templates from remote execution.

I also have thoughts on other parts of your last response. But I don't want to proceed with this discussion until I'm sure we're on the same page about what JSON Hyper-schema Draft 04 requires, and what problems it causes, without mixing in any discussion of solutions or the world outside of JSON Hyper-schema Draft 04. Then, once that's clear, we can proceed with discussing solutions.

jdesrosiers commented 7 years ago

@handrews, I don't think we are using the same definition of a "resource". To help illustrate, imagine an API for a bookstore. Among other things it has a definition of what the structure of a Book should be. The server maintains a list of Book instances. When I talk about resources, I'm referring to the instances of Books. I think you are referring to the structure of a Book. In OO terms, I'm referring to objects and you're referring to classes. Your concept of resource-oriented seems more like class-oriented (for lack of a better term). I think Hyper-Schema is so strongly resource-oriented that the classes can sometimes become a bit obscured. I'm ok with that. I think resource-oriented is the way to go.


@awwright, I see what you mean about LDO pulling triple duty, and I can see how splitting it up can have benefits. But, there is so much overlap between the three concepts that splitting them up could lead to unnecessary duplication as well. I'm not really sure.

Regarding the method keyword, are you saying that you don't think it is ever needed? Or, just shouldn't be allowed for links? In the third use of an LDO (forms), I don't see how it would make sense to not be able to specify a method.


Do you see the forced repetition problem that I believe is present?

@handrews, I don't really understand what you are getting at. But, no, I don't recall ever feeling like I was forced to repeat myself when creating Hyper-Schemas. The closest thing would be trying to describe a field that can only be written when the resource is created. The readOnly keyword is close, but a little too restrictive in this case. One solution has been to define a schema for creating a resource separate from the schema used for updating the resource even though they are mostly the same. This is the only case where I felt forced to repeat myself, but I'm pretty sure this is a very different issue than you are talking about.
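
One way to limit that particular repetition is to derive both schemas from a shared base. The Python sketch below is only an assumption about how that might look, not something Hyper-Schema provides: `BASE_BOOK`, `update_schema`, and the "isbn" field are hypothetical, and `readOnly` is used purely as the annotation keyword mentioned above.

```python
import copy

# Shared description of a book; "isbn" may be set on creation but not changed later.
BASE_BOOK = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "isbn": {"type": "string"},
    },
    "required": ["title", "isbn"],
}


def update_schema(base, create_only):
    """Return a copy of `base` in which the create-only fields are marked readOnly."""
    schema = copy.deepcopy(base)
    for name in create_only:
        schema["properties"][name]["readOnly"] = True
    return schema


create_schema = BASE_BOOK                         # every field writable on creation
edit_schema = update_schema(BASE_BOOK, ["isbn"])  # isbn locked once the book exists
```

The two schemas still exist as separate documents, but only the difference between them is written by hand.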

handrews commented 7 years ago

@jdesrosiers : You have some good points with your resource and class analogy, but that's not actually how I'm looking at it. In fact, settling the terminology of "representation" (which is what schemas describe) vs "resource" (which is an abstraction generally* identified by a URI) was key for the success of the project at Riverbed. Conflating the abstraction of a resource with the concrete representation and calling them all resources created a muddled mess of confusion.

So your instances, to me, could either be resource instances (if we're discussing those books in the list abstractly) or instance representations (if we're talking about the JSON documents that get sent back and forth when I GET the list or elements of it).

To me, the schemas are classes, the representations are object instances (in the OO sense), and the resources are abstractions, the details of which are known only to the server.

So, schema, instance representation, resource instance, and resource type/class are the four key concepts here. They're not really interchangeable, although you can usually just say "resource" and it's usually clear enough from context whether it's a specific abstract book or the general abstract notion of books. On the other hand, we found that blurring the line between resource and representation was extremely confusing. The "manipulation of resources through representations" interface constraint of REST is a big clue that we should keep these concepts separate.

All resource instances of the same resource class should use the same schema for their representation (pluralize all of that if the resource class supports multiple representations). However, representations using the same schema may represent instances of different resource classes.

For instance, I might consider "publisher" and "distributor" to be two separate links from the "publication" resource class to the "company" resource class. In this view, they're both companies and the only thing that distinguishes them is the relationship to the publication.

However, I might consider publishers and distributors to be different concepts, and even though they are both represented in the same way initially as companies and therefore use the same schema for their instance representations, I might want to define them as separate resource classes to facilitate evolving them separately later on. Possibly even factoring out a company resource class later that publisher and distributor would both use.

Does that help or is it just more confusing? I'm glad you brought this up as I agree that it is critical for mutual understanding.

JSON Hyper-Schema is, unsurprisingly, schema-oriented :-) This makes it fairly representation-oriented as well, as the schemas are generally describing representations (either their structure or their hypermedia controls).

JSON Hyper-Schema is not, by this definition, resource-oriented. It is extremely hard to pick out what abstract resources are involved in a system at a glance. A schema with a "self" link is a representation of a class of resources. If "self" uses a plain URI, there is one instance of the resource, which has a representation matching the schema. If "self" uses a URI Template, there are many instances of the resource, each of which has a representation matching the schema.

But resource classes (and instances) are also identified by URIs or URI Templates in links other than "self" links. If the resolved URI matches another schema's self link, then those resources are the same and the representation is described by the schema containing the self link. But if the resolved URI only appears in a non-self link, then that is an additional resource, and the only hint we have for its representation is the non-authoritative targetSchema. If targetSchema is present at all.

Furthermore, figuring out whether a resolved URI Template in a non-self link matches some other self link's URI Template is non-trivial, which obscures the connections among resources. This is true whether we're talking about resource classes or resource instances.
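
To illustrate why the matching is non-trivial, here is a Python sketch that handles only the easy case: templates made of literal text and simple `{var}` expressions. It is not an RFC 6570 implementation. `template_to_regex` and `matches_self` are hypothetical helpers, and query expressions such as `{?name,parent_id}` are deliberately rejected, because matching them would need to allow for optional parameters in any order.

```python
import re


def template_to_regex(template):
    """Compile a template containing only simple {var} expressions into a regex."""
    pattern = ""
    # re.split with a capture group alternates literal text and expression bodies.
    for i, part in enumerate(re.split(r"\{([^}]*)\}", template)):
        if i % 2 == 0:
            pattern += re.escape(part)
        elif re.fullmatch(r"[A-Za-z_]\w*", part):
            # Simple expansion percent-encodes "/", "?" and "#", so exclude them.
            pattern += f"(?P<{part}>[^/?#]+)"
        else:
            raise ValueError(f"unsupported expression: {{{part}}}")
    return re.compile(pattern)


def matches_self(resolved_uri, self_template):
    """Return the variable bindings if `resolved_uri` fits `self_template`, else None."""
    match = template_to_regex(self_template).fullmatch(resolved_uri)
    return match.groupdict() if match else None


print(matches_self("/networks/items/42", "/networks/items/{id}"))
print(matches_self("/uplinks/7", "/networks/items/{id}"))
```

Even this restricted version has to make choices (which characters a variable may contain, what to do with repeated variable names), and the templates in the examples above use query expansion, which this sketch cannot match at all.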

If you still feel that hyper-schema is resource oriented, I would like to understand how you figure out the set of resource classes involved in a system, and the relationships among them, either at a glance as a human, or through a simple programmatic algorithm.

* There are also transient anonymous resources like error responses and many POST responses which have no self link or Content-Location header and therefore no identifying URI, but that's not important to the main point of this comment.

handrews commented 7 years ago

@jdesrosiers to connect this back to the example of the other format:

As @awwright noted earlier, we could deal with a lot of the when-to-use-the-representation point by making assumptions based on HTTP. And in fact there was talk of adding a keyword like "operations": [get, set, delete] etc. that would indicate that those operations were supported in whatever way was standardized for the protocol in the URL scheme.
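
As a sketch of how such an "operations" keyword might be interpreted, the Python below maps abstract operations to protocol methods by URL scheme. The table contents (including mapping "set" to PUT), the example URL, and the function name are assumptions for illustration; the keyword was only discussed, never specified.

```python
from urllib.parse import urlsplit

# Hypothetical per-scheme mapping from abstract operations to concrete methods.
PROTOCOL_METHODS = {
    "http": {"get": "GET", "set": "PUT", "delete": "DELETE"},
    "https": {"get": "GET", "set": "PUT", "delete": "DELETE"},
}


def resolve_operations(url, operations):
    """Translate abstract operations into the methods used for the URL's scheme."""
    scheme = urlsplit(url).scheme
    try:
        table = PROTOCOL_METHODS[scheme]
    except KeyError:
        raise ValueError(f"no operation mapping for scheme {scheme!r}") from None
    return [table[op] for op in operations]


print(resolve_operations("https://example.com/networks/items/1", ["get", "set", "delete"]))
```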

One reason we did specify them separately at first was that we also needed to describe APIs which abuse or under-utilize HTTP in various ways. Which, let's be honest, are probably more common than APIs which fully and correctly use HTTP, media types, etc. etc. I think it is important for hyper-schema to address all the things which are actually done in the wild, and not just the "correct" ideal. Although we should make the correct thing to do the easiest thing to do.

awwright commented 7 years ago

do you agree that JSON Hyper-schema, as it stands today in Draft 04, encourages this duplication?

I don't believe it does. "method" is an optional property in JSON Hyper-schema, and not present at all in most hypermedia technologies that have even fewer relation properties than JSON Hyper-schema provides.

Siren is the only other technology I'm aware of that lets you specify an arbitrary method, and only because it explicitly tries to copy much of HTML. I haven't seen an instance in Siren where people are listing otherwise-identical links for every method that the target resource supports.

Links are used for learning about new resources, so regardless of what JSON Hyper-schema defines, defining links multiple times doesn't achieve any additional effect for hypermedia purposes.

instead of the examples that are in the spec is frustrating.

Well, I'm asking for an example of what JSON Hyper-schema would look like ideally, and I'd like to see what you think that looks like.

The ultimate goal here is to assess which features are used most, and which features can be changed easily. Or if we need to make this assessment at all.

The examples you've provided suggest there's a lot of custom hacks and vocabulary, and not much attention would be paid to potential changes. Is that about right?

handrews commented 7 years ago

@awwright said:

The examples you've provided suggest there's a lot of custom hacks and vocabulary, and not much attention would be paid to potential changes. Is that about right?

No! Ugh. I regret putting those examples up and wish I could just rip this whole sub-thread out of this issue. The example with the resources section is from a company that I haven't worked for in nearly a year and a half, and I wasn't doing much with that project for the last few months I was there. And I have no current influence over it (I don't even know if they're continuing to expand it).

The reason I'm not putting an ideal version up is that you and I are not having the same conversation here. This is not an accusation, just an observation. I'm not entirely sure what to do about it.

I don't want to talk about solutions yet because there are clearly disagreements on the problems. So there's no point in proposing solutions. No one else seems to see the same problems, or in many cases have any idea what I'm talking about. Again, an observation and not an accusation- I am just apparently failing to convey what I need to convey.

I will think on it and try again later. Right now I'm a bit too dispirited.

I am hoping that you and/or @jdesrosiers will comment on my post about resource classes vs resource instances vs instance representations vs schemas. If we can come to an agreement on a theoretical basis (whether it is those four concepts or something a bit different), that would be a common point from which to start. Right now, we're talking past each other.

Anthropic commented 7 years ago

I think you guys need to have a Hangout chat, face to face for ten minutes to help adjust to each other's communication styles and expectations while ascertaining what the actual goals of the conversation are and next steps to solve them.

@awwright I still think a planning wiki, or even better, planning Google Docs, are a good idea. Context helps and annotated references can segment discussion and avoid ginormous comments like this thread is spawning.

Much of this is becoming either TL;DR or not clear enough to contextualise. I've found it's a common occurrence when you get a bunch of nerds like us in a room with no BA & PM to maintain focus :laughing:

slurmulon commented 7 years ago

@awwright @handrews @jdesrosiers wow, lots of moving parts here. I will definitely get back with some examples and comments, but I think @Anthropic is right - perhaps a different forum for conversation would help streamline things here.

Is there a Gitter set up for JSON Hyper-Schema? I wasn't able to find anything from a quick search, but I think that would help as it's real-time chat with persistent messages, like Slack.

handrews commented 7 years ago

@awwright I was considering splitting some threads out to separate issues, or starting threads on the google group (which doesn't seem to get much use). What would you prefer? Also happy to chat through some other system.

@Anthropic From digging through the old wiki, I'm not a big fan of using GitHub's wiki for planning- AFAIK there isn't really a good annotation and commenting system for it. I would be more in favor of Google Docs if we move to anything other than GitHub issues.

Anthropic commented 7 years ago

@handrews I definitely prefer Google Docs, being able to annotate directly on a sample schema is very helpful.

jdesrosiers commented 7 years ago

OK, @handrews, it sounds like we are using the same vocabulary after all. But, there does seem to be a difference between how we think Hyper-Schema fits into the REST architecture.

No one else seems to see the same problems, or in many cases have any idea what I'm talking about.

Yep. We're talking in circles. I think each of us has a different way of using Hyper-Schema and it's difficult for us to see other people's issues from our perspectives. Here is a great example.

If you still feel that hyper-schema is resource oriented, I would like to understand how you figure out the set of resource classes involved in a system

I wouldn't. The authoritative reference on what relations a resource has is the resource itself. A REST system has a reactive and dynamic quality to it. Documenting the system as a whole doesn't make sense to me. You learn about the system by browsing it. Each resource you get has its own documentation.
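
A small Python sketch of that browsing model, using an in-memory dict in place of a server. The `RESOURCES` data, the "links" layout, and `browse` are all made up for illustration; a real client would issue HTTP requests and read the links from each response or its schema.

```python
from collections import deque

# Stand-in for a server: each URI maps to a representation carrying its own links.
RESOURCES = {
    "/": {"links": [{"rel": "collection", "href": "/books"}]},
    "/books": {"links": [{"rel": "item", "href": "/books/1"}, {"rel": "up", "href": "/"}]},
    "/books/1": {"links": [{"rel": "collection", "href": "/books"}]},
}


def browse(entry_point, fetch=RESOURCES.get):
    """Follow links breadth-first from an entry point; return URIs in discovery order."""
    seen, queue = [entry_point], deque([entry_point])
    while queue:
        representation = fetch(queue.popleft()) or {}
        for link in representation.get("links", []):
            if link["href"] not in seen:
                seen.append(link["href"])
                queue.append(link["href"])
    return seen


print(browse("/"))
```

The client starts with one URI and no map of the system; everything else it knows comes from the representations it has fetched so far.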

I still think Hyper-Schema is resource-oriented. Using representations to interact with resources is how REST works and is what I mean by resource-oriented. I still think you are too concerned with a concept of classes which is a concept that doesn't exist in REST. Sorry I don't have time for a better explanation. There is one way that I think Hyper-Schema could be more resource-oriented. Currently, it only works with JSON and JSON-Schema. I don't think there is any reason why the hypermedia controls it describes couldn't apply to an XML instance described by an XML Schema. I think it should be possible to generalize Hyper-Schema to work with other Content-Types.