API Doc vocabulary scope and goals

handrews commented 7 years ago

JSON Hyper-Schema is a hypermedia media type. Therefore it defines links for a given document, but it does not have a concept of an API, any more than HTML has a concept of a web site. As a hypermedia format, Hyper-Schema is concerned with runtime flexibility, one document at a time. It also aims to be URI scheme / protocol-neutral, although some nods to the prevalence of HTTP exist.

There are numerous API description formats: OpenAPI and RAML are two of the more popular. While both of them use JSON Schema in some way, they only use the validation vocabulary, not hyper-schema. These sorts of formats emphasize static description, and tend to be HTTP-centric.

The goals of a JSON Schema API Documentation vocabulary could include:

support documentation styles ranging from fully static to relatively dynamic
additional keywords to link schemas statically into an API
orthogonal API concerns such as auth
~~protocol usage, specifically HTTP header usage and response workflows~~ [EDIT: probably not, see below]
error documentation
request/response example usage beyond just example instances

While Hyper-Schema is concerned with who can provide authoritative runtime information, the API Documentation vocabulary can take a different philosophical approach of documenting expected behavior, rather than asserting an authoritative runtime description.

This is an initial discussion issue, which will stay open until the scope and goals feel clear enough to put into a persistent document in the repository.

Relequestual commented 7 years ago

Here you say..

support documentation styles ranging from fully static to relatively dynamic

And in the readme you say...

Unlike OpenAPI, RAML, etc., this format will be strictly complementary to JSON Hyper-Schema, and assume its use as the primary hypermedia approach for the API being described.

I would envisage that an API Docs vocab would allow for describing non-hypermedia-based APIs. Although I do consider the fact that we could be reinventing the wheel given OpenAPI and RAML.

handrews commented 7 years ago

Although I do consider the fact that we could be reinventing the wheel given OpenAPI and RAML.

This is what concerns me. Someone somewhere asked why we should do anything in this area at all given the popularity of OpenAPI in particular. I think it only makes sense to invest in this if there is a reasonable target "market" for it. There is a definite gap in documentation solutions for true hypermedia APIs.

Those other formats have some proposals, but they also have challenges because static documentation is more restrictive than hypermedia. It's hard to "open up" a static system for dynamic use. It's easier to allow "closing" a dynamic system by overlaying a static description of its likely behavior. From a hypermedia perspective, this all stays non-authoritative, like targetSchema. But of course API publishers can add their own guarantees as much as they want.

As I've been kicking some ideas around, I think it is reasonably possible that we can address non-RESTful HTTP APIs, or at least some large subset of them. But I'd rather prioritize solving the problems that OpenAPI and RAML are not solving, or at least not solving all that well.

handrews commented 7 years ago

Now that we're moving hypermedia topics along for draft-07, I've put more thought into where the line between Hyper-Schema and API Documentation should go.

I'm going to focus on HTTP (and by extension, CoAP), but all of this needs to be considered in the context of non-HTTP links as well.

Scope Principles

Resource vs API

Things that are within the scope of a single resource and its behavior are in the scope of Hyper-Schema.

Things that connect resources into a larger unit such as "an API" are within the scope of API Documentation.

Generic hyperclient vs application code

Things that a generic hypermedia client (a.k.a. hyperclient) needs to handle should be covered by Hyper-Schema.

Things that require application-specific handling should go in the API documentation. The "A" is for "application", after all :-)

Protocol header usage

Guidance on using protocol headers for interacting with a single resource feels to me like part of Hyper-Schema.

JSON Home uses the following criteria for this:

Generally, [resource hints] ought to be information that would otherwise be discoverable by interacting with the resource.

This includes both actual response headers that could be discovered with a HEAD or OPTIONS request, and indication of valid and/or necessary request header values that would otherwise only be discovered by attempting their usage and noting success or failure.

See json-schema-org/json-schema-spec#296 for specification work in this area.

Response codes and payloads, including errors

This is very much about connecting resources to each other. We have "targetSchema" and "mediaType" to describe the target resource's representation, but a resource may send other information in responses.

I view all responses in a RESTful system as resource representations, falling into one of the following categories:

Target representation (successful GET responses, and responses where Content-Location is equal to the request URI). These are handled by "targetSchema" and "mediaType" and won't be discussed further here.
Representations of other identifiable resources (Content-Location is set to something other than the request URI; possibly the same as Location but also usable without Location or with a different URI)
Representation of anonymous resources, including errors and non-persistent indications of processing results (e.g. non-resource-creating POST operation responses)
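To make the three categories concrete, here is a minimal sketch of how a client might sort a response into them. The function name `classify_response` and the category labels are illustrative assumptions, not part of any specification, and comparing URIs as plain strings is a simplification.

```python
from typing import Mapping
from urllib.parse import urljoin


def classify_response(method: str, request_uri: str, status: int,
                      headers: Mapping[str, str]) -> str:
    """Return 'target', 'other-resource', or 'anonymous' (hypothetical labels)."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    content_location = normalized.get("content-location")
    if content_location is not None:
        # Content-Location may be relative; resolve it against the request URI.
        content_location = urljoin(request_uri, content_location)

    successful = 200 <= status < 300
    # Category 1: successful GET, or Content-Location equal to the request URI.
    # (Naive string comparison; a real client would normalize URIs first.)
    if (method.upper() == "GET" and successful) or content_location == request_uri:
        return "target"
    # Category 2: Content-Location names some other identifiable resource.
    if content_location is not None:
        return "other-resource"
    # Category 3: errors and non-persistent processing results.
    return "anonymous"
```

Only the first category is covered by "targetSchema" and "mediaType"; the other two are where documentation would have to supply expectations.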

As far as Hyper-Schema is concerned, each response indicates its own schema.

HTTP status codes mean what they mean, so a generic hypermedia client does not need an indication of which are expected: it must handle statuses generically whether they are expected or not. The media type of the response will indicate what refinements on the HTTP status might be present (for instance, application/problem+json for self-describing error refinements, or regular JSON with a schema indicating how to interpret a processing status response). This is why per-status response schemas are not needed in Hyper-Schema. The detailed contents are an application level concern.
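As a rough sketch of that generic handling: the status class is derived from the code alone, and the media type decides how any refinement is read. The names `interpret_response` and `STATUS_CLASSES` and the refinement labels are assumptions made for illustration.

```python
from typing import Tuple

# Generic meaning of each status class; no per-link list of expected codes.
STATUS_CLASSES = {
    1: "informational",
    2: "successful",
    3: "redirection",
    4: "client-error",
    5: "server-error",
}


def interpret_response(status: int, content_type: str) -> Tuple[str, str]:
    """Return (status class, refinement kind) for any response."""
    status_class = STATUS_CLASSES.get(status // 100, "unknown")
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/problem+json":
        # Self-describing error details.
        refinement = "problem-details"
    elif media_type == "application/json" or media_type.endswith("+json"):
        # Regular JSON; the response's own schema says how to read it.
        refinement = "json-with-schema"
    else:
        refinement = "opaque"
    return status_class, refinement
```

Nothing here depends on which statuses a particular link is expected to produce, which is the point: that knowledge belongs to application code and its documentation.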

However, documenting response expectations for humans is useful. A hyperclient implements generic processing, but application code needs to make trade-offs between development costs and flexibility in the face of different responses. Documentation provides guidance for that trade-off, and indicates how to handle likely contingencies.

Auth

While some aspects of auth are covered by documenting header usage, there is more to the topic than that. Additionally, auth is typically defined across an API, so examining each resource separately for auth behavior is impractical.

Auth does have runtime behavior that a hyperclient could support, but my current inclination is to follow the lead of Python's "requests" library and defer that to some sort of plug-in architecture. While headers may be involved in auth, documenting OAuth usage, for instance, is definitely not part of Hyper-Schema.
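For illustration, a plug-in arrangement might look roughly like the following. In "requests", an auth handler is a callable that receives the prepared request and returns it after modification; the sketch below imitates that pattern using only the standard library. `Request`, `BearerTokenAuth`, and `HyperClient` are hypothetical names, not an existing API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class Request:
    """Hypothetical stand-in for a prepared request."""
    method: str
    uri: str
    headers: Dict[str, str] = field(default_factory=dict)


# An auth plug-in is any callable that takes a request and returns it.
AuthPlugin = Callable[[Request], Request]


class BearerTokenAuth:
    """Example plug-in: attaches a bearer token to every request."""

    def __init__(self, token: str) -> None:
        self.token = token

    def __call__(self, request: Request) -> Request:
        request.headers["Authorization"] = f"Bearer {self.token}"
        return request


class HyperClient:
    """Hypothetical hyperclient that knows nothing about auth schemes."""

    def __init__(self, auth: Optional[AuthPlugin] = None) -> None:
        self.auth = auth

    def prepare(self, method: str, uri: str) -> Request:
        request = Request(method, uri)
        # Auth is applied once, client-wide, rather than per resource.
        if self.auth is not None:
            request = self.auth(request)
        return request
```

The client stays generic, and anything scheme-specific (OAuth flows, token refresh) lives in the plug-in, which is where API-level documentation would point.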

Workflows

Documenting a hypermedia system should focus on links rather than requests/responses, and matching common use cases to paths through multiple API calls using links is something that cannot be done at the individual resource level.

While not essential to produce a useful API Doc vocabulary, this is something that would be hypermedia-oriented and work well with Hyper-Schema in ways that OpenAPI does not.
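One way to picture a workflow is as a path of link relations across several resources. The sketch below is purely illustrative: the `links` mapping, the resource names, and `find_link_path` are assumptions, and a real vocabulary would derive the graph from the hyper-schemas rather than from a hand-written dict.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

# resource name -> list of (link relation, target resource); all hypothetical.
LinkGraph = Dict[str, List[Tuple[str, str]]]


def find_link_path(links: LinkGraph, start: str, goal: str) -> Optional[List[str]]:
    """Breadth-first search for the shortest sequence of link relations."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        resource, path = queue.popleft()
        if resource == goal:
            return path
        for rel, target in links.get(resource, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, path + [rel]))
    # No sequence of links leads from start to goal.
    return None


links: LinkGraph = {
    "root": [("orders", "order-collection")],
    "order-collection": [("item", "order")],
    "order": [("payment", "payment"), ("collection", "order-collection")],
}
workflow = find_link_path(links, "root", "payment")
```

A documented workflow such as "pay for an order" would then name a path like this one, which no single resource's schema can express on its own.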

@dlax @geemus @philsturgeon @imvenkat @tajo this may be relevant to your interests

geemus commented 7 years ago

Looks broadly reasonable. I certainly understand the tensions between full dynamism and full documentation, but I also don't want to have to do the two things completely separately (as drift becomes rather problematic). Also, I rarely see fully dynamic APIs anyway, so the more static/documented version is probably more likely to reflect reality, at least at present.

handrews commented 7 years ago

@geemus as far as the dynamic stuff goes, I'm primarily interested in the workflows idea. I will be doing work in that area regardless, so we can see how that goes and "promote" things here if/when they are shown to work.

I think the other categories I listed are pretty compatible with static approaches as well. Connecting up responses could go either way, but to me that means that it should be allowed to go either way. OpenAPI is focused on locking things down to a particular static notion, which need not match HTTP semantics. So I feel like we should focus on flexibility and optimizing for APIs that match (or come very close to) HTTP semantics.

Relequestual commented 7 years ago

This sounds reasonable. I previously had real difficulty understanding what HyperSchema did and did not support when trying to use it. It would be great to put together some examples of APIs that would and would not be supported by JSON Schema API Doc, and why. That would also be a chance to demonstrate the power of dynamic definitions, which isn't available elsewhere.

Essentially, I'm keen that we communicate how and why this is different to OAI, RAML, Swagger, etc, to avoid the barrage of questions and avoid potential incorrect usage. I feel doing so could also strengthen the case for HyperSchema generally.
