Provide an indication of whether a type URL is designed to be dereferenced for documentation

pimterry commented 3 years ago

Pulling this conversation out of #11:

The current spec allows any type URLs to be used, and there's no strong requirement that they link to documentation, it's just encouraged.
As an API client, it can be useful to be able to automatically find human-readable documentation relevant to an error response.
Right now, the only way to do that is to dereference every type URL to see if it goes to a documentation resource, but a) that's discouraged by the spec (type URIs should not be dereferenced automatically) and b) it's not very efficient for anybody involved.

As an example, Zalando's API design guidelines specify that:

Problem type and instance identifiers in our APIs are not meant to be resolved. RFC 7807 encourages that custom problem types are URI references that point to human-readable documentation, but we deliberately decided against that, as all important parts of the API must be documented using OpenAPI anyway. In addition, URLs tend to be fragile and not very stable over longer periods because of organizational and documentation changes and descriptions might easily get out of sync.

All their API responses use absolute path types to deal with this, like "type": "/problems/out-of-stock", which intentionally go nowhere. This is allowed by our current spec, but makes life difficult for those of us who want to automatically find documentation from arbitrary problem detail responses.

It would be useful to provide an explicit way to link to documentation in these cases, or to make it clear that no documentation is available.

serialseb commented 3 years ago

With identification using URIs, you never quite know if it is a document or just a reference to a concept. Anyone who followed httpRange-14 back in the day knows that this is hard. That said, there are multiple scenarios:

Identification only relies on expanding a URI to its full form (from a base URI, for any URI) and checking it against a known value
Can I learn anything from it? This is the "follow your nose" first principle: click on it, see if anything comes out
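
The identification scenario can be sketched as follows. This is a minimal illustration, not something from the spec: the function name, the base URI, and the known type value are all hypothetical, and the comparison is a plain string match on the resolved URI with no network request involved.

```python
from urllib.parse import urljoin


def matches_known_type(base_uri: str, type_ref: str, known_type: str) -> bool:
    """Resolve type_ref against base_uri and compare it to a known type URI.

    Nothing is dereferenced here; the type is treated purely as an identifier.
    """
    return urljoin(base_uri, type_ref) == known_type


# Hypothetical values: a relative "type" taken from a response fetched from base_uri.
base_uri = "https://api.example.com/orders/42"
known = "https://api.example.com/problems/out-of-stock"
print(matches_known_type(base_uri, "/problems/out-of-stock", known))
```

Note that the same relative reference resolves to a different identifier when the response comes from a different origin, so a relative type only identifies something together with its base URI.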

If you were to add a "documentation" URI, I have no more guarantee that anything useful will come out of it than from following my nose to the original type, so I'm not sure I'm gaining a lot from a predictability perspective.

If the main operator is human, then a documentation URI would only make sense because it overrides the main URI, which would only be done if said URI is not in control of the operator that published said URI. This seems unlikely to me, but I could see a scenario where you want, on your own API, to provide your own documentation for that type above and beyond the one that would be provided by the provider of that error. I'm not sure what I think about that capacity, to say that "in our realm, we don't do things the way they said, we got our own". It's an open question for me.

If the main operator is a machine, then just a URI seems insufficient, especially as the override I mentioned above would be needed per type of automatable documentation one could retrieve.

sazzer commented 3 years ago

If the main operator is human, then a documentation URI would only make sense because it overrides the main URI, which would only be done if said URI is not in control of the operator that published said URI

In that case, I wonder if a Link header - with rel="help" maybe? - would be a better way of achieving that? That way the actual problem payload doesn't change just because some server wants to indicate that human-readable documentation about the problem exists somewhere different to the provided URI.
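
As a rough sketch of what a client could do with such a header: the function below pulls the first target carrying the "help" relation out of a Link header value. The function name and example URLs are hypothetical, and the parsing is deliberately simplified (it assumes no commas or semicolons inside quoted parameter values), so treat it as an illustration rather than a complete Link header parser.

```python
import re
from typing import Optional

# "<target>" followed by its parameters, up to the next comma.
_LINK = re.compile(r"<([^>]*)>([^,]*)")
# rel parameter, either quoted (may hold several space-separated relations) or bare.
_REL = re.compile(r';\s*rel\s*=\s*(?:"([^"]*)"|([^\s;",]+))', re.IGNORECASE)


def find_help_link(link_header: str) -> Optional[str]:
    """Return the first link target whose rel includes "help", or None."""
    for target, params in _LINK.findall(link_header):
        match = _REL.search(params)
        if match is None:
            continue
        relations = (match.group(1) or match.group(2) or "").lower().split()
        if "help" in relations:
            return target
    return None


header = '<https://example.com/next>; rel="next", <https://example.com/docs/out-of-credit>; rel="help"'
print(find_help_link(header))
```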

dret commented 3 years ago

jumping in to hopefully make some progress with this:

"automatically" following type links (as mentioned by @pimterry) isn't a great idea, since this may cause the well-known problem of overwhelming the server of the type URI. that's why RFC 7807 says that "Consumers SHOULD NOT automatically dereference the type URI."
instead of adding a specific "hint", we could add language saying that if you want your type URIs not to be dereferencable, just create them that way, for example by using URNs (or similar URI schemes). that way, we could on the one hand provide guidance for those who want to create dereferencable URIs (make them absolute) and for those not wanting dereferencable URIs (do not use relative URIs but instead use URI schemes that have no associated dereferencing mechanism, such as URN URIs).
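
under that guidance, a consumer could tell the two cases apart from the scheme alone. a minimal sketch, where the function name and the set of schemes treated as dereferencable are assumptions made for illustration, and the URN namespace in the example is made up:

```python
from urllib.parse import urlsplit

# Assumption: only these schemes are treated as having a dereferencing mechanism.
DEREFERENCEABLE_SCHEMES = {"http", "https"}


def has_dereferencing_mechanism(type_uri: str) -> bool:
    """Guess from the scheme alone whether a type URI could be dereferenced.

    A relative reference has no scheme of its own, so it yields False until it
    has been resolved against a base URI.
    """
    return urlsplit(type_uri).scheme in DEREFERENCEABLE_SCHEMES


for candidate in (
    "https://example.com/probs/out-of-credit",
    "urn:example:problem:out-of-stock",
    "/problems/out-of-stock",
):
    print(candidate, has_dereferencing_mechanism(candidate))
```

this only says whether dereferencing is possible in principle; it says nothing about whether documentation is actually there.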

mnot commented 3 years ago

we could add language saying that if you want your type URIs not to be dereferencable, just create them that way, for example by using URNs (or similar URI schemes). that way, we could on the one hand provide guidance for those who want to create dereferencable URIs (make them absolute) and for those not wanting dereferencable URIs (do not use relative URIs but instead use URI schemes that have no associated dereferencing mechanism, such as URN URIs).

So, what if I create a problem type that I originally intend not to be deref'd, but down the road I need it to be?

It seems to me that the harm in always using dereferenceable URLs is pretty minimal/manageable --

People who try will get 404's (for HTTP URLs)
Domain name ownership is a thing (for schemes based on DNS)

So an alternative would be to recommend using HTTPS URLs in most cases, with a proviso that sometimes they won't be resolvable (something a person can discover at their convenience), and that for standard / widely-used codes, an HTTP URL that's stable (possibly provided by IANA, as per #7) is wise.

Related to #13.

pimterry commented 3 years ago

So an alternative would be to recommend using HTTPS URLs in most cases, with a proviso that sometimes they won't be resolvable (something a person can discover at their convenience)

We can recommend this, but it seems likely that most problem types will still intentionally not be resolvable unless resolvability is a hard requirement. The Zalando style guide specifically encourages non-resolvable types, and from discussion elsewhere it seems that API developers (for some good reasons, e.g. uncoupling type ids from doc URLs) want to use unresolvable type URLs, independent of their documentation URLs.

As a tool author, that means I'm going to have to automatically dereference every problem details type URL every time I receive such a response. That's not great for anybody.

I want to automatically find a working URL for problem documentation, every time a problem is received. In my case, I'm working on an HTTP debugger where that's especially relevant, but you can imagine many other use cases, e.g. in low-level libraries or clients, perhaps an HTTP library that throws clear friendly exceptions to help with debugging every time it receives a problem details response, something like:

Request to https://example.com/your-url failed unexpectedly with status 403:

You do not have enough credit. Your current balance is 30, but that costs 50.

For more information, see https://example.com/probs/out-of-credit

The problem is in the last line. If that goes to a 404 (likely to happen in many cases) it's a very poor user experience for the end developer looking at their error logs, who's just been told that it's working documentation. Tools/libraries will reasonably want to avoid that experience as much as possible - the only current way to do so is to dereference every problem URL when it's received, or to never try to link to documentation at all.

If this is a use case we just don't want to support in this RFC then that's not totally unreasonable, and we could just close this. It seems like we're very close to machine-discoverable problem documentation here though, and personally I'd find that very useful.

what if I create a problem type that I originally intend not to be deref'd, but down the road I need it to be?

This is another, separate good reason why imo it'd be useful to have an explicit and independent way to link to problem details documentation from a response (e.g. a link header, a separate field, etc.)

sazzer commented 3 years ago

Here's an interesting detail. If there was some way to determine that a problem URI was resolvable, and it resolves to a redirect then what does that mean?

I ask this because the comment I was originally going to make was about dead documentation links. I could make a problem URI today that is resolvable, but there's no guarantee that it will still be resolvable in the future. Or, worse, it might just resolve to the wrong thing.

And that then got me thinking about other ways that URIs can resolve to the wrong things, and redirects seem a particularly awkward one.

I would assume that it doesn't mean anything, and that the URI as written in the payload is the one to use. But equally, I can see an argument that two different URIs that both resolve as redirects to the same thing could be considered as the same problem, and that just makes life awkward.

asbjornu commented 3 years ago

Performing HTTP redirection from the type URI to documentation is actually what I recommend, since what's used in problem+json is supposed to be a stable identifier, while the URI of a documentation site may be generated by a CMS and thus probably can't be considered stable.

I've never even considered that a URI comparison implementation could perform an HTTP request against the two URIs and see whether they, after HTTP redirection and whatnot, resolve to the same URI.

Using resolvable URIs that perform HTTP redirection as identifiers is common. The whole of Dublin Core does it, for instance: http://purl.org/dc/terms/abstract redirects to https://dublincore.org/specifications/dublin-core/dcmi-terms/#abstract . The former is the canonical identifier, the latter is the documentation about the identifier.

mnot commented 3 years ago

Whether or not it resolves to a redirect doesn't change the problem identified by the type URI. If folks believe otherwise, please open a separate issue.

@pimterry I see what you're saying regarding tool user experience. I'm not against this, I could see us adding a member with a boolean value whose semantic is 'here's a hint that the type URI is resolvable'; I could also see us adding a member with a separate URI for documentation.

However, two things come to mind:

Even if the problem says that the URL is resolvable, it still might not be (e.g., network failure, server-side issue, etc.).
Some problems are going to omit this signal even when the URL is resolvable -- either because they don't know about the signal, or they don't care.

Both of these reduce the quality of the information that the signal conveys. What I think we need to establish is whether it still conveys enough information to be worth the extra effort and complication.*

If we don't have this sort of signal, I think a tool such as the one you're talking about will need to present it as something like:

Request to https://example.com/your-url failed unexpectedly with status 403:

You do not have enough credit.
Your current balance is 30, but that costs 50.

More documentation might be available at: https://example.com/probs/out-of-credit
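
A sketch of how a tool might build this kind of message from a problem object, choosing the cautious wording unless it has some confirmation that the link works. The function name and the `docs_confirmed` flag are hypothetical; nothing like them is defined by RFC 7807.

```python
from typing import Any, Mapping


def describe_problem(
    url: str,
    status: int,
    problem: Mapping[str, Any],
    docs_confirmed: bool = False,
) -> str:
    """Format a problem details object as a human-readable error message."""
    lines = [f"Request to {url} failed unexpectedly with status {status}:"]

    # Show the title and detail members when they are present.
    text = [str(problem[key]) for key in ("title", "detail") if problem.get(key)]
    if text:
        lines += [""] + text

    # Only mention the type when it is an HTTP(S) URL a person could visit.
    type_uri = str(problem.get("type") or "about:blank")
    if type_uri.startswith(("http://", "https://")):
        if docs_confirmed:
            lines += ["", f"For more information, see {type_uri}"]
        else:
            lines += ["", f"More documentation might be available at: {type_uri}"]
    return "\n".join(lines)


print(describe_problem(
    "https://example.com/your-url",
    403,
    {
        "type": "https://example.com/probs/out-of-credit",
        "title": "You do not have enough credit.",
        "detail": "Your current balance is 30, but that costs 50.",
    },
))
```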

Alternatively, you could attempt to resolve the URI and then cache the result, avoiding the need to make a request for every problem instance. To facilitate that, it might be interesting to put some bounds on how cacheable these type URIs are...
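
A minimal sketch of that caching idea, assuming the tool supplies its own `probe` callable that performs the actual request and reports whether documentation was found. The class name, the `probe` and `clock` parameters, and the one-day default lifetime are all illustrative choices, not anything the spec defines.

```python
import time
from typing import Callable, Dict, Tuple


class TypeDocCache:
    """Remember whether a type URI resolved, so each type is probed at most once per TTL."""

    def __init__(
        self,
        probe: Callable[[str], bool],
        ttl_seconds: float = 86400.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._probe = probe
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, bool]] = {}

    def has_docs(self, type_uri: str) -> bool:
        now = self._clock()
        cached = self._entries.get(type_uri)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]  # Fresh entry: no request is made.
        result = self._probe(type_uri)
        self._entries[type_uri] = (now, result)
        return result


# Usage with a stand-in probe that records calls instead of touching the network.
calls = []
cache = TypeDocCache(lambda uri: calls.append(uri) or True)
cache.has_docs("https://example.com/probs/out-of-credit")
cache.has_docs("https://example.com/probs/out-of-credit")
print(len(calls))
```

With this shape, a burst of identical problem responses results in a single probe of the type URI rather than one request per problem instance.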

Either way, we should qualify the 'automatically' prohibition against resolving the URI to make it applicable to runtime resolution, not debug time.

Thoughts?

* I find that an often under-appreciated aspect of APIs is the cognitive load they place upon users; if there are too many options, they're less likely to use what's available well.

mnot commented 3 years ago

One other aspect of this that we haven't discussed: if we're talking about adding a new member that's valid and standard across all problem objects, it violates this statement in 7807:

Note that because extensions are effectively put into a namespace by the problem type, it is not possible to define new "standard" members without defining a new media type.

So if we decide that this is a must-have, we'll need to either define a new media type for problem types, or convince ourselves that defining such an extension won't break currently deployed extensions (without being able to know their breadth).

mnot commented 3 years ago

Discussed in 111: default resolvable is good. Not a strong motivation for a 'doesn't resolve' flag; close with no action (except perhaps prose).

pimterry commented 3 years ago

Discussed in 111: default resolvable is good

Sounds good to me.

If we can strengthen the spec from gentle encouragement to "if the type is dereferenceable, it SHOULD resolve to documentation" (or similar) and qualify the "Consumers SHOULD NOT automatically dereference the type URI" text as discussed above, then my use case above should work reliably with no extra fields required, which would be great.

This then ties into #13 & #21: we should give a clear recommendation for an alternative, for those who do want explicitly non-resolvable types.

Does all this apply the same to instance? That currently doesn't state a position on this at all: "It may or may not yield further information if dereferenced"

dret commented 3 years ago

On 2021-07-28 12:21, Tim Perry wrote:

Does all this apply the same to |instance|? That currently doesn't state a position on this at all: "It may or may not yield further information if dereferenced"

wouldn't that be odd? the type URI probably mostly will be used for comparing to some known type URIs. the instance URI on the other hand probably most often will provide access to info analyzing the specific reasons for the problem, no?

pimterry commented 3 years ago

It's just that the same ambiguity applies: is http://example.com/failed-transactions/123 intended as a link to a resource the receiver should visit for more information about that specific transaction, or intended only as an opaque identifier?

It's quite plausible that the situation isn't the same as type, and that's OK. It's not really as important for tooling AFAIK (I imagine tools are all more likely to be interested in linking to human-readable documentation than the failing resource itself, which could be anything) and I don't have a strong opinion either way.

If we do think both should be treated similarly (i.e. both should be resolvable by default) then that's neat and we can use consistent recommendations throughout. If not then that's fine too - we just need to make it clear that they're different (and if this is the case, I think the current text does that OK).

Either way I think an answer to #13 for instance as well as type would be nice, so that it's clear what to do if you want either to be explicitly not resolvable, but that's just offering an optional suggestion, not a constraint.

mnot commented 3 years ago

See proposal in #26 - thoughts?
