ietf-wg-httpapi / mediatypes

application/openapi+yaml fragment identifier considerations inherits from application/json not application/yaml #71

Closed MikeRalphson closed 4 months ago

MikeRalphson commented 1 year ago

Apologies, I'm coming late to this repo, and this may have already been discussed. Feel free to point me to an existing issue/comment.

@dret suggested I raise issues here.

https://github.com/ietf-wg-httpapi/mediatypes/blob/0d2cef8bf8e658b2d34ac8e48c3ff999ca827305/draft-ietf-httpapi-rest-api-mediatypes.md?plain=1#L164-L165

I can see why this might be (OpenAPI recommends not using YAML alias nodes because they aren't directly expressible in JSON, so only one fragment resolution method is required: JSON Pointer), but I can see this causing problems or adding complexity when, say, an OpenAPI bundling tool is resolving references which could be to OAS documents in YAML, plain YAML resources, JSON Schema in YAML, etc.

One common consideration is GitHub raw. links, which I believe are always served as text/plain, requiring bundlers to "sniff" content to enable the correct parsing rules. Obviously you need to parse the YAML somehow in order to identify it, in order to parse it correctly so fragments can be extracted.
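A minimal sketch of the kind of sniffing meant here, assuming a bundler that receives a text/plain body and has to guess the format before it can extract fragments. The function names are hypothetical, and the YAML branch is left as a placeholder so the sketch needs only the standard library; checking for a top-level `openapi` key is just a heuristic.

```python
import json
from typing import Any, Optional, Tuple


def sniff_document(body: str) -> Tuple[str, Optional[Any]]:
    """Guess whether a text/plain body is JSON; otherwise assume YAML."""
    try:
        return "json", json.loads(body)
    except json.JSONDecodeError:
        # Not JSON. A real bundler would hand the text to a YAML parser
        # here; that step is omitted to keep the sketch dependency-free.
        return "yaml", None


def looks_like_openapi(doc: Any) -> bool:
    """Heuristic: OAS 3.x documents carry a top-level 'openapi' field."""
    return isinstance(doc, dict) and "openapi" in doc


kind, doc = sniff_document('{"openapi": "3.1.0", "info": {}}')
print(kind, looks_like_openapi(doc))
```

Nothing in the served media type helps in this case, so the parse has to come first and the identification second.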

Essentially my question is why doesn't application/openapi+yaml inherit fragment resolution rules from application/yaml?

darrelmiller commented 1 year ago

RFC 6838 calls out that

media types that use a named structured syntax with a registered "+suffix" MUST follow whatever fragment identifier rules are given in the structured syntax suffix registration.

The YAML media type registration includes the suffix registration which says

Differently from application/yaml, there is no fragment identification syntax defined for +yaml.

A specific xxx/yyy+yaml media type needs to define the syntax and semantics for fragment identifiers because the ones in Section 2.1 do not apply unless explicitly expressed.
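As a rough sketch of what that means for a tool, the lookup below consults the suffix registration first and then the specific media type's own registration. Both tables and both function names are hypothetical; the `+yaml` entry restates the quote above, and the OpenAPI entry is an assumption about what its registration would declare, not a quote from a final document.

```python
from typing import Dict, Optional

# Fragment syntax defined by the structured-suffix registration itself.
# Per the +yaml registration quoted above, +yaml defines none.
SUFFIX_FRAGMENT_SYNTAX: Dict[str, Optional[str]] = {"yaml": None}

# Fragment syntax declared by each specific media type's own registration.
# ASSUMPTION: illustrative entry only.
TYPE_FRAGMENT_SYNTAX: Dict[str, str] = {
    "application/openapi+yaml": "json-pointer",
}


def structured_suffix(media_type: str) -> Optional[str]:
    """Return the '+suffix' of a media type (e.g. 'yaml'), or None."""
    essence = media_type.split(";", 1)[0].strip().lower()
    subtype = essence.partition("/")[2]
    base, plus, suffix = subtype.rpartition("+")
    return suffix if plus and base else None


def fragment_syntax(media_type: str) -> Optional[str]:
    """Look up the fragment syntax that applies to a media type."""
    essence = media_type.split(";", 1)[0].strip().lower()
    suffix = structured_suffix(essence)
    # Rules given by the suffix registration must be followed (RFC 6838).
    inherited = SUFFIX_FRAGMENT_SYNTAX.get(suffix) if suffix else None
    if inherited is not None:
        return inherited
    # Otherwise only the specific type's own registration can define one.
    return TYPE_FRAGMENT_SYNTAX.get(essence)
```

Note that `application/yaml` has no suffix at all, so nothing it defines is reached through the `+yaml` path.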

The OpenAPI specification states:

If the representation of the referenced document is JSON or YAML, then the fragment identifier SHOULD be interpreted as a JSON-Pointer as per [RFC6901].
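For reference, a minimal sketch of interpreting a fragment as a JSON Pointer per RFC 6901, applied to already-parsed JSON or YAML data. The function names are hypothetical, the `-` array token is not handled, and error handling is kept deliberately simple.

```python
from typing import Any
from urllib.parse import unquote


def resolve_json_pointer(document: Any, pointer: str) -> Any:
    """Evaluate an RFC 6901 JSON Pointer against parsed data."""
    if pointer == "":
        return document  # the empty pointer refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError(f"not a JSON Pointer: {pointer!r}")
    current = document
    for token in pointer[1:].split("/"):
        # Unescape "~1" to "/" first, then "~0" to "~".
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            # Array indices are decimal digits with no leading zeros.
            is_index = token.isascii() and token.isdigit()
            if not is_index or (len(token) > 1 and token[0] == "0"):
                raise KeyError(token)
            current = current[int(token)]
        elif isinstance(current, dict):
            current = current[token]
        else:
            raise KeyError(token)
    return current


def resolve_fragment(document: Any, fragment: str) -> Any:
    """Resolve a URI fragment (without the leading '#') as a JSON Pointer."""
    # The fragment is percent-decoded before pointer evaluation.
    return resolve_json_pointer(document, unquote(fragment))


doc = {"paths": {"/pets": {"get": {"summary": "List pets"}}}}
print(resolve_fragment(doc, "/paths/~1pets/get/summary"))
```

The same function works whether the data came from a JSON or a YAML parser, which is why a single resolution method is attractive for tooling.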

I would think it confusing if the registration for OpenAPI+yaml suggests that yaml anchors are acceptable.

I do think the Fragment Identifier considerations in the registrations for application/openapi+json and application/openapi+yaml need updating, as RFC 8259 appears to say nothing about fragment identifiers. /cc @ioggstream

handrews commented 1 year ago

@darrelmiller @ioggstream yes, RFC 8259 §11 specifically avoids defining fragments for JSON, and RFC 6901 §6 explicitly states that JSON Pointer fragments are not usable with application/json.

I've always been confused about the viability of saying things like:

If the representation of the referenced document is JSON or YAML, then the fragment identifier SHOULD be interpreted as a JSON-Pointer as per [RFC6901].

There are three cases:

This has become an increasingly vexing corner case, particularly as, outside of HTTP and similar systems, there is not always a clear way to associate a media type with a document in the first place. This is even more complex in AsyncAPI given the range of schema formats.

ioggstream commented 1 year ago

@MikeRalphson Welcome!

Please see @darrelmiller's reply: YAML registration doesn't limit openapi+yaml

@handrews

I've always been confused about the viability of saying things like: ...

FTR it's from OAS3.1, I agree that without content negotiation it's hard...

RFC 6901 explicitly states that JSON Pointer fragments are not usable with application/json

Can you please reference the section of RFC6901 above?

MikeRalphson commented 1 year ago

Hi @ioggstream & @darrelmiller!

YAML registration doesn't limit openapi+yaml

That's not much of an answer, I'm afraid :grin: It's just an appeal to authority, and the authority is another draft RFC which isn't finalised yet.

My point still stands - the complexity of parsing OAS documents composed of resources with mediatypes application/yaml, application/openapi+yaml, application/schema+yaml (if such a thing were defined) as well as text/plain, application/json, application/openapi+json and application/schema+json is not inconsiderable.

This is without considering that many existing bundling implementations simply use a YAML parser to parse both YAML and JSON, but now they're going to need different logic for fragment resolution for each.

Essentially my question is why doesn't application/openapi+yaml inherit fragment resolution rules from application/yaml?

My question also still stands and I've highlighted the relevant word. 😄 If OAS documents can't contain YAML aliases (I think that falls under the spec's RECOMMENDED wording, but it's not absolutely clear) then there isn't any problem in having an 'unused' fragment resolution rule for fragments beginning with *.
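A small sketch of why an unused rule would cost little, assuming the application/yaml rules distinguish fragments by their first character (`*` for an alias node, `/` for a JSON Pointer, empty for the root node). That split is an assumption about the draft's application/yaml rules rather than something quoted in this thread, and the function name is hypothetical.

```python
def classify_yaml_fragment(fragment: str) -> str:
    """Classify a fragment (without the leading '#') by its first character."""
    if fragment == "":
        return "root"          # the whole document
    if fragment.startswith("*"):
        return "alias"         # would name a YAML alias node
    if fragment.startswith("/"):
        return "json-pointer"  # would be evaluated per RFC 6901
    return "unknown"


for frag in ("", "*shared", "/components/schemas/Pet"):
    print(repr(frag), classify_yaml_fragment(frag))
```

If OAS documents never contain aliases, the `alias` branch is simply never taken, and a bundler could share one dispatcher across application/yaml and application/openapi+yaml.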

darrelmiller commented 1 year ago

@MikeRalphson Does the answer from #72 provide justification as to the "why"? It would seem unreasonable to force the requirement of alias resolution on every media type that is based on yaml, when as you pointed out, the aliasing information may not be preserved.

handrews commented 1 year ago

@ioggstream from RFC 6901 §6, bold italics added:

Note that a given media type needs to specify JSON Pointer as its fragment identifier syntax explicitly (usually, in its registration [RFC6838]). That is, just because a document is JSON does not imply that JSON Pointer can be used as its fragment identifier syntax. In particular, the fragment identifier syntax for application/json is not JSON Pointer.

ioggstream commented 1 year ago

@ioggstream from RFC 6901 §6, bold italics added:

In particular, the fragment identifier syntax for application/json is not JSON Pointer.

Oh, I think this is different from

JSON pointer is not usable with application/json

Peace, R

MikeRalphson commented 1 year ago

@darrelmiller but we have (to at least some degree) proved that alias information can be retrieved, with some parser implementations.

What hasn't been addressed is the complexity issue.

ioggstream commented 1 year ago

@MikeRalphson

the complexity of parsing OAS documents composed of resources with mediatypes application/yaml, application/openapi+yaml, application/schema+yaml (if such a thing were defined) as well as text/plain, application/json, application/openapi+json and application/schema+json is not inconsiderable.

I agree. But this affects many linked resources that use generic media types (including json-ld, geojson, etc.) in all cases where the specific media type is not used.

without considering that many existing bundling implementations simply use a YAML parser to parse both YAML and JSON, but now they're going to need different logic for fragment resolution for each.

Registering and using the right media type for every application contributes to the interoperability of the system. Once all media types are registered, implementers will use the proper media type and things will become simpler.

As @handrews says, JSON doesn't default to json-pointers. To process with JSON Pointers you need to ensure beforehand that the resource is OAS (e.g. json-ld has its own fragment identifier).

OT: text/plain for an OAS document is weird :)

MikeRalphson commented 1 year ago

@ioggstream yes, raw.githubusercontent is weird.

ioggstream commented 1 year ago

@ioggstream yes, raw.githubusercontent is weird.

I agree, and I have had issues with all kinds of resources exposed as text/plain via raw.githubusercontent. I am not sure that the way folks use a platform cannot impact standards, though. Maybe raw.gh will add media type support one day :)

handrews commented 1 year ago

@ioggstream

In particular, the fragment identifier syntax for application/json is not JSON Pointer.

Oh, I think this is different from

JSON pointer is not usable with application/json

I don't understand your rather casual dismissal here, although perhaps I am misreading you. I do not know how else to interpret that line from RFC 6901 §6 other than as a directive to avoid using JSON Pointer fragments with application/json. What other meaning could it have? Particularly given that RFC 8259 §11 could have added fragment syntax but opted not to.

The reason that I bring all of this up is that I've gotten pushback on using JSON Pointer fragments with application/json and would like some sort of authoritative reference to use to counter that. Or some sort of well-supported evidence that it's OK to say things like "if it looks like JSON, then you can use JSON Pointer fragments with URIs, even if its media type is application/json or if there is no formal way to associate a media type with this sequence of octets."

I would really love to have an answer to this question that will hold up in formal situations.

RFC 3986 §4.5 has this to say about the validity of fragment semantics (bold italics added):

The semantics of a fragment identifier are defined by the set of representations that might result from a retrieval action on the primary resource. The fragment's format and resolution is therefore dependent on the media type [RFC2046] of a potentially retrieved representation, even though such a retrieval is only performed if the URI is dereferenced. If no such representation exists, then the semantics of the fragment are considered unknown and are effectively unconstrained. Fragment identifier semantics are independent of the URI scheme and thus cannot be redefined by scheme specifications.

Individual media types may define their own restrictions on or structures within the fragment identifier syntax for specifying different types of subsets, views, or external references that are identifiable as secondary resources by that media type. If the primary resource has multiple representations, as is often the case for resources whose representation is selected based on attributes of the retrieval request (a.k.a., content negotiation), then whatever is identified by the fragment should be consistent across all of those representations. Each representation should either define the fragment so that it corresponds to the same secondary resource, regardless of how it is represented, or should leave the fragment undefined (i.e., not found).

darrelmiller commented 1 year ago

In particular, the fragment identifier syntax for application/json is not JSON Pointer.

@handrews I read this statement as the authors of the JSON Pointer specification clarifying that the creation of JSON Pointer didn't suddenly make it the official fragment identifier of application/json.

JSON Pointer fragments cannot be used with application/json without some specific context that permits it. For example, OAS specifically references the JSON Pointer specification to enable using fragments that point into JSON documents. I don't see a reason why a specific +json media type registration could not declare support for using JSON Pointer fragments to reference elements of that specific media type.

handrews commented 1 year ago

@darrelmiller a +json media type definitely can!

I think I got the conversation in this PR crossed with something else where people have argued that some other unknown document referenced by an API description file with a URI can use JSON Pointer fragments regardless of the target media type, as long as it appears to be JSON.

But if all @ioggstream and everyone else here is talking about is using it for +json or +yaml formats (and skimming back over that seems to be the case) then yes, I completely agree that that works.

ioggstream commented 1 year ago

I read it as:

  1. If the response is application/json, you cannot expect the fragment syntax to be JSON Pointer. For example, if it's JSON-LD, it uses RDF fragments.
  2. If you know that an application/json response conveys an OAS document (e.g. by inspection), nothing prevents using JSON Pointers even though the conveyed media type is application/json.

I think we are all on the same page anyway: an OAS document should be served using the application/openapi+json media type and not the generic application/json.

Implementations will improve in time.
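The two points above could be sketched roughly as follows. This is an illustration under stated assumptions, not a normative rule: the function name and the two sets are hypothetical, and inspecting for a top-level `openapi` field is only one possible way to recognise an OAS document.

```python
from typing import Any, Optional

# Specific media types that identify an OAS document directly.
OPENAPI_TYPES = {"application/openapi+json", "application/openapi+yaml"}

# Generic types that say nothing about what the document is.
GENERIC_TYPES = {"application/json", "application/yaml", "text/plain"}


def treat_as_openapi(media_type: Optional[str], document: Any) -> bool:
    """Decide whether OAS fragment rules (JSON Pointer) should be applied."""
    essence = (media_type or "").split(";", 1)[0].strip().lower()
    if essence in OPENAPI_TYPES:
        return True
    if essence in GENERIC_TYPES or not essence:
        # Point 2: with only a generic (or missing) media type, fall back
        # to inspecting the parsed content.
        return isinstance(document, dict) and "openapi" in document
    # Point 1: another specific type (e.g. application/ld+json) brings its
    # own fragment rules, so JSON Pointer is not assumed.
    return False


print(treat_as_openapi("application/json", {"openapi": "3.1.0"}))
```

Serving the specific media type makes the first branch sufficient and the inspection step unnecessary.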

handrews commented 1 year ago

@ioggstream I think your explanation makes sense, and point 1 aligns with how I read various standards. Do you have any reference for point 2 or is it just your personal sense of how things ought to work? I agree with 2, I just have never found any standards document that clearly justifies it. I have a reason for pushing this, but I'm going to open a new issue as it is beyond the scope of this one — I'll link it here (#75), please add your answer to the new issue.

ioggstream commented 1 year ago

My understanding is based on the following points:

MUST NOT is not that frequent. I will investigate further, though. Thanks for asking.

handrews commented 1 year ago

@ioggstream That makes sense to me. To be clear, I'm not trying to block anything here; it's more that I want to know if what I am talking about in issue #75 is possible. Because if so, we might want to do that as part of the API media types work.

MikeRalphson commented 1 year ago

@darrelmiller @ioggstream See discussion here re: whether YAML anchors and aliases are explicitly allowed or disallowed by OpenAPI 3.x.