OAI / oascomply


Improving semantics around reference objects on the validation schemas #55

Open shtaif opened 1 year ago

shtaif commented 1 year ago

I'd like to propose an improvement to the semantics around reference objects as expressed in the various specs' JSON Schema documents. If this sounds compelling, I'd happily contribute an appropriate PR. Please raise any concerns that come up! 🙏🏻 We have been using this approach at Rapid for a while to improve the validation experience.

Consider for instance the current Reference object definition from the v3.0 schema: https://github.com/OAI/OpenAPI-Specification/blob/24088857068802d5fd4e4f8bef3f95ef06fa8830/schemas/v3.0/schema.json#L54-L65 It specifies the $ref property as a general URI-formatted string.

What I'd like to propose is a change that would help verify that the destination of a local reference object is actually the kind of definition it is expected to reference. For example, it would flag as an error a local reference object that sits where a response reference is expected but in fact points to a schema.

We should be able to build these semantics directly into the JSON schemas by splitting the current single Reference object definition into several more specialized definitions, one for every kind of reference we could have in a given specification version (e.g. responses, request bodies, parameters...). Each of them would resemble the existing definition, except that its $ref property would gain an extra pattern keyword with a carefully crafted regex. For a response reference, for example, the regex would ensure that the pointer string can only take a form that refers to locations where response definitions can actually be found.

This could look something like this for a start, replacing the current generic Reference definition:

"ResponseReference": {
  "type": "object",
  "required": [
    "$ref"
  ],
  "patternProperties": {
    "^\\$ref$": {
      "type": "string",
      "format": "uri-reference",
      "pattern": "{{{ a regex for strings in the form #/components/responses/* }}}"
    }
  }
},

"RequestBodyReference": {
  "type": "object",
  "required": [
    "$ref"
  ],
  "patternProperties": {
    "^\\$ref$": {
      "type": "string",
      "format": "uri-reference",
      "pattern": "{{{ a regex for strings in the form #/components/requestBodies/* }}}"
    }
  }
},

"SchemaReference": {
// ...

/*
  (of course, each pattern will be made to also cleanly pass for ANY file or remote references,
  as for these we obviously cannot statically determine the destination type)
*/
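To make the placeholder patterns above more concrete, here is a minimal Python sketch of one possible regex for the response case. The pattern itself is an assumption for illustration, not something taken from a published schema: it accepts any value that does not start with `#` (file or remote references, which cannot be checked statically) and otherwise requires the form `#/components/responses/<name>`, with `<name>` limited to the characters OAS 3.0 allows in component keys. It only uses constructs that behave the same way in Python's `re` and in the ECMA-262 dialect that JSON Schema's `pattern` keyword relies on.

```python
import re

# Hypothetical pattern for the "$ref" of a ResponseReference (an assumption,
# not part of any published OpenAPI schema):
#   - first alternative: the value does not start with "#", so it is a file
#     or remote reference and is always allowed through;
#   - second alternative: a local pointer into components/responses.
RESPONSE_REF_PATTERN = r"^(?!#)|^#/components/responses/[a-zA-Z0-9._-]+$"


def matches_pattern(pattern: str, value: str) -> bool:
    """Mimic JSON Schema's `pattern` keyword, which is an unanchored search."""
    return re.search(pattern, value) is not None


if __name__ == "__main__":
    for ref in (
        "#/components/responses/NotFound",
        "#/components/schemas/Pet",
        "common.yaml#/components/schemas/Pet",
    ):
        print(ref, matches_pattern(RESPONSE_REF_PATTERN, ref))
```

Under this sketch, a local pointer to a schema is rejected where a response reference is expected, while anything that is not a same-document fragment passes unchecked.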

Correspondingly, every place across the schema that is currently defined as "X or possibly a reference" would be edited into "X or possibly an X-reference" (a response reference, parameter reference, example reference, and so on), using the specialized reference types illustrated above.
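As a rough illustration of that rewrite, the following Python sketch walks a schema fragment and, wherever a `oneOf` offers exactly one concrete definition alongside the generic Reference, points the generic entry at a specialized one. It assumes the alternatives are written as `oneOf: [{"$ref": "#/definitions/X"}, {"$ref": "#/definitions/Reference"}]`; the function name and the `XReference` naming scheme are hypothetical.

```python
GENERIC_REF = "#/definitions/Reference"


def specialize_references(node):
    """Return a copy of `node` where each `oneOf: [X, Reference]` pair
    uses a hypothetical `XReference` definition instead of `Reference`."""
    if isinstance(node, list):
        return [specialize_references(item) for item in node]
    if not isinstance(node, dict):
        return node

    result = {key: specialize_references(value) for key, value in node.items()}
    options = result.get("oneOf")
    if isinstance(options, list):
        refs = [o.get("$ref") for o in options if isinstance(o, dict)]
        targets = [r for r in refs if isinstance(r, str) and r != GENERIC_REF]
        # Only rewrite when it is unambiguous which definition is expected.
        if GENERIC_REF in refs and len(targets) == 1:
            specialized = targets[0] + "Reference"
            result["oneOf"] = [
                {**o, "$ref": specialized}
                if isinstance(o, dict) and o.get("$ref") == GENERIC_REF
                else o
                for o in options
            ]
    return result


if __name__ == "__main__":
    fragment = {
        "additionalProperties": {
            "oneOf": [
                {"$ref": "#/definitions/Response"},
                {"$ref": "#/definitions/Reference"},
            ]
        }
    }
    print(specialize_references(fragment))
```

Places where the generic Reference appears next to several candidate definitions are left untouched by this sketch and would need to be handled by hand.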

Even though this improvement can only benefit local references and not external ones, the former are basic and frequent enough that the common mistakes and misunderstandings this could prevent should make it worthwhile.

Thanks for reading! How appropriate or feasible does this sound? :)

MikeRalphson commented 1 year ago

I get the enhancement you're shooting for here, but using regexes sounds complex to maintain and potentially brittle. Currently $refs are allowed to point anywhere in the document or subdocument, and subdocuments don't even have to be structured like an OpenAPI document. External $refs are common and complex. How would this be achievable in practice?

handrews commented 1 year ago

This use case will be addressed by the OAS compliance parser project (which does not currently have its own repository). See Appendix D in that document for the currently funded scope for work in 2023.

Semantic validation technologies are better suited for this sort of relational enforcement than JSON Schema.

char0n commented 6 months ago

Hi everybody,

> Even though this improvement can only benefit local references and not external ones

I don't think it's possible to distinguish internal or external references just with regex without further context.

OpenAPI 3.0.x spec says:

> Relative references used in $ref are processed as per JSON Reference, using the URL of the current document as the base URI. See also the Reference Object.

This IMHO means that if you have the following reference: $ref=https://example.com#/components/schemas/schema1, and the URL of the current document is https://example.com, then the reference is actually an internal one written in absolute form, not an external one. It becomes an external one if the URL of the current document changes to e.g. https://google.com/.
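The point can be sketched in a few lines of Python using only the standard library. The helper name `is_internal_ref` is hypothetical, and the comparison is a plain string match on the resolved document URL, so it does no URI normalization (for instance, `https://example.com` and `https://example.com/` would count as different documents here).

```python
from urllib.parse import urldefrag, urljoin


def is_internal_ref(ref: str, document_url: str) -> bool:
    """Resolve `ref` against the current document's URL and report whether
    it points back into that same document (ignoring the fragment)."""
    resolved = urljoin(document_url, ref)
    target_document, _fragment = urldefrag(resolved)
    current_document, _ = urldefrag(document_url)
    return target_document == current_document


if __name__ == "__main__":
    ref = "https://example.com#/components/schemas/schema1"
    # Same reference string, two different base URLs.
    print(is_internal_ref(ref, "https://example.com"))
    print(is_internal_ref(ref, "https://google.com/"))
```

The same reference string is classified differently depending on the base URL, which is information a `pattern` regex applied to the string alone does not have.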