Proposal: x-oas-draft-alternativeSchemas

darrelmiller commented 6 years ago

A long-standing request has been to add support for alternate schemas. See OAI/OpenAPI-Specification#764 and OAI/OpenAPI-Specification#1443.

There are many concerns about the effect this will have on the tooling ecosystem; however, there is also a strong desire from the community to support both more recent versions of JSON Schema and other formats such as XSD and Protobuf schemas.

Instead of simply adding this feature to the specification and hoping that tooling vendors will implement it, I propose we use the draft feature process (PR OAI/OpenAPI-Specification#1531) to validate the value and feasibility of this feature.

openapi: 3.0.2
info:
  title: A sample using real JSON Schema and xsd
  version: 1.0.0
paths:
  /:
    get:
      responses:
        '200':
          description: Ok
          content:
            application/json:
              x-oas-draft-alternate-schema:
                type: json-schema
                externalValue: ./rootschema.json
            application/xml:
              x-oas-draft-alternate-schema:
                type: xml-schema
                externalValue: ./rootschema.xsd

The property is called alternate-schema because it can be used instead of, or in addition to, the OAS Schema object. If both are present then both schemas must be respected.
The type field is required and must match one of the values identified in an alternate schema registry which will be created as part of this proposal. The alternate schema registry will provide a link to a specification for the schema file.
It is recommended that tools provide users with warnings when they encounter an alternate schema type that they do not support. Tools should not fail to process a description, unless the schema is essential to the operation. Ideally they should continue as if the alternate-schema is not present in a form of graceful degradation.
The externalValue property is required and must be a relative or absolute URL.
The alternate-schema property can be used anywhere the schema property can be used.
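
A minimal Python sketch of the graceful-degradation rule described above, assuming a tool that only understands the json-schema type. The function name usable_alternate_schema and the SUPPORTED_TYPES set are hypothetical and not part of the proposal; the field names follow the sample document.

```python
import warnings

# Hypothetical: the alternate schema types this particular tool can process.
SUPPORTED_TYPES = {"json-schema"}


def usable_alternate_schema(media_type_obj):
    """Return the alternate schema entry if this tool supports it, else None.

    An unsupported type produces a warning rather than an error, so the
    caller can continue as if the alternate schema were not present.
    """
    alt = media_type_obj.get("x-oas-draft-alternate-schema")
    if alt is None:
        return None
    # Both fields are required by the proposal.
    for field in ("type", "externalValue"):
        if field not in alt:
            raise ValueError(f"alternate schema is missing required field {field!r}")
    if alt["type"] not in SUPPORTED_TYPES:
        warnings.warn(f"unsupported alternate schema type {alt['type']!r}; ignoring it")
        return None
    return alt


content = {
    "application/json": {
        "x-oas-draft-alternate-schema": {
            "type": "json-schema",
            "externalValue": "./rootschema.json",
        }
    },
    "application/xml": {
        "x-oas-draft-alternate-schema": {
            "type": "xml-schema",
            "externalValue": "./rootschema.xsd",
        }
    },
}

# Record warnings so the demo can show that exactly one entry was skipped.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    results = {mt: usable_alternate_schema(obj) for mt, obj in content.items()}
```

Here the JSON entry is returned unchanged, while the XML entry is skipped with a single warning and processing carries on.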

darrelmiller commented 6 years ago

@philsturgeon You don't see the $ref issue of having to keep the $ref'd schema in sync with alternate schema externalValue to be problematic? Especially if the $ref'd OAS schema is actually in another file.

tedepstein commented 6 years ago

I’m really liking the level of integration afforded by making alternative schemas a property of the Schema Object, especially the possibilities of using Boolean assertions for composition: allOf, anyOf, oneOf, etc.

I posed a bunch of questions and concerns in previous comments, before I had fully taken in this new idea. Now, I actually think I can go back and answer many of my own questions, using the new structure.

handrews commented 6 years ago

@darrelmiller oneOf cannot be evaluated in the way that you mention, as you need to verify that exactly one alternative works. anyOf can be short-circuited for assertion processing, but not for annotation processing. You know that the assertion result is true as soon as one subschema validates, but for applications that care about annotations (and code generation will be one such application), you need to examine all subschemas as all annotations from valid subschemas must be collected.

If you want a true "stop at first usable schema" approach then that needs to be an OAS extension keyword. Redefining the behavior of oneOf/anyOf would be very confusing.

You don't see the $ref issue of having to keep the $ref'd schema in sync with alternate schema externalValue to be problematic? Especially if the $ref'd OAS schema is actually in another file.

I'm confused on this: aren't all of the alternate schemas in external files? Why would the OAS schema be any harder to keep in sync? I may have missed something here...

handrews commented 6 years ago

The $ref would imply that it's a JSON Reference, which is not necessarily what we want in all cases

In draft-08 we are redefining $ref from "logically replace this with the target" to "this has the same results as evaluating the target", in part to ensure that you can $ref between schemas that use different processing rules. So OAS could adopt that definition for the OAS-specific schema when $ref is used with x-oas-draft-alternateSchema (to avoid breaking compatibility elsewhere).

tedepstein commented 6 years ago

@handrews wrote:

OAS cannot define fragment syntax and semantics, in a registry or otherwise. Only the media type of the target representation can define those semantics. JSON Schema defines semantics for JSON Pointer and plain name fragment syntax. XML defines fragment based on the XPointer framework. These are not re-definable.

OK, makes perfect sense that it's defined in the media type. As long as fragment resolution is defined somewhere, I'm happy.

tedepstein commented 6 years ago

Closing the loop on some of my earlier comments, to the extent that I think these are addressed by moving alternative schemas into Schema Object...

...can you include an OAS Schema Object (i.e. a "standard" OpenAPI schema) in alternateSchemas list, as a way to specify its order of preference with respect to other alternateSchemas?

Yes, using anyOf (or a new boolean logic assertion, TBD...)

anyOf:
  - alternative-schema:
      type: jsonSchema
      externalValue: ./myschema-draft-07.json
  - alternative-schema:
      type: jsonSchema
      externalValue: ./myschema-draft-06.json
  - type: object
    properties:
      foo:
        type: string
    # ...

If you don't do this, but you do specify both schema and alternativeSchemas, is the standard schema entry assumed to go at the end of the list, as the least preferred option?

IIUC, including an alternative schema in a Schema Object that also has assertions is equivalent to combining the two with allOf. These two Schema Objects should be equivalent:

Explicit allOf:

allOf:
  - type: object
    properties:
      foo:
        type: string
    # ...
  - alternative-schema:
      type: jsonSchema
      externalValue: ./myschema-draft-07.json

Implicit allOf:

  type: object
  properties:
    foo:
      type: string
  # ...
  alternative-schema:
    type: jsonSchema
    externalValue: ./myschema-draft-07.json

This was true before the refactoring of alternative schemas into Schema Object. The semantics of a Schema Object with assertions+alternative schema are the same as the previously specified semantics of using schema+x-oas-draft-alternativeSchemas.

But I think it's clearer now, for reasons I'll explain in the next point...
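
To make the claimed equivalence concrete, here is a small Python sketch that rewrites the implicit form into the explicit allOf form. It assumes the equivalence holds as described above; the helper name to_explicit_all_of is hypothetical, and the key alternative-schema is taken from the examples.

```python
ALT_KEY = "alternative-schema"


def to_explicit_all_of(schema):
    """Rewrite a Schema Object that mixes native keywords with an
    alternative schema into the explicit allOf form.

    Schemas without an alternative schema, or with nothing but one,
    are returned unchanged.
    """
    if ALT_KEY not in schema or len(schema) == 1:
        return schema
    native = {k: v for k, v in schema.items() if k != ALT_KEY}
    return {"allOf": [native, {ALT_KEY: schema[ALT_KEY]}]}


implicit = {
    "type": "object",
    "properties": {"foo": {"type": "string"}},
    "alternative-schema": {
        "type": "jsonSchema",
        "externalValue": "./myschema-draft-07.json",
    },
}

explicit = to_explicit_all_of(implicit)
```

The result has two allOf branches: the native assertions in one, and the alternative schema alone in the other.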

The property is called alternative-schema because it can be used instead of, or in addition to, the OAS Schema object. If both are present then both schemas must be respected.

When we say both schemas must be respected, that means the content must conform to both schemas in order to be valid. But it doesn't necessarily mean that implementations are required to validate against both schemas, or otherwise "use" both schemas. Is that right?

That's from the earlier design.

But now we have (or will have) a way to explicitly place a native OAS schema in a list, in preferential order relative to other schemas. The implicit and explicit allOf semantics are clearer now (to me, anyway) because they don't need to address the order-of-preference use case. That use case is handily covered by anyOf.

So it's clear that validators and other processors must always process all of the native OAS schema assertions, annotations, and/or alternative schemas provided in an allOf scope, whether explicit or implicit.

tedepstein commented 6 years ago

@handrews,

oneOf cannot be evaluated in the way that you mention, as you need to verify that exactly one alternative works. anyOf can be short-circuited for assertion processing, but not for annotation processing. You know that the assertion result is true as soon as one subschema validates, but for applications that care about annotations (and code generation will be one such application), you need to examine all subschemas as all annotations from valid subschemas must be collected.

OK, I see that requirement specified here, and I understand the need for it.

Also, anyOf doesn't imply an order of preference, so processors could evaluate the subschemas in any order, and short-circuit validation after evaluating a subschema that is not the most preferred.

If you want a true "stop at first usable schema" approach then that needs to be an OAS extension keyword. Redefining the behavior of oneOf/anyOf would be very confusing.

I think this is important, both to allow more efficient validation, and to avoid the potential complexities of handling conflicting annotations from multiple subschemas.

But these concerns don't seem unique to OpenAPI. If we introduced a new firstOf or firstOneOf that did exactly what we're describing here, would JSON Schema consider adding it to a future draft?

tedepstein commented 6 years ago

Regarding usage: I'll withdraw this proposal.

I'm still not entirely comfortable that we have fully defined how native JSON Schema/Schema Object assertions and annotations are supposed to be applied to content that is not JSON, and does not have a well-defined, authoritative mapping to JSON (like YAML).

I'll try to describe those concerns in a later post or on Monday's call. For now, I just want to pull usage off the table, so @darrelmiller and @handrews can sleep better. ;-)

darrelmiller commented 6 years ago

UPDATED

This is a proposal to add a new field called alternativeSchema to the OAS Schema Object. While still in draft, the field name will be prefixed with x-oas-draft-.

Schema Object

Fixed Fields

Field Name	Type	Description
x-oas-draft-alternativeSchema	Alternative Schema Object	An external schema that participates in the validation of content along with other schema keywords.

Alternative Schema Object

This object makes it possible to reference an external file that contains a schema that does not follow the OAS specification. If tooling does not support the type, tooling MUST consider the content valid but SHOULD provide a warning that the alternative schema was not processed.

Fixed Fields

Field Name	Type	Description
type	`string`	REQUIRED. The value MUST match one of the values identified in the Alternative Schema Registry.
location	`url`	REQUIRED. This is an absolute or relative reference to an external resource containing a schema of a known type. This reference may contain a fragment identifier to reference only a subset of an external document.

This object MAY be extended with Specification Extensions.

Examples

Minimalist usage of alternative schema:

schema:
  x-oas-draft-alternativeSchema:
    type: jsonSchema
    location: ./real-jsonschema.json

Combination of OAS schema and alternative:

schema:
  type: object
  nullable: true
  x-oas-draft-alternativeSchema:
    type: jsonSchema
    location: ./real-jsonschema.json

Multiple different versions of alternative schema:

schema:
  anyOf:
  - x-oas-draft-alternativeSchema:
      type: jsonSchema
      location: ./real-jsonschema-08.json
  - x-oas-draft-alternativeSchema:
      type: jsonSchema
      location: ./real-jsonschema-07.json

Combined alternative schemas:

schema:
  allOf:
  - x-oas-draft-alternativeSchema:
      type: xmlSchema
      location: ./xmlSchema.xsd
  - x-oas-draft-alternativeSchema:
      type: schematron
      location: ./schema.sch

Mixed OAS schema and alternative schema:

schema:
  type: array
  items:
    x-oas-draft-alternativeSchema:
      type: jsonSchema
      location: ./real-jsonschema.json

Alternative Schema Registry

Note this is a placeholder registry. Don't take the values seriously. The Alternative Schema Registry is located at https://spec.openapis.org/registries/alternative-schema. Initial contents of the registry include:

Name	Link	Description
jsonSchema
xsdSchema

earth2marsh commented 6 years ago

Minor nit: does "Value" add anything in the name externalValue? Every name has a value, ideally, after all. :)

Suggest externalSchema to better align with externalDocs. WDYT?

darrelmiller commented 6 years ago

@earth2marsh externalValue was chosen to match the way example object works. ExternalDocs is the containing object that has a url property. We could take that approach, but I feel like externalValue is more explicit.

tedepstein commented 6 years ago

@darrelmiller, in the example labeled "Combined alternative schemas", shouldn't it be allOf rather than anyOf?

I'm assuming this is supposed to work the same way as the previous "Layered Validation" example. The two schemas are complementary, covering different aspects of validation, and are intended to be used in combination.

I'm also assuming that the following rule would apply in allOf scenarios like this:

It is recommended that tools provide users with warnings when they are unable to find an alternate schema that they support. Tools should not fail to process a description, unless the schema is essential to the operation. Ideally they should continue as if the alternate-schema is not present in a form of graceful degradation.

So if the processor only supports some alternative schemas in an allOf group, it should process the ones it can, and warn on the others.

earth2marsh commented 6 years ago

@darrelmiller externalValue in examples complements the in-line value, plus it's actually an example value. In the case of alternativeSchema (an aside, wondering if altSchema would be worth considering?), the item being referenced, while it certainly has a value (as should any reference), really is an external schema, hence my pref for externalSchema as the name.

handrews commented 6 years ago

and to avoid the potential complexities of handling conflicting annotations from multiple subschemas.

We are further formalizing how annotations are collected in draft-08, building on the work that is already present in draft-07. There are no problems with "conflicting" annotations; the collection process provides applications (in this case, OAS implementations) with enough information to disambiguate the values (or just keep them all, or whatever).

But these concerns don't seem unique to OpenAPI. If we introduced a new firstOf or firstOneOf that did exactly what we're describing here, would JSON Schema consider adding it to a future draft?

Unlikely. I'm not aware of this ever coming up before, and I have read a huge number of JSON Schema issues and feature requests. TBH I'm not even sold on the idea that OAS needs it. If you want to do validation only and not annotations, the spec already allows for short-circuit validation in such cases. If you want to do annotations, there is no conceivable way to short-circuit anyOf, contains, etc.

You can still short-circuit allOf on the first failure (as all annotations are dropped), and you can still short-circuit oneOf on the second successful validation (which fails overall, again meaning that all annotations are dropped). The annotation collection/dropping conditions will be more explicit in draft-08.
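
The short-circuit rules described in this comment can be sketched in Python as follows. This is a toy model and not a JSON Schema implementation: each subschema is assumed to be a callable returning a (valid, annotations) pair, and the names any_of, one_of, all_of, and traced are hypothetical.

```python
def any_of(subschemas, instance, collect_annotations):
    """anyOf: valid if at least one subschema validates."""
    valid, annotations = False, []
    for sub in subschemas:
        ok, ann = sub(instance)
        if ok:
            valid = True
            if not collect_annotations:
                break  # assertion-only: the first success settles the result
            annotations.append(ann)  # annotation mode: every subschema is visited
    return valid, annotations


def one_of(subschemas, instance):
    """oneOf: valid if exactly one subschema validates."""
    matches = []
    for sub in subschemas:
        ok, ann = sub(instance)
        if ok:
            matches.append(ann)
            if len(matches) == 2:
                return False, []  # second success: overall failure, annotations dropped
    return (True, matches) if len(matches) == 1 else (False, [])


def all_of(subschemas, instance):
    """allOf: valid if every subschema validates."""
    annotations = []
    for sub in subschemas:
        ok, ann = sub(instance)
        if not ok:
            return False, []  # first failure: overall failure, annotations dropped
        annotations.append(ann)
    return True, annotations


calls = []  # records which subschemas were actually evaluated


def traced(name, predicate):
    def sub(instance):
        calls.append(name)
        return predicate(instance), {"title": name}
    return sub


subschemas = [
    traced("int", lambda x: isinstance(x, int)),
    traced("positive", lambda x: x > 0),
    traced("even", lambda x: x % 2 == 0),
]

valid_fast, _ = any_of(subschemas, 4, collect_annotations=False)
visited_fast = list(calls)  # only the first subschema was evaluated

calls.clear()
valid_full, annotations_full = any_of(subschemas, 4, collect_annotations=True)
visited_full = list(calls)  # all three subschemas were evaluated
```

With the instance 4, assertion-only anyOf stops after the first subschema, while annotation collection has to visit all three.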

tedepstein commented 6 years ago

TBH I'm not even sold on the idea that OAS needs it. If you want to do validation only and not annotations, the spec already allows for short-circuit validation in such cases. If you want to do annotations, there is no conceivable way to short-circuit anyOf, contains, etc.

That's true for the intended anyOf use case. The logic says, for each subschema, if the instance validates against that schema, then include its annotations.

In our firstOf case, we assume that each of the schemas in the list is sufficient; so once the processor finds a schema it can understand, it doesn't need to visit other schemas in the list.

Clearly our use case is different. JSON Schema's boolean assertions (at least through draft-07) naturally assume the processor understands the entire schema language, so the conditional logic is all about whether the instance validates. In our case, we're asking a different question - whether the subschema is usable by the processor. If so, we use that schema, and no others. I don't think it's our intent to proceed to the next schema, if we've already found a schema we can use, and the instance has already failed validation against that schema.

IMO, this justifies introducing a firstOf assertion, rather than trying to repurpose anyOf. But I can understand that this might not be of interest outside of OpenAPI.
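
A rough Python sketch of the "first usable schema" behavior argued for above, to contrast it with anyOf. Nothing here is specified by OpenAPI or JSON Schema: first_usable, validate_first_of, and the validators mapping are hypothetical, and the type and location values are borrowed from the examples in this thread.

```python
def first_usable(alternatives, supported_types):
    """Return the first alternative whose schema type the processor supports.

    Selection depends only on what the processor understands, not on
    whether the instance validates.
    """
    for alt in alternatives:
        if alt["type"] in supported_types:
            return alt
    return None


def validate_first_of(alternatives, instance, validators):
    """validators maps a schema type to a callable(location, instance) -> bool.

    Only the first usable alternative is consulted. If the instance fails
    against it, later alternatives are NOT tried.
    """
    chosen = first_usable(alternatives, validators.keys())
    if chosen is None:
        return None  # nothing usable; the caller decides how to degrade
    return validators[chosen["type"]](chosen["location"], instance)


alternatives = [
    {"type": "schematron", "location": "./schema.sch"},
    {"type": "jsonSchema", "location": "./myschema-draft-07.json"},
    {"type": "jsonSchema", "location": "./myschema-draft-06.json"},
]

tried = []  # locations that were actually used for validation


def failing_json_validator(location, instance):
    tried.append(location)
    return False  # pretend the instance is invalid


result = validate_first_of(alternatives, {"foo": 1}, {"jsonSchema": failing_json_validator})
```

The processor skips the schematron entry it cannot use, validates against the draft-07 schema only, and reports failure without falling through to the draft-06 entry.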

handrews commented 6 years ago

@tedepstein the use case is so specific that I wouldn't even call it firstOf, as that sounds like a thing that people could use in general or could be in the regular schema vocabulary. I'd call it schemaList or something else very tied to how it is being used and why it behaves that specific way.

zolyfarkas commented 6 years ago

Can we also add Avro (https://avro.apache.org) as an alternative schema, and also a way to provide the schema inline (since Avro schemas are JSON)?

Here is some detail on my use case:

We currently use Avro schemas to declare the payloads of our REST services. Avro schemas are JSON, so we currently inline them in the Swagger JSON via a custom x-avro-schema field. Every Avro REST endpoint we have will return the content either as JSON or as Avro binary, depending on the Accept header provided by the client (application/json or application/octet-stream). This is very important where the content is large (arrays), since the binary payload can be an order of magnitude smaller than the JSON payload. Although you can do a limited conversion of an Avro schema into a Swagger model, the two formats do not have feature parity and probably never will, so it would be really useful to have a more "official" way of declaring different schema types...

Here is a bit more detail on the use case:

Schemas are developed in schema projects (maven), and can depend on each other for reusability. The maven repository is used to publish/distribute the schemas, language bindings, documentation and/or gh-pages.
An example REST service built with Avro schemas.

Come to think of it, it might make sense to package and distribute open api specs with maven/maven repos as well... right now everybody seems to copy/paste these swagger definitions (like here). Publishing them to a maven repo would make things better...

adjenks commented 5 years ago

This looks great. Decoupling the schema specifications from one another seems like a wise choice. I would like to be able to inline the definition though. Otherwise I'd need to have a pile of small external schemas and that's not particularly convenient.

darrelmiller commented 5 years ago

@adjenks The problem with allowing inlining is that it would only work for JSON and compatible YAML formats. You don't necessarily have to have lots of small external files. You could use a single external schema file and use a JSON Reference fragment identifier to point into the external file.
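
As a sketch of the single-file approach suggested here, the following Python shows how a tool might split a location into resource and fragment and then resolve the fragment as a JSON Pointer (RFC 6901). It assumes the target is JSON; the file name ./schemas.json, its contents, and the helper names are hypothetical, and the file is simulated with an in-memory string rather than read from disk.

```python
import json
from urllib.parse import unquote, urldefrag


def split_location(location):
    """Split a location into (resource URL, percent-decoded fragment)."""
    url, fragment = urldefrag(location)
    return url, unquote(fragment)


def resolve_json_pointer(document, pointer):
    """Resolve an RFC 6901 JSON Pointer against a parsed JSON document."""
    if pointer == "":
        return document  # the empty pointer refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError(f"not a JSON Pointer: {pointer!r}")
    node = document
    for token in pointer[1:].split("/"):
        # Unescape in the order required by RFC 6901: ~1 first, then ~0.
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


# Stand-in for the contents of one external file holding several schemas.
EXTERNAL_FILE_TEXT = """
{
  "definitions": {
    "Pet":   {"type": "object", "required": ["name"]},
    "Owner": {"type": "object"}
  }
}
"""

url, pointer = split_location("./schemas.json#/definitions/Pet")
pet_schema = resolve_json_pointer(json.loads(EXTERNAL_FILE_TEXT), pointer)
```

Each alternative schema in the OpenAPI document can then point at a different fragment of the same file.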

fmvilas commented 5 years ago

If the external file is not JSON (e.g., Protobuf or XSD) you'll not be able to use JSON Reference fragment identifiers. Or you will if you put the values in strings, but in this case why not allow inlining as a string in the original OpenAPI document?

philsturgeon commented 5 years ago

I think the advice was meant to be: if you want to inline then it has to be JSON or similar. If that’s the case you can have one file and use fragments.

If that’s NOT the case you’re not going to be able to use fragments anyway, so better use solo external files or whatever.

-- Phil Sturgeon @philsturgeon

On Mar 25, 2019, at 08:16, Fran Méndez notifications@github.com wrote:

If the external file is not JSON (e.g., Protobuf or XSD) you'll not be able to use JSON Reference fragment identifiers. Or you will if you put the values in strings but in this case why not allowing inlining as a string in the original OpenAPI document?


tedepstein commented 5 years ago

I also think it's really important to allow alternative schemas to live inside the primary OpenAPI document if that's what makes sense to the API designer. It will certainly be a lot more convenient and a lot more readable for certain users, using certain schema formats in certain usage patterns.

Like other reusable components (where we use $ref), and like example values (where we use value and externalValue), the API designer should be able to decide whether to keep the value inline or refer to an external resource.

In cases where the schema format is not YAML or JSON, they can be embedded as string values. This doesn't put any special burden on tools; extracting and processing a string property value is just as easy, maybe easier, than resolving a URL.

I think this is similar to the current example property and the examples --> [Example Object] construct, used in Parameter and Media Type objects. These are also designed to support JSON, YAML, and other formats. So the example property and the Example Object.value property are defined as type Any.

Example Object allows the URL-typed externalValue property as an alternative to value. I assume OpenAPI chose to separate these two cases instead of using $ref, because we generally use $ref only when it refers to one of our reusable objects in the Components Object. Components Object has a map of Example Objects. Adding a separate map of example values could be confusing, or just overkill, so we didn't do that, and didn't use $ref in that case. Is this more or less correct, historically speaking...?

It looks to me like the same reasoning applies with alternativeSchema.
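
To illustrate the string-embedding idea from this comment, here is a small Python sketch that stores an XSD as a plain string property, round-trips it through JSON, and hands the extracted text to a standard XML parser. The inline field name value is purely hypothetical (the proposal only defines type and location), and the XSD content is made up for the example.

```python
import json
import xml.etree.ElementTree as ET

# A tiny XSD, embedded as an ordinary string.
XSD_TEXT = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="foo" type="xs:string"/>
</xs:schema>
"""

schema_object = {
    "x-oas-draft-alternativeSchema": {
        "type": "xsdSchema",
        "value": XSD_TEXT,  # hypothetical inline counterpart to location
    }
}

# Round-trip through JSON to show the string survives serialization intact.
restored = json.loads(json.dumps(schema_object))
inline_text = restored["x-oas-draft-alternativeSchema"]["value"]

# Hand the extracted text to native XML tooling.
root = ET.fromstring(inline_text)
element_names = [child.get("name") for child in root]
```

Extraction itself is a single property lookup. The sketch does not address cross-references between embedded schemas.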

tedepstein commented 5 years ago

BTW, the parallel with Example Object came up earlier in this discussion. @darrelmiller said we're using externalValue to follow the precedent of Example Object, while @earth2marsh argued that externalSchema would be clearer.

FWIW, I think value and externalValue make sense where you have a Foo Object as a narrowly scoped, thin wrapper around a Foo. You have a few properties that are clearly metadata about the Foo, and you have value or externalValue as the actual Foo entity. That naming is intuitive (to me) as long as Foo Object remains as a thin wrapper. It could get less intuitive if Foo Object later gets augmented with more substantial properties, to the point where it's less clear what value is referring to.

Bottom line, I don't have a strong opinion about that. My argument in support of inline value as an alternative to externalValue applies independently of the naming question. I'll be just as happy if it's schema and externalSchema. Or schemaValue and externalSchemaValue.

fmvilas commented 5 years ago

Food for thought:

What would be the downside of inlining something that's not JSON/YAML?
Are we really taking into account the user experience? It seems overkill to me that we have to use more than one file even if it's a simple/small API. All just because I use, say, protobuf.

aowss commented 5 years ago

I thought this proposal was for a pluggable type system, i.e. enabling the use of JSON Schema or any other schema language in place of the OpenAPI built-in schema. I think this is a great idea.

It seems combining different type systems is now part of this proposal.

This seems to complicate the syntax with the introduction of schema constructs, e.g. anyOf, within the schema element.
This assumes that these schema languages can be combined using these constructs. This might seem true at first glance but I am not sure it is as obvious as it seems, e.g. the use of default in these different languages.
The syntax for combining schemas is different if one of the schemas is an OAS schema.
I am not clear about the real use case for that.

earth2marsh commented 5 years ago

Surely this is a problem that could be solved with RFC1341? Or better yet, CDATA? ;)

In all seriousness though, it already feels weird to me to express in YAML a syntax for describing JSON, but at least it's consistent with the YAML/JSON expression of the spec itself. In-lining other validation languages seems potentially perilous... I even balk at the idea of supporting multiple versions of JSON Schema in the spec just because there's benefit in having one narrow happy path.

@tedepstein for me the "value" is what you get when you dereference a thing, which is why I prefer externalSchema to externalSchemaValue. In this case, "external" connotes that the value lies elsewhere.

OTOH, +1 to @fmvilas 's point about considering the user experience carefully. No question that having XSD inlined simplifies the initial spec authoring task when trying to describe non-JSON formats. But it also bloats spec files over time as schemas and examples grow and grow. Having good multi-file support can help separate that to keep the core spec tighter, but it somewhat depends on having good tooling to support it.

One parting thought, "Why WADL when you can Swagger" goes all the way back to the naming origin where we laughed about how no one wanted to describe JSON APIs in XML, and yet here we are considering the reverse. At least history has a sense of humor?

cmheazel commented 5 years ago

@earth2marsh We are not using XML to define a JSON API. In addition to JSON, an API may deliver resources in XML, Protobuf, JPEG, etc. The use case behind alternative schema was delivery of a resource which is defined by an XML schema AND a set of Schematron rules. The OAS schema construct allows us to build a logical expression capturing the AND / OR relationship when multiple alternative schemas are applicable. That expression describes the resource, not the API. And logically we should use the proper descriptive language for the type of resource we are describing.

handrews commented 5 years ago

@fmvilas

If the external file is not JSON (e.g., Protobuf or XSD) you'll not be able to use JSON Reference fragment identifiers.

That's not quite true. $ref values are just URIs, and in URIs, the fragment syntax is determined by the media type of the target resource. So assuming XSD has a fragment syntax (I'm guessing it inherits XML's, but I'm too lazy to look right now), then $ref-ing an XSD with a fragment to reference some portion of it is valid. At least assuming it is ever valid to reference part of an XSD document (I know basically nothing about XSD).

Protobuf is more complicated b/c it does not have a media type and therefore cannot properly define a fragment syntax. Same with YAML, although one could make a case for JSON Pointer, I suppose, since using YAML as a media type involves making up an extension media type anyway; that's what OAS does and it works just fine. But those are limitations of the formats, not limitations of $ref.

@tedepstein

In cases where the schema format is not YAML or JSON, they can be embedded as string values.

Modern JSON Schema accounts for this with contentMediaType and contentEncoding, so if OAS wants to support this we can still write a meta-schema for it if we're willing to move to a recent draft 😛

@earth2marsh

it already feels weird to me to express in YAML a syntax for describing JSON, but at least it's consistent with the YAML/JSON expression of the spec itself.

JSON Schema proper handles this by defining things over a data model which is derived from JSON rather than directly over JSON text. This allows for working with, say, YAML or CBOR as long as the mapping into the data model is clear. Working with XML would be challenging as the data model (elements, tags, contents, CDATA) is more complex.

darrelmiller commented 5 years ago

1) We spent a non-trivial amount of time deciding that we want to support multiple schema types for a number of reasons and it turned out that using the existing schema syntax was the cleanest way to do that.

2) If someone is considering using alternative schemas, they are already a long way past the "simple case". This is an advanced feature.

3) While inlining foreign schemas seems like low-hanging fruit, I don't think it necessarily is. It introduces inconsistencies in handling different types of schemas. It introduces escaping issues. It raises the question of how future versions of JSON Schema should be rendered in YAML documents. We already have this issue with examples and the answers are not clear.

4) I would be willing to discuss how we might support inlined schemas in the future if deployment experience shows it is essential. We can ensure this would not be a breaking change.

philsturgeon commented 5 years ago

I am very confused about what use cases people are proposing. Let's ask a simple question: why on earth would somebody want to have their protobuf definitions inlined into OpenAPI? That would make them useless as they would have to split them out in order to use them for their actual code.

No. These protobufs already exist as .proto files, so referencing them from OpenAPI seems fine.

Smashing them in as strings or CDATA or something is all technically possible (other than potential escaping troubles as @darrelmiller mentions), but I'm really scratching my head as to why you would want to do that.

tedepstein commented 5 years ago

@philsturgeon,

I am very confused about what use cases people are proposing. Let's ask a simple question: why on earth would somebody want to have their protobuf definitions inlined into OpenAPI? That would make them useless as they would have to split them out in order to use them for their actual code.

That depends entirely on the user and the toolchain. The splitting could be done by a code generator or some other downstream processor. And the potential benefit of having them in the OpenAPI file is readability and ease of maintenance. The exact same reason why most OpenAPI documents have their schemas inline, or in components/schemas, unless and until there's a good reason to move them to a separate file.

Tools might build in validation and other kinds of editing support for alternative schema languages. But some users could find it best to maintain these schemas in the source OpenAPI spec even without some of these editing and codegen features.

Smashing them in as strings or CDATA or something is all technically possible (other than potential escaping troubles as @darrelmiller mentions), but I'm really scratching my head as to why you would want to do that.

The idea of "strings or CDATA or something" might make me nervous, except for the fact that we've been maintaining an OpenAPI editor for some years now, and have never really had a problem with strings. Between quoted and block styles, with folded and literal variants, YAML seems to have a string syntax for every occasion, and the parsing of string values seems very robust in the stack we're using (SnakeYAML + YEdit, with our own processing mostly written with Jackson, IIRC).

I would like to understand better what kinds of encoding issues we've seen with example properties and Example Objects.

The "can" vs. "should" question applies on another level as well. Just because the OpenAPI specification can decide for the entire ecosystem whether inline alternative schemas are a good idea or a bad one, doesn't mean it should.

If there are legitimate difficulties in supporting it, that might be a good reason to disallow this. But the judgment as to whether it's desirable and advisable really should be made by OpenAPI users and tool providers. I don't think we should be taking such a narrow view of what OpenAPI tools can do, and what usage patterns API designers might invent and embrace.

Seriously consider, for a moment, the possibility that something other than JSON Schema might emerge as a wildly popular schema language for OpenAPI, maybe for JSON message payloads, maybe for some other wire format, or both. And consider that this kind of unexpected evolution could be a great thing for OpenAPI and its user community.

Whether or not we think it's likely to happen, we should be laying the groundwork to allow for this kind of open evolution. If we impose a penalty on alternative schemas, taking them out of the flow, and making documents that use them harder to read and harder to maintain, we're stifling growth.

darrelmiller commented 5 years ago

Embedding foreign schemas into an OpenAPI document requires an OpenAPI parser that can extract the foreign schemas and then hand them off to an external tool or plugin for validation. Having the foreign schemas in external files means that you can validate/process the foreign schemas with native schema tooling.
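
As a rough illustration of the extract-and-hand-off step described above, here is a minimal Python sketch. It assumes the document has already been parsed into plain dicts and lists, and that inline schemas live under a `value` key, as in the protobuf example elsewhere in this thread; that key is not part of the proposal. The `looks_like_protobuf` function is a hypothetical placeholder standing in for a real external validator.

```python
from typing import Any, Callable, Dict, Iterator, Tuple

ALT_KEY = "x-oas-draft-alternativeSchemas"  # extension name used in this thread


def iter_inline_schemas(node: Any, path: str = "#") -> Iterator[Tuple[str, str, str]]:
    """Yield (location, type, text) for every inline alternative schema.

    The location is for display only; keys are not JSON Pointer escaped.
    """
    if isinstance(node, dict):
        for key, child in node.items():
            if key == ALT_KEY and isinstance(child, list):
                for i, entry in enumerate(child):
                    # Assumption: the inline schema text is stored under 'value'.
                    if isinstance(entry, dict) and "value" in entry:
                        yield f"{path}/{key}/{i}", entry.get("type", ""), entry["value"]
            else:
                yield from iter_inline_schemas(child, f"{path}/{key}")
    elif isinstance(node, list):
        for i, child in enumerate(node):
            yield from iter_inline_schemas(child, f"{path}/{i}")


def check_all(doc: Any, validators: Dict[str, Callable[[str], bool]]) -> Dict[str, str]:
    """Hand each extracted schema to the validator registered for its type."""
    report = {}
    for location, schema_type, text in iter_inline_schemas(doc):
        validator = validators.get(schema_type)
        if validator is None:
            # Unsupported types are skipped rather than treated as failures.
            report[location] = "skipped: unsupported type " + repr(schema_type)
        else:
            report[location] = "ok" if validator(text) else "invalid"
    return report


def looks_like_protobuf(text: str) -> bool:
    """Hypothetical stand-in for calling a real protobuf toolchain."""
    return text.lstrip().startswith("syntax")


doc = {
    "components": {
        "schemas": {
            "Person": {
                "type": "object",
                ALT_KEY: [
                    {"type": "protobuf", "value": 'syntax = "proto2";\nmessage Person {}\n'},
                    {"type": "xml-schema", "value": "<xs:schema/>"},
                ],
            }
        }
    }
}

report = check_all(doc, {"protobuf": looks_like_protobuf})
for location, status in report.items():
    print(location, "->", status)
```

With external files, none of this extraction is needed: the native tool is simply pointed at the file.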

Embedded schemas that reference each other for the purpose of reuse would require a referencing system in the foreign schema that can somehow reference another embedded schema in a file format it has zero knowledge about. Examples tend not to have this requirement. Schemas reference each other all the time. How would an embedded XSD schema in a string create an xsd:import that references another XSD that is embedded into the same OpenAPI document? My brain hurts thinking about the implications of using JSON References as fragment identifiers inside an XSD.

I have no fundamental objection to embedding alternate schemas but I don't think we should delay releasing external alternate schemas while we figure out these details.

tedepstein commented 5 years ago

I have no fundamental objection to embedding alternate schemas but I don't think we should delay releasing external alternate schemas while we figure out these details.

I agree, but do we even need to work out those details now?

Maybe the most we would need to figure out is a small set of issues about the OpenAPI media type (#110). Questions that come to mind:

Is there a media type for OpenAPI documents?
What fragment syntax(es) are supported by the OpenAPI document? (Presumably JSON Pointer would be one of these, maybe the only one.)

Given the answers to those questions:

If a given schema type has a referencing scheme that can't work with OpenAPI's URLs and media type, then that schema type simply won't be able to refer to other schemas embedded in the OpenAPI document.
If that's a big enough problem, because that schema type is important, and the folks using it really want to embed their schemas in OpenAPI documents, it can be addressed by updating the alternative schema language itself, and/or by doing something fancier with the OpenAPI media type in a later release.

So I'm saying that whole set of problems can be deferred until there's a real demand to address it. Meanwhile, other alternative schema formats that are more amenable to OpenAPI's fragment syntax can benefit from inline schema support.
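
To make the fragment question concrete, here is a minimal Python sketch of resolving a JSON Pointer (RFC 6901) fragment against a parsed OpenAPI document. It assumes JSON Pointer is the only fragment syntax in play, which is itself one of the open questions above. The `resolve_fragment` name and the sample document are illustrative only, and the inline `value` key follows the protobuf example in this thread, not the proposal.

```python
from typing import Any
from urllib.parse import unquote


def resolve_fragment(document: Any, fragment: str) -> Any:
    """Resolve a URI fragment such as '#/components/schemas/Person'."""
    pointer = fragment[1:] if fragment.startswith("#") else fragment
    pointer = unquote(pointer)  # undo percent-encoding used in URI fragments
    if pointer == "":
        return document  # the empty pointer refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError(f"not a JSON Pointer: {fragment!r}")
    node = document
    for raw in pointer[1:].split("/"):
        # RFC 6901 unescaping: '~1' first, then '~0'.
        token = raw.replace("~1", "/").replace("~0", "~")
        if isinstance(node, list):
            node = node[int(token)]
        else:
            node = node[token]
    return node


doc = {
    "paths": {"/products": {"get": {"summary": "Product Types"}}},
    "components": {
        "schemas": {
            "Person": {
                "x-oas-draft-alternativeSchemas": [
                    {"type": "protobuf", "value": 'syntax = "proto2";\nmessage Person {}\n'}
                ]
            }
        }
    },
}

summary = resolve_fragment(doc, "#/paths/~1products/get/summary")
embedded = resolve_fragment(
    doc, "#/components/schemas/Person/x-oas-draft-alternativeSchemas/0/value"
)
print(summary)
print(type(embedded).__name__)
```

The pointer can locate an embedded schema as a whole string, but it cannot address anything inside that string. That is the limit a schema language would run into if its own referencing scheme needed to reach into another embedded schema.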

philsturgeon commented 5 years ago

So I'm saying that whole set of problems can be deferred until there's a real demand to address it.

Fully support kicking this can down the road. Hopefully it'll be kicked into infinity.

tedepstein commented 5 years ago

Could we please try to separate the technical challenges from the aesthetics of this idea?

@darrelmiller, on the technical concerns:

While inlining foreign schemas seems like low hanging fruit, I don't think it necessarily is. It introduces inconsistencies in handling different types of schemas. It introduces escaping issues. It raises the question of how future versions of JSON Schema should be rendered in YAML documents. We already have this issue with examples and the answers are not clear.

Questions:

Would you be able to point to some open issues and/or other resources so I can better understand where we've hit problems with inline examples?
Are the encoding/escaping problems not avoided by using literal-style multi-line strings in YAML? (i.e. value beginning with a | pipe character.)
Are the encoding/escaping problems likely to happen with text-based schema languages, or are they mainly a concern with non-text media types, like we'd expect to find in example values?
What are the questions about future versions of JSON Schema in YAML documents?

I can't say whether I think these are good reasons to defer support for inline alternative schemas, because I just don't have the background on those concerns yet.

But what I'm hearing is a kind of distaste for the idea. That we don't want to do it because it's messy, somehow force-fitting one language into a string value so we can shoehorn it into another language.

So, here are some examples of embedded languages in OpenAPI today:

Markdown Embedded in OpenAPI
HTML Embedded in Markdown
CSS Embedded in HTML

paths:
  /products:
    get:
      summary: Product Types
      description: |
        The Products endpoint returns information about the _Beamup_
        products offered at a given location. The response includes:
        * the display name 
        * the SKU
        * current pricing
        * product image

        This diagram shows the full data structure:
        <img src="https://api.beamup.io/diagram1.png" style="width: 78px;" />

CSV Embedded in OpenAPI

      responses:
        '200':
          description: A price table in CSV format.
          content:
            text/csv:
              example: | 
                Product,PriceTier,MinimumQuantity,Price
                P1234,Retail,1,43.95
                P1234,Wholesale,100,33.24
                P1234,Wholesale,1000,29.48
                P9877,Retail,1,8.34
                P9877,Wholesale,100,7.02
                P9877,Wholesale,500,6.81
                P9877,Wholesale,1000,6.6

Regular Expressions Embedded in a Schema

    Contact:
      type: object
      properties:
        firstName:
          type: string
        lastName:
          type: string
        homePhone: 
          type: string
          pattern: |-
            ((\(\d{3}\) ?)|(\d{3}-))?\d{3}-\d{4}
        customerID:
          type: string
          pattern: |-
            ^\d{3}-\d{2}-\d{4}$

... and here's what it might look like to embed protobuf schemas:

Protobuf Embedded in OpenAPI

components:
  schemas:          
    Person:
      type: object
      x-oas-draft-alternativeSchemas:
        - type: protobuf
          value: |           
            syntax = "proto2";
            package contacts;

            message Person {
              required string name = 1;
              required int32 id = 2;
              optional string email = 3;

              enum PhoneType {
                MOBILE = 0;
                HOME = 1;
                WORK = 2;
              }

              message PhoneNumber {
                required string number = 1;
                optional PhoneType type = 2 [default = HOME];
              }

              repeated PhoneNumber phones = 4;
            }

This doesn't feel so onerous to me. We've all gotten so used to the current affordances for language embedding (Markdown, HTML, example data, regex, etc.) that we don't even think of OpenAPI as a polyglot language. But it is, and it works just fine.

And BTW, "just fine" doesn't mean there aren't rough edges here and there. Try explaining to a new OpenAPI user why her line breaks aren't being preserved, when they might be removed by a folded YAML string, or it might be the Commonmark processor downstream. It ain't perfect. But it's still really good. I don't hear anyone saying we should externalize all the description property values because Markdown is a different language.

Besides, whatever our gut reaction to these things, current and proposed, what I really want to say is that we shouldn't allow our decisions to be so heavily influenced by gut reactions.

We have really smart people here with strong opinions, and I don't want to dismiss anyone. If the thought of embedding foreign schema languages makes you want to barf, that's a fine starting point for a discussion. But it shouldn't end there, because these decisions have important implications in what users and tool providers can and can't do.

Generally, OpenAPI property values that can be externalized can also be embedded inline. That's true for documentation (externalDocs), schemas, examples, and probably others. License Object is one exception; it can only take a URL, not the text content of the license.

Disallowing inline alternative schemas for substantive technical reasons might be necessary. But disallowing them because some people think it's yucky doesn't seem like the right thing to do.

adjenks commented 5 years ago

Just allow inlined schemas as string types, then you can parse it as whatever you want as long as you know what format it's in.

x-oas-draft-alternate-schema:
    type: xml-schema
    schema: "<hello-world/>"

philsturgeon commented 5 years ago

"Just"? Strings are more complex than you give credit.

Please, nobody has actually requested this feature, and we're all just typing over and over about supporting it whilst saying we don't need it. Most of us are developers and as such we like to try and solve problems, but we are spending time on a problem nobody has requested. If we had a project or a product manager in here they'd both be telling us to get on with it.

External is fine.

tedepstein commented 5 years ago

@philsturgeon, we have customers who want this for JSON Schema, possibly other languages. I also have plans to support an alternative schema language. In both cases, we need inline/embedded alternative schemas in order to provide a reasonably good UX for editing.

It sounds like @fmvilas and @adjenks are both saying they would want and expect this.

The error messages in your screenshots would be addressed by using literal multi-line strings, i.e. adding a pipe character after the schema: key, and putting the schema content on separate lines, indented:

    x-oas-draft-alternate-schema:
      type: xml-schema
      schema: |
        <hello-world some-attr="Hi" />
        <hello-world some-attr="isn't this great" />

Folded strings (> instead of |) work for this purpose too, and most OpenAPI specs that have substantial markdown documentation are already using folded or literal multi-line strings for description properties. So it's not like we'd be requiring users to do something exotic.

handrews commented 5 years ago

@tedepstein do you not support multi-file specs at all?

tedepstein commented 5 years ago

@handrews , I'm fully in favor of OpenAPI allowing multi-file specs, and our tools provide robust multi-file support. It's great to have that option.

But I am not in favor of forcing multi-file specs where they're not needed.

If an API designer wants to use full JSON Schema or some other schema language, barring any specific technical reason that prevents it, it should be their choice to keep the schemas inline, or in external files.

BTW, I'm also surprised that we're spending so much time on this, because I thought this would be a no-brainer. We allow inline values for everything else (except license), so of course, we'll allow them here. If they don't work for you, or for your choice of schema language, feel free to use the external___ property or $ref to keep them external.

I think we're spending a lot of time on this because so much of the discussion has been around the value judgments of whether a particular usage is good or bad, pretty or ugly, as opposed to whether there's actually any harm in allowing it, and allowing users and tool providers to make their own judgments and do what they want.

I just don't understand the visceral reaction that we're opening some kind of Pandora's box with inline schemas. Like it's so dangerous that we have to break from our usual pattern of allowing inline values, raise the spectre of runaway complexity, etc., when all we're proposing is to follow the pattern we've established with description, examples, and all of the $ref-enabled properties: allow them inline, or reference them externally.

As I see it, this isn't about "adding a feature." It's about removing an intentional restriction on the feature. And I'm not seeing a rational basis for most of this controversy.

philsturgeon commented 5 years ago

Just a quick reminder to folks in the thread that JSON Schema is not currently supported inline; OpenAPI Schema Objects (which are loosely based on JSON Schema) are supported inline. If you want to use JSON Schema you have to link to an external file through alternativeSchema.

We are not proposing we allow JSON Schema proper by shoving it into a YAML multi-line string, so I do not know why we would support other alternative schemas by shoving them into a YAML multi-line string.

Extra reminder: JSON does not support fancy > syntax for multi-line strings, so the examples I posted above would continue to fail for JSON users.

tedepstein commented 5 years ago

@philsturgeon ,

We are not proposing we allow JSON Schema proper by shoving it into a YAML multi-line string, so I do not know why we would support other alternative schemas by shoving them into a YAML multi-line string.

Objection to the word "shoving." We're not proposing to shove anything, any more than we're currently shoving multi-line markdown into description properties. See my comment here.

Extra reminder: JSON does not support fancy > syntax for multi-line strings, so the examples I posted above would continue to fail for JSON users.

Misleading. JSON can represent multi-line strings with character escaping, just as it currently does with embedded markdown, regular expressions, request and response examples. Embedded schemas would introduce absolutely no new JSON compatibility issues.
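
A small Python sketch of the escaping point, using only the standard `json` module. The `schema` key follows the inline example posted earlier in this thread and is not part of the proposal; the XSD content is made up for illustration.

```python
import json

# A multi-line XML schema containing double quotes and newlines.
xml_schema = (
    '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">\n'
    '  <xs:element name="hello-world" type="xs:string"/>\n'
    '</xs:schema>\n'
)

# Assumption: the inline schema sits under a 'schema' key.
entry = {"type": "xml-schema", "schema": xml_schema}

encoded = json.dumps(entry)    # newlines and quotes are escaped
decoded = json.loads(encoded)  # the original text comes back unchanged

print(encoded)
print(decoded["schema"] == xml_schema)
```

The encoded form is a single line and harder to read than a YAML literal block, but the round trip is lossless, which is the same situation as for multi-line description values in JSON today.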

handrews commented 5 years ago

@tedepstein the thing I'm focused on here is that part of this proposal has a solid consensus behind it, while the inline part clearly does not (never mind why or whether it is reasonable, there is clearly no consensus and it will take effort to work through this).

I am in favor of decisively including the part that we have consensus on and continuing to discuss inlining separately. If inlining makes it into 3.1, great!

But please let's move ahead with the part we agree on. It does not make inlining harder to add if we decide to do it in 3.1.

I have a number of concerns about inlining, but I would prefer to not continue to discuss the topic here. I will chime in if we split it off.

tedepstein commented 5 years ago

@handrews , I would like to give others a little time to respond.

IMO, this debate has gotten dysfunctional. For my part, I have tried to keep it clear and focused, but there is a lot of smoke here. I am not at all happy to have this issue decided, or deferred, because the discussion degenerated into pressure tactics, memes, and misinformation. For the record, this sucks.

I will feel so much better letting this go, or taking it into a new issue, if your concerns about inlining, or @darrelmiller's, or anyone else's, turn out to be significant issues that we had not considered. So whether we end up discussing it here or elsewhere, if you get a few minutes to summarize those concerns, I'd like to hear them.

handrews commented 5 years ago

@tedepstein I think I'll also wait on others before going further into the substance of this issue.

Regarding the process, I am sympathetic to your frustration. Part of my suggestion for splitting this out into a separate topic is that, having dealt with more than a few threads of this nature in the JSON Schema project, splitting has been one of the most effective ways to de-escalate arguments and make progress where we were able to do so.

In this particular case, I am very concerned that alternativeSchema will miss the 3.1 release because of a late-added feature request that is not at all essential to alternativeSchema's usefulness. It may be essential for your tooling, but there are many people who are waiting on this feature who do not need inline schemas.

If alternativeSchema gets punted out of 3.1 over the question of inline schemas, I'm going to be more than a little upset over that 😢

So I don't really want to get into the merits. I want to ensure that the core of the proposal moves forward, because it's extremely important to me that alternativeSchema be in 3.1. Obviously as someone driving the JSON Schema project, the single most important thing for me in OpenAPI is to be able to use actual JSON Schemas.

aowss commented 5 years ago

I completely agree with @handrews' comment : inlining can be considered as a separate request. Being able to use other schema languages is essential for some of us.

I am not a big fan of the syntax but that's a separate issue and I seem to be the only one with this opinion so feel free to ignore that.

adjenks commented 5 years ago

Not a problem with me, fork the discussion. I'm not one to call the shots on what your deadlines are and what has the greatest priority. I'm just here to tell you that in the context of my project code it makes no sense to externalize the schemas, so I would like to be able to inline it eventually. In the meantime I can make workarounds and things. Feel free to prioritize however you deem fit, and fork the discussion. I don't want to stall things, just want to inform you of my needs and I thought this was the place to do that. I really like this project, so thank you for your hard work.

aowss commented 5 years ago

Thanks for coming back to me.

I am very aware of anyOf in JSON Schema.

1) My concern was the addition of schema-like keywords in a property that is supposed to just import / include schemas. I feel that we are mixing concerns by adding this ability in this property. It’s probably the job of the imported schema to do that. => IMHO this property should only be used to include an external schema. Nothing more, nothing less

2) The concept of anyOf is very JSON Schema-specific. What does it mean to do that when the schemas are not JSON Schema ?

3) Did anyone express the need to include different schemas that are using the different schema languages ? => I am afraid that, as is the case with inlining non-JSON schemas in this property, we are trying to solve something bigger than the initial ask, i.e. include an external schema. In this specific case, I am not even sure that there is a real use case for that ( I might be wrong about this ).

In conclusion, I think that we should focus this feature on the ability to include an external schema and deal with other asks as separate features. The inclusion of external schemas is a priority for a lot of people. A lot of us want our schema system to be separate from our API description. This allows us to use the latest version of JSON Schema, for example, even though OpenAPI supports a different version.

Thanks again for coming back to me on this, Aowss

On Apr 3, 2019, at 12:32 PM, Henry Andrews notifications@github.com wrote:

@aowss I can perhaps address some of your syntax concerns:

This seems to complicate the syntax with the introduction of schema constructs, e.g. anyOf, within the schema element. This assumes that these schema languages can be combined using these constructs. This might seem true at first glance but I am not sure it is as obvious as it seems, e.g. the use of default in these different languages.

The semantics of JSON Schema's anyOf (and many other keywords) have been clarified in recent drafts. The keyword takes an array of subschemas, evaluates them according to their own rules, and logically ORs the validation results.

In the context of JSON Schema proper, the subschemas are expected to also be JSON Schemas, but really the evaluation model lets them be opaque. So the way anyOf is being used here is entirely consistent with JSON Schema.


cmheazel commented 5 years ago

@aowss I work with systems that return XML. The response undergoes syntax validation through an XML Schema document and valid value validation through a set of Schematron rules. Due to the complexity of Schematron, these rules are packaged in multiple Schematron files. So the answer to question 3 is yes.

Relequestual commented 5 years ago

Is this resolved by https://github.com/OAI/OpenAPI-Specification/pull/1270 ?
