Here are my initial thoughts:
At a high level, the new JSON schema is very flexible. The core spec defines the idea of a "vocabulary" - a set of properties which have a documented meaning in a JSON schema - and a "dialect" - a set of vocabularies which are supported.
The core spec defines a core vocabulary and two vocabularies for applying subschemas. The validation spec defines vocabularies for validating the structure and contents of a JSON document, a vocabulary for annotating a schema with metadata (like a description), and a dialect which includes all the required vocabularies of the core and validation specs.
The OpenAPI spec then defines its own vocabulary containing some extensions, and a dialect which requires the required JSON schema vocabularies and the OpenAPI vocabulary.
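To make that concrete, a dialect is declared by a meta-schema that lists its vocabularies with `$vocabulary`. A rough sketch (the `$id` is invented; the vocabulary URIs are the 2020-12 core, applicator and validation vocabularies):

```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://example.com/my-dialect",
  "$vocabulary": {
    "https://json-schema.org/draft/2020-12/vocab/core": true,
    "https://json-schema.org/draft/2020-12/vocab/applicator": true,
    "https://json-schema.org/draft/2020-12/vocab/validation": true
  }
}
```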
By default, all schemas use the dialect defined by OpenAPI, but they can also choose to use a different dialect by using the `$schema` property in their schema, or by setting `jsonSchemaDialect` at the top level, which sets the default dialect for all schema objects in the document.
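For example, a minimal sketch of an OpenAPI 3.1 document (the `Pet` schema name is just a placeholder) that sets the default dialect and then overrides it for one schema:

```json
{
  "openapi": "3.1.0",
  "info": { "title": "Example", "version": "1.0" },
  "jsonSchemaDialect": "https://spec.openapis.org/oas/3.1/dialect/base",
  "paths": {},
  "components": {
    "schemas": {
      "Pet": {
        "$schema": "https://json-schema.org/draft/2020-12/schema",
        "type": "object"
      }
    }
  }
}
```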
In theory, a user can write their OpenAPI document using any dialect of JSON schema they like, as long as they declare it in the document. If we want to be able to read any OpenAPI document the user might package in their application, our model would need to handle any arbitrary JSON document as a schema.
Here are the things that I think have changed in the schema between 3.0 and 3.1 that would need to be reflected in our model:
- `$schema` (string) - identifies the dialect in use for a schema
- `$comment` (string)
- `if`, `then`, `else` (schema) - if the object validates against the `if` schema, then it must also validate against the `then` schema, otherwise it must validate against the `else` schema
- `dependentSchemas` (object(propertyname -> schema)) - if the object has a property with the given name, then the object must validate against the schema
- `prefixItems` (array[schema]) - if the object is an array, the first item must validate against the first schema in `prefixItems`, the second must validate against the second schema, etc.
- `contains` (schema) - if the object is an array, at least one item in the array must match the schema
- `patternProperties` (object(regex -> schema)) - if a property name matches the regex, then the property value must validate against the schema
- `propertyNames` (schema) - each property name in the object must validate against the schema
- `unevaluatedItems` (schema) - each array item not matched by `prefixItems`, `items` or `contains` must validate against the schema
- `unevaluatedProperties` (schema) - each property value not matched by `properties`, `patternProperties` or `additionalProperties` must match the schema
  - Similar to `additionalProperties`, but `unevaluatedProperties` won't check properties which have been matched by a subschema applied with `allOf`, `oneOf`, `then`, `else` etc., whereas `additionalProperties` will.
- `$ref` - not mutually exclusive with other properties
- `additionalProperties` - must now be a schema; previously a boolean was also allowed (though a schema itself is now allowed to be a boolean)
- `exclusiveMinimum`, `exclusiveMaximum` - now numbers, previously booleans
- `readOnly`, `writeOnly` - now valid anywhere, previously only valid where the schema describes an object property
- `const` - specifies that the object must have a specific value
- `maxContains`, `minContains` (integer) - specifies that `contains` must match between `minContains` and `maxContains` items in the array
- `dependentRequired` (object(string -> array[string])) - describes property names that, if present, require certain other property names to also be present
- `contentEncoding` (string) - specifies that a string represents encoded binary data with the given encoding type (e.g. base64)
- `contentMediaType` (string) - specifies that the content of a string has the given media type
- `contentSchema` (schema) - if `contentMediaType` is a media type that maps into JSON Schema's data model, this property specifies a schema that the data in the string must conform to
- `type` - previously a string, now may also be an array of strings; `"null"` is now a valid value here
- `example` -> `examples` - more than one example is now allowed
- `nullable` - now expressed by including `"null"` in the array of valid types
- `true` or `false` values are now valid as schemas
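To make a few of these concrete, here is a sketch of a single 2020-12 schema (with invented property names) exercising several of the new keywords:

```json
{
  "type": ["object", "null"],
  "properties": {
    "format": { "const": "image" },
    "data": {
      "type": "string",
      "contentEncoding": "base64",
      "contentMediaType": "image/png"
    },
    "tags": {
      "type": "array",
      "prefixItems": [{ "type": "string" }],
      "contains": { "const": "public" },
      "maxContains": 1
    }
  },
  "if": { "required": ["data"] },
  "then": { "required": ["format"] },
  "unevaluatedProperties": false
}
```

Note that the `false` used for `unevaluatedProperties` is itself a schema, illustrating the last point above.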
The following fields are new in the core schema, but I'm not sure if they're relevant to its use in OpenAPI:
- `$id` - the canonical URI of a schema
- `$anchor` - this is used to name an element in a schema and reference it with this name using `$ref` elsewhere
- `$defs` - allows for defining parts of a schema for re-use later (similar to `components`)
- `$dynamicAnchor`, `$dynamicRef` - allows for schemas split across different documents to extend each other
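A sketch of how these fit together (the `$id` URI and property names are invented):

```json
{
  "$id": "https://example.com/schemas/order",
  "$defs": {
    "address": {
      "$anchor": "address",
      "type": "object",
      "properties": { "street": { "type": "string" } }
    }
  },
  "type": "object",
  "properties": {
    "billing": { "$ref": "#address" },
    "shipping": { "$ref": "#/$defs/address" }
  }
}
```

Here `billing` references the `address` subschema by its anchor name, while `shipping` reaches the same subschema by JSON pointer.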
I don't think users of OpenAPI are likely to want to use these fields, but we need to accommodate them anyway: they're valid, so we must be able to read them from a user-supplied document.
However, since OpenAPI permits arbitrary dialects, our model may need to allow arbitrary JSON as the schema anyway. If we include a mechanism to allow arbitrary additional properties, we could say that these properties can only be set through that mechanism.
A few things I noticed while trying to implement this for smallrye:
The semantics of `nullable` vs. having `null` in the list of types are subtly different. While `nullable` can be `true`, `false` or unset, `null` is either in the list of types or it isn't. Also, `nullable = true` has no effect if `type` is not set, whereas you could previously do this for an optional field:
```json
{
  "nullable": true,
  "allOf": [
    { "$ref": "..." }
  ]
}
```
With the new schema this requires an `anyOf`.
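Something along these lines should be equivalent in 3.1 (a sketch; the `$ref` target is left elided as in the example above):

```json
{
  "anyOf": [
    { "type": "null" },
    { "$ref": "..." }
  ]
}
```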
The `Extensible` interface is now fairly clear throughout that it only works on properties beginning with `x-`. `addExtension` can be interpreted as adding `x-` to the start of any key which doesn't already have that prefix. Having `Extensible` work this way allows for consistent handling of types which support extensions, so I think we should leave it as it is and create a new interface for "freeform" objects.
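For example (hypothetical keys, assuming the prefixing behaviour described above), calling `addExtension("vendor-feature", true)` and `addExtension("x-other", 1)` would serialize along these lines:

```json
{
  "x-vendor-feature": true,
  "x-other": 1
}
```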
Wouldn't having `null` absent from the `type` array be semantically equivalent to `nullable: false` or undefined/unset?
For the second issue, I think omitting `type` entirely implies any type, including `null`. Basically, the value is unconstrained.
Yes, you can get a semantically equivalent end result, but it makes it difficult to keep existing code which uses the interface working.
At the moment, I've deprecated `setNullable` and the single-argument `setType`, thinking that they could both be implemented by manipulating the list of types. However, you can't quite do that consistently. If the user calls `setNullable` but never calls `setType`, you don't want to end up with `"type": ["null"]` in the schema, since that forbids anything that's not `null`. However, if they do call `setType(OBJECT)`, you do want `"type": ["null", "object"]`, so you would need to store a flag somewhere to say that nullable has been requested.
Whether that's an issue or not depends on how you implement freeform objects. I tried making the `Schema` implementation a thin wrapper around a JSON object, but then you have nowhere to store data except within the JSON.
OAS 3.1.0 changes from supporting most of an older JSON schema draft to supporting the whole of the JSON Schema 2020-12 draft Core and Validation.
We need to work out:

- `Schema` model class to support the 3.1.0 schema
- `@Schema` and related annotations in order to allow users to take advantage of the new functionality that is available

Tasks:

- `Schema` model
- `@Schema` annotation to expose new and changed parts of the schema model #601
- `@Schema` attributes #601