require `$schema` in schemas

gregsdennis commented 1 year ago

Resolves #1420

gregsdennis commented 1 year ago

There are four rules for determining the dialect

This PR is a realization of the discussion we had in #1420. In that discussion, we identified three prioritized steps:

The $schema keyword

The schema media type parameter (if the schema was retrieved over HTTP) (optional)

The user provides a default through an implementation's API in some way. (optional)
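As a rough illustration of that precedence, here is a minimal Python sketch. The function name `resolve_dialect` and its parameters are hypothetical, not taken from the spec or from any implementation, and it assumes the media type parameter value has already been extracted from the HTTP response.

```python
from typing import Optional


def resolve_dialect(
    schema: object,
    media_type_schema: Optional[str] = None,
    user_default: Optional[str] = None,
) -> str:
    """Pick a dialect URI using the three prioritized steps (hypothetical helper)."""
    # 1. The $schema keyword in the schema itself wins.
    if isinstance(schema, dict) and isinstance(schema.get("$schema"), str):
        return schema["$schema"]
    # 2. Otherwise, the media type parameter, if the schema came over HTTP.
    if media_type_schema is not None:
        return media_type_schema
    # 3. Otherwise, a default supplied by the user through the implementation's API.
    if user_default is not None:
        return user_default
    # Nothing identifies the dialect; refusing here is an assumption of this sketch.
    raise ValueError("cannot determine the schema's dialect")
```

Raising when no step applies is only one possible choice; the discussion does not settle what an implementation should do in that case.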

What is this fourth rule?

jdesrosiers commented 1 year ago

The fourth rule is an embedded schema inheriting from its parent context. I think it belongs after the media-type parameter and before the user provided default.

I had forgotten about that one when I wrote that issue, but it's definitely part of the process. If you'd rather address that in a separate PR, I'm ok with that.

gregsdennis commented 1 year ago

I disagree then. And this needs to be discussed in that issue, not in this PR.

jdesrosiers commented 1 year ago

Can't wait to see the tests for this get merged! =D

I'm not sure much, if any, of this is testable with our current setup.

Relequestual commented 1 year ago

Are we expecting that implementations need to have a special evaluation mode when validating schemas, and somehow switch to a different metaschema when $schema keywords are encountered?

Yup.

We don't spell that out in the spec in any way, and instead assume that when a schema is being evaluated against its metaschema, we treat it just like any other data instance.

First, we have to look at the definition of a Compound Schema Document...

A Compound Schema Document is defined as a JSON document (sometimes called a "bundled" schema) which has multiple embedded JSON Schema Resources bundled into the same document to ease transportation. - Core 9.3

I would argue that this defines any OpenAPI definition as a compound schema document. While the examples in the appendix section are JSON Schemas, a compound document doesn't have to be one.

Then, the validation section...

Given that a Compound Schema Document may have embedded resources which identify as using different dialects, these documents SHOULD NOT be validated by applying a meta-schema to the Compound Schema Document as an instance. It is RECOMMENDED that an alternate validation process be provided in order to validate Schema Documents. Each Schema Resource SHOULD be separately validated against its associated meta-schema. - Core 9.3.3

Given this, we might want to tighten up the language around meta-schema usage for validation so that the target is explicitly a "schema resource", to make this clearer? I don't know. Open to thoughts.
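To make the "schema resource" target concrete, here is a hedged Python sketch of the first half of the per-resource process recommended in Core 9.3.3: splitting a compound document into resources and pairing each with a dialect. The name `iter_resources` is hypothetical. Two assumptions are baked in: any nested object with a string `$id` is treated as an embedded resource (a real implementation would only look in subschema locations), and a resource without its own `$schema` falls back to the enclosing resource's dialect.

```python
from typing import Any, Iterator, Optional, Tuple


def iter_resources(
    node: Any, inherited: Optional[str] = None, is_root: bool = True
) -> Iterator[Tuple[Any, Optional[str]]]:
    """Yield (resource, dialect) pairs for a compound schema document.

    Simplification: every nested object with a string "$id" is treated as an
    embedded resource, even inside keywords like "const" where it is only data.
    """
    if isinstance(node, dict):
        if is_root or isinstance(node.get("$id"), str):
            # A resource boundary: its own $schema, if any, selects the dialect.
            declared = node.get("$schema")
            inherited = declared if isinstance(declared, str) else inherited
            yield node, inherited
        for value in node.values():
            yield from iter_resources(value, inherited, is_root=False)
    elif isinstance(node, list):
        for item in node:
            yield from iter_resources(item, inherited, is_root=False)
```

Each yielded pair could then be validated against the meta-schema for its dialect, with nested resources excluded from the parent's validation; that step is left out of the sketch.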

gregsdennis commented 1 year ago

Are we expecting that implementations need to have a special evaluation mode when validating schemas, and somehow switch to a different metaschema when $schema keywords are encountered? - @karenetheridge

Yup. - @Relequestual

I'm with Karen. I think a meta-schema evaluating a schema should be no different than a schema validating an instance. I don't think it's too hard to achieve this.

Still, let's work that out separately from this PR, please.

I don't know if it's been discussed elsewhere but there is still a problem with allowing $schema keywords to appear in non-root locations in the document -- because we also require that the schema must validate against its metaschema, and this requirement can be violated if a contained sub-schema uses a different dialect than the document root that conflicts in some way. - @karenetheridge

I would argue that this defines any OpenAPI definition as a compound schema document. - @Relequestual

I don't think the comment was about OpenAPI, but rather any time a $schema is used in a subschema resource, which is allowed currently.

If the parent schema declares 2020-12 (which disallows array-form items), and it contains a schema resource that declares draft 7 and uses array-form items (valid in draft 7), then a pure meta-schema evaluation of the root schema would fail.
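A small Python sketch of that scenario follows. The `$id` values and the `items_ok` helper are made up for illustration; `items_ok` is a hand-written stand-in for a single meta-schema constraint (the allowed shape of `items`), not a real meta-schema evaluation.

```python
DRAFT_2020_12 = "https://json-schema.org/draft/2020-12/schema"
DRAFT_07 = "http://json-schema.org/draft-07/schema#"

# Root declares 2020-12; the embedded resource declares draft 7
# and uses array-form "items".
root = {
    "$schema": DRAFT_2020_12,
    "$id": "https://example.com/root",
    "$defs": {
        "legacy": {
            "$schema": DRAFT_07,
            "$id": "https://example.com/legacy",
            "items": [{"type": "string"}, {"type": "number"}],
        }
    },
}


def items_ok(schema: dict, dialect: str) -> bool:
    """Stand-in for one meta-schema constraint: the allowed shape of "items"."""
    items = schema.get("items", {})
    if dialect == DRAFT_2020_12:
        # 2020-12: "items" must be a single schema (object or boolean).
        return isinstance(items, (dict, bool))
    # Draft 7: a single schema or an array of schemas is allowed.
    return isinstance(items, (dict, bool, list))


legacy = root["$defs"]["legacy"]
# Treating the whole document as one 2020-12 instance applies the 2020-12 rule:
print(items_ok(legacy, DRAFT_2020_12))  # False
# Validating the embedded resource against its own declared dialect:
print(items_ok(legacy, legacy["$schema"]))  # True
```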

I don't know that the Compound Document definition applies to the above scenario like it does for bundling. I think it could be argued both ways.

I agree that there's no real way to handle this right now. Let's take another PR for that and consider this PR as a mere iteration.

These two topics combined intersect somewhat with this discussion about $schema in instances.

Relequestual commented 1 year ago

I don't think the comment was about OpenAPI, but rather any time a $schema is used in a subschema resource, which is allowed currently.

If the schema is embedded in a larger document (such as OpenAPI)... - One of @karenetheridge's comments.

However, I agree, it applies more broadly.

Still, let's work that out separately from this PR, please.

I agree. As a reference, we discussed this at quite some length already: https://github.com/json-schema-org/json-schema-spec/issues/936

karenetheridge commented 1 year ago

I agree. As a reference, we discussed this at quite some length already: https://github.com/json-schema-org/json-schema-spec/issues/936

That issue is talking about how to evaluate with a schema when there are embedded $schema keywords that change the dialect midway through. The problem I raised is how to evaluate that schema as a data instance against its metaschema -- because the metaschema you're evaluating with is changing midway through the evaluation, but the normal evaluation process doesn't know anything about switching schemas depending on what it finds in the data.

gregsdennis commented 1 year ago

@karenetheridge I get what you're saying. I'll create an issue for it for us to discuss.

json-schema-org / json-schema-spec

require `$schema` in schemas #1434