hyperjump-io / json-schema

JSON Schema Validation, Annotation, and Bundling. Supports Draft 04, 06, 07, 2019-09, 2020-12, OpenAPI 3.0, and OpenAPI 3.1
https://json-schema.hyperjump.io/
MIT License

Does not consider schema valid if top level `$ref` #39

Closed. dhruvkb closed this issue 11 months ago.

dhruvkb commented 11 months ago

Hyperjump does not consider the following schema as valid.

{
  "$ref": "#/definitions/Coords",
  "$schema": "http://json-schema.org/draft-07/schema#",
  "definitions": {
    "Coords": {
      "type": "object",
      "properties": {
        "lat": { "type": "number" },
        "long": { "type": "number" }
      },
      "required": ["lat", "long"]
    }
  }
}

It shows the following error: [Screenshot: Hyperjump - JSON Schema Validator]

Other validators consider it valid and can use it to validate other JSON documents.

[Screenshot: JSON Schema Lint | JSON Schema Validator]

jdesrosiers commented 11 months ago

This example doesn't use $ref correctly. The behavior is at best undefined when used this way and you shouldn't expect it to work the same across implementations. This is how $ref is defined in draft-07.

An object schema with a "$ref" property MUST be interpreted as a "$ref" reference. The value of the "$ref" property MUST be a URI Reference. Resolved against the current URI base, it identifies the URI of a schema to use. All other properties in a "$ref" object MUST be ignored.

I highlighted the important parts. An object with a $ref property is to be interpreted as a "reference", not a JSON "object". Think of it as a distinct type in addition to the standard JSON types. If we weren't constrained strictly to valid JSON syntax in JSON Schema, a reference type would have distinct syntax to differentiate it from an object. For example, it could be expressed using angle brackets, <"$ref": "#/definitions/Coords">. Replacing those curly brackets with angle brackets in your mind when looking at references can help in understanding this concept.

So, your example schema isn't technically a schema, but a reference with a bunch of extra data that's supposed to be ignored. Even I don't handle this 100% correctly because I don't ignore the $schema and $id keywords in a case like this.
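
To make the "all other properties MUST be ignored" rule concrete, here is a minimal Python sketch of what a strictly literal draft-07 reader would keep from the schema in this issue. The helper name `strict_draft07_view` is hypothetical and is not part of any validator's API; it only illustrates the quoted rule.

```python
import json


def strict_draft07_view(schema):
    """Return what a strictly literal draft-07 reader keeps from `schema`.

    Hypothetical helper for illustration only: if "$ref" is present,
    every sibling keyword is dropped, mirroring the rule that all other
    properties in a "$ref" object MUST be ignored.
    """
    if isinstance(schema, dict) and "$ref" in schema:
        return {"$ref": schema["$ref"]}
    return schema


# The schema from the issue report.
issue_schema = {
    "$ref": "#/definitions/Coords",
    "$schema": "http://json-schema.org/draft-07/schema#",
    "definitions": {
        "Coords": {
            "type": "object",
            "properties": {
                "lat": {"type": "number"},
                "long": {"type": "number"},
            },
            "required": ["lat", "long"],
        }
    },
}

print(json.dumps(strict_draft07_view(issue_schema)))
# {"$ref": "#/definitions/Coords"}
```

Under that literal reading, "definitions" is part of the ignored data, so the reference points at something that was never read as a schema.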

Assuming that an implementation accepts a reference in place of a schema and tries to evaluate it, the JSON Pointer /definitions/Coords points to a location within a reference type. Since we said that a reference is not technically a JSON object, it's ambiguous whether a JSON Pointer should be able to index into it like an object or if it should be treated as a scalar like a string. Most implementations allow pointers into references, but this implementation treats it as a scalar.
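
The two interpretations can be sketched with a small RFC 6901 JSON Pointer resolver in Python. This is an illustration only, not the code of this or any other implementation; the `refs_are_scalar` switch is a hypothetical parameter invented for the example.

```python
def resolve_pointer(document, pointer, refs_are_scalar=False):
    """Resolve an RFC 6901 JSON Pointer against `document`.

    `refs_are_scalar` is a hypothetical switch: when True, an object
    containing "$ref" is treated as an opaque reference that cannot be
    indexed into; when False, it is indexed like any other object.
    """
    if pointer == "":
        return document
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON Pointer must start with '/'")

    current = document
    for raw_token in pointer[1:].split("/"):
        # RFC 6901 unescaping: "~1" becomes "/", then "~0" becomes "~".
        token = raw_token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, dict):
            if refs_are_scalar and "$ref" in current:
                raise LookupError("cannot index into a reference")
            current = current[token]
        elif isinstance(current, list):
            current = current[int(token)]
        else:
            raise LookupError(f"cannot index into a scalar with {token!r}")
    return current


schema = {
    "$ref": "#/definitions/Coords",
    "definitions": {"Coords": {"type": "object"}},
}

# Permissive reading: the reference is indexed like a plain object.
print(resolve_pointer(schema, "/definitions/Coords"))  # {'type': 'object'}

# Strict reading: the root is a reference, so the pointer cannot enter it.
try:
    resolve_pointer(schema, "/definitions/Coords", refs_are_scalar=True)
except LookupError as error:
    print(error)  # cannot index into a reference
```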

Assuming that the implementation does allow you to index into a reference type, technically anything you point to should be treated as a schema and that's why this example works in some implementations in simple situations. However, you should never reference something that isn't otherwise interpreted as a schema because there can be ambiguities regarding the identifier and/or dialect of the referenced schema.

So, successfully evaluating your example schema relies on multiple undefined behaviors that you can't rely on being consistent across implementations. The correct way to do what the example schema is trying to do is to wrap your $ref in an allOf ("allOf": [{ "$ref": "#/definitions/Coords" }]). This way you're using $ref the way it's intended to be used and don't have to rely on undefined (and therefore inconsistent) behaviors.
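
For a schema that already has a top-level $ref, the allOf wrapping can be applied mechanically. The following Python sketch shows one way to do it; `wrap_top_level_ref` is a hypothetical helper written for this example, not something provided by any library mentioned in this thread.

```python
import json


def wrap_top_level_ref(schema):
    """Return a copy of `schema` with a top-level "$ref" moved into "allOf".

    Hypothetical helper for illustration. Sibling keywords such as
    "$schema" and "definitions" are kept next to "allOf", so they are no
    longer properties of a "$ref" object.
    """
    if "$ref" not in schema:
        return dict(schema)
    wrapped = {key: value for key, value in schema.items() if key != "$ref"}
    # Keep any existing "allOf" entries after the wrapped reference.
    wrapped["allOf"] = [{"$ref": schema["$ref"]}] + list(wrapped.get("allOf", []))
    return wrapped


# The schema from the issue report.
issue_schema = {
    "$ref": "#/definitions/Coords",
    "$schema": "http://json-schema.org/draft-07/schema#",
    "definitions": {
        "Coords": {
            "type": "object",
            "properties": {
                "lat": {"type": "number"},
                "long": {"type": "number"},
            },
            "required": ["lat", "long"],
        }
    },
}

fixed_schema = wrap_top_level_ref(issue_schema)
print(json.dumps(fixed_schema, indent=2))
```

The result has "$schema", "definitions", and "allOf" at the top level, and the only object that contains "$ref" contains nothing else.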

dhruvkb commented 11 months ago

Your argument makes sense. I don't fully understand the JSON Schema specification regarding URIs, so I didn't know that this behaviour was undefined.

In my case the schema is autogenerated by https://github.com/vega/ts-json-schema-generator so I'll use the --no-top-ref flag to keep the top level object expanded.