IBM / jsonsubschema

Tool for checking whether a JSON schema is a subschema of another JSON schema.
Apache License 2.0

If RHS has optional fields, it isn't considered a subschema, inhibiting schema evolution #24

Open crdoconnor opened 1 month ago

crdoconnor commented 1 month ago

We are currently looking for a tool that can check schema compatibility.

One of the cases I used to test this software checks that a producer with additional unused fields works with a consumer that lacks them. This is a common schema evolution pattern, and it works with jsonsubschema. :heavy_check_mark:

However, a second schema evolution pattern we follow is for the consumer to add a field as optional and then to start using it as soon as the producer starts supplying it.

This does not seem to work with jsonsubschema.

Example consumer.json:

{
    "properties": {
        "required": {
            "title": "Required",
            "type": "string"
        },
        "onlyusefieldifpresent": {
            "anyOf": [
                {
                    "type": "string"
                },
                {
                    "type": "null"
                }
            ],
            "default": null,
            "title": "onlyusefieldifpresent"
        }
    },
    "required": [
        "required"
    ],
    "title": "Consumer",
    "type": "object"
}

and this producer.json:

{
    "properties": {
        "required": {
            "title": "Required",
            "type": "string"
        },
        "unusedfield": {
            "title": "Unusedoption",
            "type": "string"
        }
    },
    "required": [
        "required",
        "unusedfield"
    ],
    "title": "Producer",
    "type": "object"
}

Running jsonsubschema on these two files gives:

>>> jsonsubschema producer.json consumer.json
LHS <: RHS False

Is this intentional behavior? Technically I suppose producer.json is not a subschema of consumer.json, but it does supply every field that consumer.json requires.

hirzel commented 1 month ago

Yes, this is expected behavior. Neither is a subschema of the other, because both accept instances that the other disallows. See example code below.

import jsonschema

producer_schema = {
    "properties": {
        "required": {
            "title": "Required",
            "type": "string"
        },
        "unusedfield": {
            "title": "Unusedoption",
            "type": "string"
        }
    },
    "required": [
        "required",
        "unusedfield"
    ],
    "title": "Producer",
    "type": "object"
}

consumer_schema = {
    "properties": {
        "required": {
            "title": "Required",
            "type": "string"
        },
        "onlyusefieldifpresent": {
            "anyOf": [
                {
                    "type": "string"
                },
                {
                    "type": "null"
                }
            ],
            "default": None,
            "title": "onlyusefieldifpresent"
        }
    },
    "required": [
        "required"
    ],
    "title": "Consumer",
    "type": "object"
}

producer_instance = {
    "required": "a",
    "unusedfield": "b",
    "onlyusefieldifpresent": 123,
}

consumer_instance = {
    "required": "a",
    "unusedfield": 123,
    "onlyusefieldifpresent": "c",
}

jsonschema.validate(instance=producer_instance, schema=producer_schema)
jsonschema.validate(instance=consumer_instance, schema=consumer_schema)

print("----- checking producer_instance against consumer_schema -----")
try:
    jsonschema.validate(instance=producer_instance, schema=consumer_schema)
except jsonschema.ValidationError as e:
    print(e)
else:
    assert False, "expected ValidationError"

print("----- checking consumer_instance against producer_schema -----")
try:
    jsonschema.validate(instance=consumer_instance, schema=producer_schema)
except jsonschema.ValidationError as e:
    print(e)
else:
    assert False, "expected ValidationError"

This prints:

----- checking producer_instance against consumer_schema -----
123 is not valid under any of the given schemas

Failed validating 'anyOf' in schema['properties']['onlyusefieldifpresent']:
    {'anyOf': [{'type': 'string'}, {'type': 'null'}],
     'default': None,
     'title': 'onlyusefieldifpresent'}

On instance['onlyusefieldifpresent']:
    123
----- checking consumer_instance against producer_schema -----
123 is not of type 'string'

Failed validating 'type' in schema['properties']['unusedfield']:
    {'title': 'Unusedoption', 'type': 'string'}

On instance['unusedfield']:
    123
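
A possible follow-up, not part of the original reply: the first counterexample works only because the producer schema, as written, accepts properties it does not declare (JSON Schema allows additional properties by default). The sketch below assumes a hypothetical variant of the producer schema with "additionalProperties": False and shows that the counterexample instance is then rejected by the producer. To keep the sketch dependency-free, accepts() is a small hand-written stand-in for jsonschema.validate that covers only the keywords used in this thread, and the "title" and "default" annotations are omitted. Whether jsonsubschema then reports LHS <: RHS True for the closed variant has not been verified here.

```python
def accepts(schema, value):
    """Tiny stand-in validator: handles only type (string/null/object),
    anyOf, properties, required and additionalProperties."""
    if "anyOf" in schema and not any(accepts(s, value) for s in schema["anyOf"]):
        return False
    expected = schema.get("type")
    if expected == "string" and not isinstance(value, str):
        return False
    if expected == "null" and value is not None:
        return False
    if expected == "object" and not isinstance(value, dict):
        return False
    if isinstance(value, dict):
        props = schema.get("properties", {})
        # Every required property must be present.
        if any(name not in value for name in schema.get("required", [])):
            return False
        for name, item in value.items():
            if name in props:
                if not accepts(props[name], item):
                    return False
            elif schema.get("additionalProperties", True) is False:
                # Undeclared property in a closed schema.
                return False
    return True


# Producer schema from the issue (annotations dropped).
producer_open = {
    "type": "object",
    "properties": {
        "required": {"type": "string"},
        "unusedfield": {"type": "string"},
    },
    "required": ["required", "unusedfield"],
}

# Hypothetical variant: same schema, but undeclared properties are forbidden.
producer_closed = {**producer_open, "additionalProperties": False}

# Consumer schema from the issue (annotations dropped).
consumer = {
    "type": "object",
    "properties": {
        "required": {"type": "string"},
        "onlyusefieldifpresent": {"anyOf": [{"type": "string"}, {"type": "null"}]},
    },
    "required": ["required"],
}

# The producer_instance counterexample from the reply above.
counterexample = {"required": "a", "unusedfield": "b", "onlyusefieldifpresent": 123}
# An instance a real producer would typically emit.
typical = {"required": "a", "unusedfield": "b"}

open_accepts = accepts(producer_open, counterexample)      # open producer allows it
closed_accepts = accepts(producer_closed, counterexample)  # closed producer does not
consumer_accepts = accepts(consumer, counterexample)       # consumer rejects 123

print(open_accepts, closed_accepts, consumer_accepts)
print(accepts(producer_closed, typical), accepts(consumer, typical))
```

Closing the producer only addresses one direction. The consumer schema still accepts instances the producer rejects, which is fine for the producer-to-consumer compatibility check the question is about.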