mokkabonna / json-schema-merge-allof

Simplify your schema by combining allOf
97 stars 23 forks source link

Unit tests with patterns #44

Open miniHive opened 8 months ago

miniHive commented 8 months ago

The unit tests create patterns like (?=bar)(?=foo), e.g. in the test that merges contains.

Yet this resulting pattern seems to be unsatisfiable.

The following instances are valid w.r.t. the original schema but not the merged schema:

[
  {
    "name": "foo-and-bar"
  }
]
[
    {
        "name": "foo"
    },
    {
        "name": "bar"
    }
]
mokkabonna commented 8 months ago

Thanks for reporting!

I must have not tested this properly: https://github.com/mokkabonna/json-schema-merge-allof/pull/2

Should have left it as it was.

mokkabonna commented 8 months ago

The first case could be solved by merging it in the same way, but by adding .*like this: (?=.*bar)(?=.*foo). See https://stackoverflow.com/a/470602/94394

I am tempted however to make it unresolved, not 100% sure that will work for all possible regex expressions. Also it could get rather hard to read if many regex patterns.

Thoughts?


But the second case is not really related to pattern, but to contains.

I don't really think I can combine allOf from multiple contains as I do now. As it is the combination of all the keywords in one contains that decides if it matches. So if we move some subkeywords from different contains together then it might not match a single item all in one. Even though seperately they would match, like you described above.

However inside each contains, there we can merge allOf as normal. This might even apply to some other keywords, will need to check.

miniHive commented 7 months ago

Regarding merging the patterns: We have built a tool that can check two (Draft6) schemas for equivalence (https://jsonschematool-370416.oa.r.appspot.com/). However, our tool does not support lookaheads in pattern, so there we would rewrite the pattern to "(foo.bar)|(bar.foo)".

Nevertheless, one cannot and-merge two distinct "contains" into one "contains", because (exists x.P(x) And exists x.Q(x)) is not equivalent to (exists x. (P(x) And Q(x)).

In running your unit tests through our tool, we found several cases where the input schema and merged schema are not equivalent, for instance here https://github.com/mokkabonna/json-schema-merge-allof/blob/ea2e48ee34415022de5a50c236eb4793a943ad11/test/specs/properties.spec.js#L512

The instance

{
    "000": true
}

is valid w.r.t. one schema but not the other.

We are happy to share further details.

mokkabonna commented 7 months ago

Yeah you are right.

This is probably an issue with oneOf/anyOf as well. I must have been a bit quick in my thinking when doing those. Although looking at it now it is pretty obvious those can not be merged as they depend on 1 or 1 or more schemas to match IN THAT allOf schema.

I have myself not used those much, at least not with conflicting values, so have not discovered it.

I have started the process of cleaning up this. What is most important for me is the lossless nature of the merging process. Better to leave the resulting schema a bit more verbose than it could be than to change the validation result.

See #45

Still there is more to do, as your example proves, for complex resolvers like properties/patternproperties.

mokkabonna commented 7 months ago

If you do have readily available more test cases of instances validating differently I would appreaciate to have a look at them yes if that is possible. :)

miniHive commented 7 months ago

We are happy to share our findings, and we would also be very interested in your feedback, whether we unserstood the intention of your tests.

(1) https://github.com/mokkabonna/json-schema-merge-allof/blob/ea2e48ee34415022de5a50c236eb4793a943ad11/test/specs/properties.spec.js#L512

Valid w.r.t. merged schema, but not the original schema:

{
    "000": false
}

(2)

https://github.com/mokkabonna/json-schema-merge-allof/blob/ea2e48ee34415022de5a50c236eb4793a943ad11/test/specs/properties.spec.js#L212

Valid w.r.t. original schema, but not the merged schema:

{
    "bar0": 123
}

(3)

https://github.com/mokkabonna/json-schema-merge-allof/blob/ea2e48ee34415022de5a50c236eb4793a943ad11/test/specs/index.spec.js#L586

Valid w.r.t. original schema, but not the merged schema:

"\u0000\u0000\u0000\u0000\u0000"

(4) https://github.com/mokkabonna/json-schema-merge-allof/blob/ea2e48ee34415022de5a50c236eb4793a943ad11/test/specs/index.spec.js#L1022

(This one we already discussed)

(5)

https://github.com/mokkabonna/json-schema-merge-allof/blob/ea2e48ee34415022de5a50c236eb4793a943ad11/test/specs/index.spec.js#L917

Valid w.r.t. the original schema, but not the merged schema:

{
    "123": 123,
    "abc": 123,
    "name": "\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"
}

Vice versa for the instance true.

(6)

https://github.com/mokkabonna/json-schema-merge-allof/blob/ea2e48ee34415022de5a50c236eb4793a943ad11/test/specs/index.spec.js#L1678

I assume the merged schema is this (I might be mistaken):

{
  "properties": {
    "person": {
      "$ref": "#/definitions/person"
    }
  },
  "definitions": {
    "person": {
      "properties": {
        "name": {
          "minLength": 8,
          "maxLength": 10,
          "type": "string"
        },
        "prop1": {
          "minLength": 7
        }
      }
    }
  }
}

Valid w.r.t. merged schema, but not the original schema:

{
    "person": {
        "child": {
            "prop1": ""
        }
    }
}

(7)

https://github.com/mokkabonna/json-schema-merge-allof/blob/ea2e48ee34415022de5a50c236eb4793a943ad11/test/specs/index.spec.js#L863

From the code, this seema seems to be considered unsatisfiable? But maybe I misunderstood the intent. But the instance false is actually valid.

(8)

https://github.com/mokkabonna/json-schema-merge-allof/blob/ea2e48ee34415022de5a50c236eb4793a943ad11/test/specs/index.spec.js#L917

Valid w.r.t. the original schema, but not the merged schema:

{
    "123": 123,
    "abc": 123,
    "name": "\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"
}

Vice versa for the instance true.

AlimovSV commented 6 months ago

Yet another case which looks is not valid: the source schema

{
  "type": "array",
  "items": {
    "type": "object",
    "properties": {
      "type": {
        "type": "string"
      }
    }
  },
  "allOf": [
    {
      "contains": {
        "type": "object",
        "properties": {
          "type": {
            "pattern": "1"
          }
        }
      }
    },
    {
      "contains": {
        "type": "object",
        "properties": {
          "type": {
            "pattern": "2"
          }
        }
      }
    }
  ]
}

merged schema looks absolutely wrong:

{
  "type": "array",
  "contains": {
    "type": "object",
    "properties": {
      "type": {
        "pattern": "(?=1)(?=2)"
      }
    }
  },
  "items": {
    "type": "object",
    "properties": {
      "type": {
        "type": "string"
      }
    }
  }
}

from my memory, v0.6 merged as:

{
  "type": "array",
  "items": {
    "type": "object",
    "properties": {
      "type": {
        "type": "string"
      }
    }
  },
  "contains": {
    "type": "object",
    "properties": {
      "type": {
        "allOf": [
          {
            "pattern": "1"
          },
          {
            "pattern": "2"
          }
        ]
      }
    }
  }
}