everit-org / json-schema

JSON Schema validator for java, based on the org.json API
Apache License 2.0
859 stars 281 forks source link

StackOverflowError on combined schemas #17

Closed ujibang closed 8 years ago

ujibang commented 8 years ago

I have defined two schemas. One extends the other using allOf and $ref

When I try to validate a json document I get StackOverflowError exception.

I have done some debugging and even if I pass the child schema to the SchemaLoader.load() method, the SchemaClient seems loads child again via http. I think it should load parent instead.

Test document passed to validate:

{n:1, s: "test" }

parent schema

{
  "_$schema":"http://json-schema.org/draft-04/schema#"
  "id":"http://127.0.0.1:8080/test/_schemas/parent#",
  "type":"object",
  "properties":{ "n":{ "type":"number" } },
  "required":[ "n" ]
}

child schema

{
  "_$schema":"http://json-schema.org/draft-04/schema#",
  "id":"http://127.0.0.1:8080/test/_schemas/child#",
  "allOf":[ 
    { "_$ref":"parent" },
    {
      "_$schema":"http://json-schema.org/draft-04/schema#",
      "required":[ "s" ],
      "type":"object",
      "properties":{ "s":{ "type":"string" } }
    }
  ],
  "additionalProperties":false
}

Exception

java.lang.StackOverflowError: null
    at org.everit.json.schema.CombinedSchema.succeeds(CombinedSchema.java:155) ~[org.everit.json.schema-1.1.0.jar:na]
    at org.everit.json.schema.CombinedSchema.lambda$validate$9(CombinedSchema.java:165) ~[org.everit.json.schema-1.1.0.jar:na]
    at org.everit.json.schema.CombinedSchema$$Lambda$90/207956499.test(Unknown Source) ~[na:na]
    at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:174) ~[na:1.8.0]
    at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948) ~[na:1.8.0]
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:512) ~[na:1.8.0]
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:502) ~[na:1.8.0]
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) ~[na:1.8.0]
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[na:1.8.0]
    at java.util.stream.LongPipeline.reduce(LongPipeline.java:438) ~[na:1.8.0]
    at java.util.stream.LongPipeline.sum(LongPipeline.java:396) ~[na:1.8.0]
    at java.util.stream.ReferencePipeline.count(ReferencePipeline.java:526) ~[na:1.8.0]
    at org.everit.json.schema.CombinedSchema.validate(CombinedSchema.java:166) ~[org.everit.json.schema-1.1.0.jar:na]
    at org.everit.json.schema.ReferenceSchema.validate(ReferenceSchema.java:61) ~[org.everit.json.schema-1.1.0.jar:na]
    at org.everit.json.schema.CombinedSchema.succeeds(CombinedSchema.java:155) ~[org.everit.json.schema-1.1.0.jar:na]
    at org.everit.json.schema.CombinedSchema.lambda$validate$9(CombinedSchema.java:165) ~[org.everit.json.schema-1.1.0.jar:na]
    at org.everit.json.schema.CombinedSchema$$Lambda$90/207956499.test(Unknown Source) ~[na:na]
...
erosb commented 8 years ago

Hello @ujibang !

It seems to be 2 more or less separate problems. If CombinedSchema.succeeds()gets even executed then the schema loading itself succeeded, so the StackOverflowError has nothing to do with the schema loading process. It would be very helpful if you could send me inputs (schema and example document to be validated) for simply reproducing this bug.

"even if I pass the child schema to the SchemaLoader.load() method, the SchemaClient seems loads child again via http"

I'm not sure what is the expected behavior in your case. I'll dig into the relevant sections of the schema spec 7.2.2 - 7.3.3 .

Edit: So your resolution scope is "http://127.0.0.1:8080/yaala/_schemas/child#" and you refer to an other schema using the string "parent", and

We are both wrong :) To be honest I never liked the remote schema handing of the json schema specification, it is quite blurry, illogical and hard to understand.

I'll look after this issue in the code.

ujibang commented 8 years ago

Hi @erosb

I read here that the id property "declares a base URL against which $ref URLs are resolved".

And this is the parent URI of the id. Section 7.2.2 mandates:

When an id is encountered, an implementation MUST resolve this id against the most immediate parent scope. The resolved URI will be the new resolution scope for this subschema and all its children, until another id is encountered

In my case it should be http://127.0.0.1:8080/yaala/_schemas just like the "schema2" and "alsonested" examples refer to the id parent URI.

If $ref value starts with # then it should resolve against the current schema.

I'm new to json schema so I might be wrong..this is what I understood so far.

Regarding the test data to reproduce the issue, you can use what I pasted in the first comment.

ujibang commented 8 years ago

Hi @erosb,

I just thought that if resolving child leads to load child, than after loaded it, it resolves child leading to load it again and so on recursively...

erosb commented 8 years ago

Yes, that is exactly what happens. I figured that out too, today :) I will try to fix this as soon as my time permits. The bugfix will be released (most probably) on next week.

ujibang commented 8 years ago

Hi @erosb

any news about this issue?

Thanks

erosb commented 8 years ago

@ujibang sort of, started working on it yesterday, here: https://github.com/erosb/json-schema-validator/commits/master

erosb commented 8 years ago

Hello @ujibang , I think I got through this. I've just pushed my latest version to this repo. But I don't want to initiate a new release without your clarification, because it got quite messy. So I'd like to ask you to check if it is ok for you. Please clone my repo, build it locally (using mvn clean install) then add the org.everit.json:org.everit.json.schema:1.1.1-PREVIEW version to your project.

I think the current implementation conforms to the spec section 7.2.2, but the resolution scope alteration tests in the test suite failed, therefore I removed that test from the build. It seems to me that testcase is against the specification, but I'm not absolutely sure about it. I also checked some other implementations, but

To sum up, I think my current implementation conforms to the spec, but I'd be very thankful if you could clarify it.

ujibang commented 8 years ago

Hi @erosb

thanks for the update. I tried to build the project from your repo but it fails with the following error.

Any advice?

json-schema-validator $ (master) mvn clean install
...
[INFO] -------------------------------------------------------------
[ERROR] COMPILATION ERROR : 
[INFO] -------------------------------------------------------------
[ERROR] /Users/uji/development/json-schema-validator/core/src/main/java/org/everit/json/schema/loader/SchemaLoader.java:[435,15] incompatible types: inferred type does not conform to upper bound(s)
    inferred: ?
    upper bound(s): java.lang.Object
[INFO] 1 error
....
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
...
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project org.everit.json.schema: Compilation failure
[ERROR] /Users/uji/development/json-schema-validator/core/src/main/java/org/everit/json/schema/loader/SchemaLoader.java:[435,15] incompatible types: inferred type does not conform to upper bound(s)
[ERROR] inferred: ?
[ERROR] upper bound(s): java.lang.Object
...
erosb commented 8 years ago

@ujibang I also ran into this type inference issue previously. It compiles successfully with jdk1.8.0_45 , but it failed with previous versions.

ujibang commented 8 years ago

updating jdk fixed the build issue, going to try 1.1.1-PREVIEW with my use case

ujibang commented 8 years ago

Sorry, still StackOverflowError!

java.lang.StackOverflowError: null
    at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:174) ~[na:1.8.0_66]
    at java.util.Spliterators$ArraySpliterator.tryAdvance(Spliterators.java:958) ~[na:1.8.0_66]
    at java.util.stream.ReferencePipeline.forEachWithCancel(ReferencePipeline.java:126) ~[na:1.8.0_66]
    at java.util.stream.AbstractPipeline.copyIntoWithCancel(AbstractPipeline.java:498) ~[na:1.8.0_66]
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:485) ~[na:1.8.0_66]
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471) ~[na:1.8.0_66]
    at java.util.stream.FindOps$FindOp.evaluateSequential(FindOps.java:152) ~[na:1.8.0_66]
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[na:1.8.0_66]
    at java.util.stream.ReferencePipeline.findFirst(ReferencePipeline.java:464) ~[na:1.8.0_66]
    at org.everit.json.schema.ObjectSchema.testAdditionalProperties(ObjectSchema.java:279) ~[restheart.jar:1.2.0-SNAPSHOT]
    at org.everit.json.schema.ObjectSchema.validate(ObjectSchema.java:388) ~[restheart.jar:1.2.0-SNAPSHOT]
    at org.everit.json.schema.CombinedSchema.succeeds(CombinedSchema.java:155) ~[restheart.jar:1.2.0-SNAPSHOT]
    at org.everit.json.schema.CombinedSchema.lambda$validate$38(CombinedSchema.java:165) ~[restheart.jar:1.2.0-SNAPSHOT]
    at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:174) ~[na:1.8.0_66]
    at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948) ~[na:1.8.0_66]
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) ~[na:1.8.0_66]
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471) ~[na:1.8.0_66]
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) ~[na:1.8.0_66]
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[na:1.8.0_66]
    at java.util.stream.LongPipeline.reduce(LongPipeline.java:438) ~[na:1.8.0_66]
    at java.util.stream.LongPipeline.sum(LongPipeline.java:396) ~[na:1.8.0_66]
    at java.util.stream.ReferencePipeline.count(ReferencePipeline.java:526) ~[na:1.8.0_66]
    at org.everit.json.schema.CombinedSchema.validate(CombinedSchema.java:166) ~[restheart.jar:1.2.0-SNAPSHOT]
    at org.everit.json.schema.ReferenceSchema.validate(ReferenceSchema.java:61) ~[restheart.jar:1.2.0-SNAPSHOT]
    at org.everit.json.schema.CombinedSchema.succeeds(CombinedSchema.java:155) ~[restheart.jar:1.2.0-SNAPSHOT]
    at org.everit.json.schema.CombinedSchema.lambda$validate$38(CombinedSchema.java:165) ~[restheart.jar:1.2.0-SNAPSHOT]
    at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:174) ~[na:1.8.0_66]
    at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1374) ~[na:1.8.0_66]
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) ~[na:1.8.0_66]
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471) ~[na:1.8.0_66]
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) ~[na:1.8.0_66]
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[na:1.8.0_66]
    at java.util.stream.LongPipeline.reduce(LongPipeline.java:438) ~[na:1.8.0_66]
    at java.util.stream.LongPipeline.sum(LongPipeline.java:396) ~[na:1.8.0_66]
    at java.util.stream.ReferencePipeline.count(ReferencePipeline.java:526) ~[na:1.8.0_66]
    at org.everit.json.schema.CombinedSchema.validate(CombinedSchema.java:166) ~[restheart.jar:1.2.0-SNAPSHOT]
    at org.everit.json.schema.CombinedSchema.succeeds(CombinedSchema.java:155) ~[restheart.jar:1.2.0-SNAPSHOT]
    at org.everit.json.schema.CombinedSchema.lambda$validate$38(CombinedSchema.java:165) ~[restheart.jar:1.2.0-SNAPSHOT]
    at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:174) ~[na:1.8.0_66]
    at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948) ~[na:1.8.0_66]
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) ~[na:1.8.0_66]
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471) ~[na:1.8.0_66]
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) ~[na:1.8.0_66]
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[na:1.8.0_66]
    at java.util.stream.LongPipeline.reduce(LongPipeline.java:438) ~[na:1.8.0_66]
    at java.util.stream.LongPipeline.sum(LongPipeline.java:396) ~[na:1.8.0_66]
    at java.util.stream.ReferencePipeline.count(ReferencePipeline.java:526) ~[na:1.8.0_66]
    at org.everit.json.schema.CombinedSchema.validate(CombinedSchema.java:166) ~[restheart.jar:1.2.0-SNAPSHOT]
    at org.everit.json.schema.ReferenceSchema.validate(ReferenceSchema.java:61) ~[restheart.jar:1.2.0-SNAPSHOT]
    at org.everit.json.schema.CombinedSchema.succeeds(CombinedSchema.java:155) ~[restheart.jar:1.2.0-SNAPSHOT]
    at org.everit.json.schema.CombinedSchema.lambda$validate$38(CombinedSchema.java:165) ~[restheart.jar:1.2.0-SNAPSHOT]
    at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:174) ~[na:1.8.0_66]
    at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1374) ~[na:1.8.0_66]
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) ~[na:1.8.0_66]
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471) ~[na:1.8.0_66]
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) ~[na:1.8.0_66]
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[na:1.8.0_66]
    at java.util.stream.LongPipeline.reduce(LongPipeline.java:438) ~[na:1.8.0_66]
    at java.util.stream.LongPipeline.sum(LongPipeline.java:396) ~[na:1.8.0_66]
    at java.util.stream.ReferencePipeline.count(ReferencePipeline.java:526) ~[na:1.8.0_66]
    at org.everit.json.schema.CombinedSchema.validate(CombinedSchema.java:166) ~[restheart.jar:1.2.0-SNAPSHOT]
    at org.everit.json.schema.CombinedSchema.succeeds(CombinedSchema.java:155) ~[restheart.jar:1.2.0-SNAPSHOT]
    at org.everit.json.schema.CombinedSchema.lambda$validate$38(CombinedSchema.java:165) ~[restheart.jar:1.2.0-SNAPSHOT]
ujibang commented 8 years ago

Hi @erosb

I have a custom SchemaClient so I logged the calls to the public InputStream get(String uri) method.

In my case I have a single call for the child schema. child has a $ref to parent, however this gets never loaded.

Hope this helps.

The child schema follows. _id and _etag are restheart stuff, also the id has a special format to refer to its schema store.

{
  "_id": "child",
  "$schema": "http://json-schema.org/draft-04/schema#",
  "allOf": [
    {
      "$ref": "parent"
    },
    {
      "$schema": "http://json-schema.org/draft-04/schema#",
      "required": [
        "s"
      ],
      "type": "object",
      "properties": {
        "s": {
          "type": "string"
        }
      }
    }
  ],
  "id": "schema://test/child#",
  "_etag": {
    "$oid": "5693ad9b2d174c61b41801b4"
  }
}
ujibang commented 8 years ago

I tried another test with the parent definition embedded in the same schema document.

and it works as expected.

The issue seems to arise when the _$ref_erenced schema should be loaded via SchemaClient

The schema is:

{
    "$schema": "http://json-schema.org/draft-04/schema#",
    "id": "schema://yaala/child#",
    "parent": {
        "n": {
            "type": "number"
        }
    },
    "allOf": [
        {
            "$ref": "#/parent"
        },
        {
            "type": "object",
            "properties": {
                "s": {
                    "type": "string"
                }
            },
            "required": [
                "s"
            ]
        }
    ]
}
erosb commented 8 years ago

@ujibang I've added an integration test with the original schema you reported, see in https://github.com/erosb/json-schema-validator/commit/d36081dd4b1b1f2d6451ec9a2fd0aef3f4ee528e

Now the $ref resolution looks fine for me there:

And the parent schema is loaded from http://localhost:1234/parent, that is fine, no StackOverflow occured. Now the next problem is that I had to remove "additionalProperties": false from yaala-child.json because it caused a validation failure.

It is also unclear for me what is the expected behavior in this case. The validation spec about additionalProperties 5.4.4 says that "In this case, validation of the instance depends on the property set of "properties" and "patternProperties".", not mentioning what should happen if these are nested under an allOf/anyOf/oneOf combined schema. There is no guidance if those subschemas should be merged in some way before validation, and also the test suite doesn't cover these cases either. I think that most probably the "additionalProperties" has been creating only object schemas and combined schemas weren't taken into account.

I understand that your usecase is valid to properly (and strictly) express inheritance by applying "additionalProperties" : false on an allOf schema, but due to the lack of specification about this case, I would more prefer not fixing it.

ujibang commented 8 years ago

@erosb ok to remove additionalProperties from the test schema

I'm currently checking why my tests are failing.....

the only difference I see is the format of the id, where in my case it starts with "schema://" rather than a valid protocol...

could this lead to an issue in your opinion?

ujibang commented 8 years ago

I changed the id to http://localhost:1234/test/_schemas/child# and now I don't get the StackOverflowError anymore.

However I don't think the $ref to parent should resolve to http://localhost:1234/parent but to http://localhost:1234/test/_schemas/parent

in 7.2.3

When an implementation encounters the "schema1" reference, it resolves it against the most immediate parent scope

And the most immediate parent scope in this case is IMO http://localhost:1234/test/_schemas

@erosb what it your opinion about this?

erosb commented 8 years ago

@ujibang

The following examples can be found in the specification (including the one you pointed out above):

Example source parent scope new segment resolved URI
Section 7.2.2 http://x.y.z/rootschema.json# (id) otherschema.json http://x.y.z/otherschema.json#
Section 7.2.2 http://x.y.z/otherschema.json# (id) t/inner.json#a http://x.y.z/t/inner.json#a
Section 7.2.3 http://my.site/myschema# ($ref) schema1 http://my.site/schema1#

Based on these examples I believe that the current implementation is correct (though the specification itself is very illogical, and the way you try to handle it would be much more straightforward).

So the thing causing confusion here is it isn't clear what does it mean to resolve against the most immediate parent scope. Based on the examples it turned out for me that if

then this should be resolved to http://x.y.z/aPathButNotAFragment instead of http://x.y.z/originalPath/aPathButNotAFragment

I hope it is more clear now.

ujibang commented 8 years ago

@erosb

I see what you mean. The problem in the specification is that all the examples use a one level id URL, i.e. the schema resource name are directly under the root.

Given that we find in the specification:

We need to understand what to do if the schema is not directly under the root:

Should it resolve to http://x.y.z/otherschema.json or http://x.y.z/sub/otherschema.json?

To be honest I find the latter much better even because the former would make impossible to have schemas with file name in different folders.

I also have found here the following example about the id property that makes clear that the resolving against the most immediate parent scope means use the parent URL of the schema id.

=> http://foo.bar/schemas/person.json

a JSON schema validation library would fetch person.json from http://foo.bar/schemas/person.json, even if address.json was loaded from the local filesystem.

erosb commented 8 years ago

@ujibang you convinced me :) I changed it in 02f8c32c81d431d2f6fc4b685ffec9840ccb7f5c Please make a final testing, then we will go on with the release tomorrow.

ujibang commented 8 years ago

great!

I was making some testing with ReferenceResolver as well.

I successfully tried changing handlePathIdAttr() in org.everit.json.schema.loader.internal.ReferenceResolver to

private String handlePathIdAttr() {
        try {
            return new URL(new URL(parentScope), encounteredSegment)
                    .toString();
        } catch (MalformedURLException ex) {
            return concat();
        }
    }

thanks

ujibang commented 8 years ago

I confirm that it work in my use case.

thanks again for the great support

erosb commented 8 years ago

Okey, all looks fine for me. both mvn clean install and mvn javadoc:javadoc ran without any errors. @balazs-zsoldos please release version 1.1.1 to maven central, thank you.

balazs-zsoldos commented 8 years ago

Sure?

[INFO] /tmp/json-schema-master/core/src/main/java/org/everit/json/schema/CombinedSchema.java: /tmp/json-schema-master/core/src/main/java/org/everit/json/schema/CombinedSchema.java uses or overrides a deprecated API.                     
[INFO] /tmp/json-schema-master/core/src/main/java/org/everit/json/schema/CombinedSchema.java: Recompile with -Xlint:deprecation for details.                                                                                                
[INFO] -------------------------------------------------------------                                                                                                                                                                        
[ERROR] COMPILATION ERROR :                                                                                                                                                                                                                 
[INFO] -------------------------------------------------------------                                                                                                                                                                        
[ERROR] /tmp/json-schema-master/core/src/main/java/org/everit/json/schema/loader/SchemaLoader.java:[435,15] incompatible types: inferred type does not conform to upper bound(s)                                                            
    inferred: ?                                                                                                                                                                                                                             
    upper bound(s): java.lang.Object                                                                                                                                                                                                        
[INFO] 1 error
erosb commented 8 years ago

@balazs-zsoldos it compiles with jdk1.8.0_45 or later

balazs-zsoldos commented 8 years ago

@erosb Release is on Maven Central