jhthorsen / json-validator

:cop: Validate data against a JSON schema
https://metacpan.org/release/JSON-Validator

json reference validation #65

Closed ckongEbi closed 7 years ago

ckongEbi commented 7 years ago

Hi, is a JSON reference validated? When we run the validator, it does not complain even though the reference does not exist.

https://github.com/opentargets/json_schema/blob/master/src/evidence/literature_mining.json#L27

https://raw.githubusercontent.com/opentargets/json_schema/1.2.5/src/evidence/base.json#base_evidence/definitions/single_lit_reference

"base_evidence" doesn't exist: https://github.com/opentargets/json_schema/blob/master/src/evidence/base.json#L80 "$ref": "#/definitions/single_lit_reference"

Thanks! ck

jhthorsen commented 7 years ago

I don't understand how that's possible. Please provide a small test. This example is way too big for me to wrap my head around. Including refs from external documents is tested, but maybe a failing test is missing.

ckongEbi commented 7 years ago

Hi, I ran the script, and it didn't complain about the missing reference. Might I be missing something? Thanks!

perl json_schema_validator.pl test.txt \
  https://raw.githubusercontent.com/opentargets/json_schema/master/src/literature_mining.json
3 evidence PASSED schema validation!

Script here: https://github.com/opentargets/json_schema/blob/master/scripts/json_schema_validator.pl

Example json: test.txt

jhthorsen commented 7 years ago

That's not a small test case. This is a small test case: https://github.com/jhthorsen/json-validator/blob/master/t/spec/with-relative-ref.json

jhthorsen commented 7 years ago

Just ran your script and got the errors below, but I have no idea if that's related to your issue or not. Again: Please provide a smaller test case. You can see examples here:

$ perl json_schema_validator.pl test.txt https://raw.githubusercontent.com/opentargets/json_schema/master/src/literature_mining.json
Validation error on evidence line 1 :   $VAR1 = bless( {
                 'path' => '/',
                 'message' => 'allOf failed: allOf failed: oneOf failed: Properties not allowed: value. Properties not allowed: value. Properties not allowed: value. Properties not allowed: value.'
               }, 'JSON::Validator::Error' );
Validation error on evidence line 2 :   $VAR1 = bless( {
                 'path' => '/',
                 'message' => 'allOf failed: allOf failed: oneOf failed: Properties not allowed: value. Properties not allowed: value. Properties not allowed: value. Properties not allowed: value.'
               }, 'JSON::Validator::Error' );
Exited with 2 evidence line(s) with errors of the total 3

I'm not saying there's no bug, I'm just saying I won't go through your spec or write a test that relies on your online spec.

mkarmona commented 7 years ago

@jhthorsen please let me shed some light on the problem. Let us define a JSON schema, shown below, named with-relative-ref.json:

{
  "type": "object",
  "properties": {
    "age": { "$ref": "https://raw.githubusercontent.com/jhthorsen/json-validator/master/t/spec/person.json#/definitions/ages" }
  }
}

Please note #definitions/ages. It was done on purpose. Here is my own Python validator script:

import sys
import jsonschema as jss
import json

schema = 'with-relative-ref.json'

def build_validator(schema):
    with open(schema) as fh:
        js_schema = json.load(fh)

    validator = jss.validators.validator_for(js_schema)
    return validator(schema=js_schema)

def main(filename, *args):
    v4validator = build_validator(schema)

    example = json.load(open(filename))
    errors = [str(e) for e in v4validator.iter_errors(example)]

    print('\n'.join(errors))

if __name__ == '__main__':
    main(*(sys.argv[1:]))

I got this exception using this JSON example file: { "age": 12 }

Traceback (most recent call last):
  File "test.py", line 24, in <module>
    main(*(sys.argv[1:]))
  File "test.py", line 19, in main
    errors = [str(e) for e in v4validator.iter_errors(example)]
  File "/home/mkarmona/.virtualenvs/data_pipeline_refactor/local/lib/python2.7/site-packages/jsonschema/validators.py", line 105, in iter_errors
    for error in errors:
  File "/home/mkarmona/.virtualenvs/data_pipeline_refactor/local/lib/python2.7/site-packages/jsonschema/_validators.py", line 304, in properties_draft4
    schema_path=property,
  File "/home/mkarmona/.virtualenvs/data_pipeline_refactor/local/lib/python2.7/site-packages/jsonschema/validators.py", line 121, in descend
    for error in self.iter_errors(instance, schema):
  File "/home/mkarmona/.virtualenvs/data_pipeline_refactor/local/lib/python2.7/site-packages/jsonschema/validators.py", line 105, in iter_errors
    for error in errors:
  File "/home/mkarmona/.virtualenvs/data_pipeline_refactor/local/lib/python2.7/site-packages/jsonschema/_validators.py", line 212, in ref
    scope, resolved = validator.resolver.resolve(ref)
  File "/home/mkarmona/.virtualenvs/data_pipeline_refactor/local/lib/python2.7/site-packages/jsonschema/validators.py", line 375, in resolve
    return url, self._remote_cache(url)
  File "/home/mkarmona/.virtualenvs/data_pipeline_refactor/local/lib/python2.7/site-packages/functools32/functools32.py", line 400, in wrapper
    result = user_function(*args, **kwds)
  File "/home/mkarmona/.virtualenvs/data_pipeline_refactor/local/lib/python2.7/site-packages/jsonschema/validators.py", line 387, in resolve_from_url
    return self.resolve_fragment(document, fragment)
  File "/home/mkarmona/.virtualenvs/data_pipeline_refactor/local/lib/python2.7/site-packages/jsonschema/validators.py", line 421, in resolve_fragment
    "Unresolvable JSON pointer: %r" % fragment
jsonschema.exceptions.RefResolutionError: Unresolvable JSON pointer: u'definitions/ages'

If I correct the typo, I get a clean pass and everything is OK. May I suggest you test this, just to double-check that we are not wrong in thinking your code silences this sort of error? Thank you very much in advance!
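
For anyone who wants to reproduce the check without either validator, here is a minimal sketch of how the fragment of a `$ref` can be resolved strictly as a JSON Pointer, so that a pointer to a missing key raises instead of passing silently. It is an illustration only, not the code of jsonschema or JSON::Validator. The function name `resolve_fragment` is borrowed from the traceback above, but the body is my own, and the `person` document is invented because the real person.json is not shown in this thread.

```python
def resolve_fragment(document, fragment):
    """Resolve a JSON Pointer (the part of a $ref after '#') against a document.

    Raises LookupError when the pointer is malformed or points at nothing.
    """
    if fragment == "":
        return document  # an empty pointer refers to the whole document
    if not fragment.startswith("/"):
        raise LookupError("Unresolvable JSON pointer: %r" % fragment)

    node = document
    for token in fragment[1:].split("/"):
        # Undo the JSON Pointer escapes: ~1 means '/', ~0 means '~'
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(node, dict) and token in node:
            node = node[token]
        elif isinstance(node, list) and token.isdigit() and int(token) < len(node):
            node = node[int(token)]
        else:
            raise LookupError("Unresolvable JSON pointer: %r" % fragment)
    return node


# Invented stand-in for the referenced document (an assumption, not the real file).
person = {"definitions": {"age": {"type": "integer"}}}

print(resolve_fragment(person, "/definitions/age"))

try:
    resolve_fragment(person, "/definitions/ages")  # key does not exist
except LookupError as err:
    print("error: %s" % err)
```

A validator that resolves references this way has to surface the broken pointer, either when the schema is loaded or when the referenced rule is first used, which is the behaviour we expected to see.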

mkarmona commented 7 years ago

Also, could you please change the tag to 'bug'? I tried to report a bug, and it is not a mere question. Best

wbazant commented 7 years ago

@jhthorsen I tried to shrink my original test case. I got somewhere, but not too far: I can only reproduce the problem with two objects that have ids and reference each other remotely.

I am using the validator I wrote. I thought about making a PR with a failing test, and I can if you strongly prefer that, but:

#get our validator
curl https://raw.githubusercontent.com/opentargets/json_schema/master/scripts/json_schema_validator.pl > /var/tmp/json_schema_validator.pl

An example that reproduces the problem, and shows that ids are somehow important, because similar code differing just by an id fails:

#good:
echo '{"method":{"okay_property":""}}' | perl /var/tmp/json_schema_validator.pl --schema https://gist.githubusercontent.com/wbazant/9bf17535426bdb46395034734d0a10c5/raw/71591b64af38349d9e8612290329a0ba71cd1fd7/madness.json

#errors out:    /method: Properties not allowed: okay_property.
echo '{"method":{"okay_property":""}}' | perl /var/tmp/json_schema_validator.pl --schema https://gist.githubusercontent.com/wbazant/9bf17535426bdb46395034734d0a10c5/raw/65270d25b61b0072bd07a1957b867e2e6521a261/madness.json

# The two schema versions host two different references
diff <( curl https://gist.githubusercontent.com/wbazant/9bf17535426bdb46395034734d0a10c5/raw/71591b64af38349d9e8612290329a0ba71cd1fd7/madness.json ) <( curl https://gist.githubusercontent.com/wbazant/9bf17535426bdb46395034734d0a10c5/raw/65270d25b61b0072bd07a1957b867e2e6521a261/madness.json )
#<             "$ref":"https://gist.githubusercontent.com/wbazant/2cd464750322f065d329c4f75b1fb799/raw/59214a2d967242efe7518ce25b0b444511a4f639/method.json"
#>             "$ref":"https://gist.githubusercontent.com/wbazant/2cd464750322f065d329c4f75b1fb799/raw/b69217cb0222ae36cce9e16a515d252d5c35e545/method.json"

# the references only differ by an id
diff <( curl https://gist.githubusercontent.com/wbazant/2cd464750322f065d329c4f75b1fb799/raw/59214a2d967242efe7518ce25b0b444511a4f639/method.json) <( curl https://gist.githubusercontent.com/wbazant/2cd464750322f065d329c4f75b1fb799/raw/b69217cb0222ae36cce9e16a515d252d5c35e545/method.json) 
# >   "id":"#method",

wbazant commented 7 years ago

I've read the test suite more and realised I can do the same with unit tests and more sanity.

These two tests reproduce the problem: one succeeds and one fails.

I will have a go at diving in and trying to fix the bug if I'm not constrained by time/skill.

wbazant commented 7 years ago

I debugged some more and found the reason for this - in _store there is

$namespace = Mojo::URL->new($namespace)->fragment(undef)->to_string;

which seems to try to make www.example.com# and www.example.com resolve to the same URI. Unfortunately it is incorrect: in particular, it causes all namespaces matching #.* to be keyed with the empty string. Commenting it out breaks some other tests.

The commits in https://github.com/wbazant/json-validator/commits/devel try to show what the problem is and where. I will probably not try to fix this any time soon; it seems a bit daunting and I don't understand the original solution well enough, but maybe someone who reads this will! :)
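
To make the collision concrete, here is a small Python sketch of the effect. It is only an analogue: `strip_fragment` is a simplified stand-in for the `fragment(undef)` call above, and `store` and the `cache` dict are hypothetical, not the module's actual `_store`.

```python
def strip_fragment(namespace):
    """Drop everything from the first '#' onward (rough analogue of fragment(undef))."""
    return namespace.split("#", 1)[0]


def store(cache, namespace, schema):
    """Hypothetical schema cache keyed by the fragment-less namespace."""
    cache[strip_fragment(namespace)] = schema


cache = {}
store(cache, "http://www.example.com#", {"title": "remote"})
store(cache, "#method", {"title": "method"})
store(cache, "#other", {"title": "other"})

# The intended effect: a trailing '#' does not create a separate key.
print(strip_fragment("http://www.example.com#") == strip_fragment("http://www.example.com"))

# The side effect: every id of the form '#...' is keyed with the empty string,
# so the later schema silently replaces the earlier one.
print(sorted(cache))
print(cache[""])
```

In this sketch, a lookup of `#method` would get back whichever `#...` schema was stored last, which would fit the symptom that adding an id to method.json changes which properties are allowed.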

jhthorsen commented 7 years ago

I think #80 fixes this. Please try it out.