python-jsonschema / jsonschema

An implementation of the JSON Schema specification for Python
https://python-jsonschema.readthedocs.io
MIT License

JSONDecodeError / RefResolutionError validating the issued field of a CSL item #447

Closed dhimmel closed 5 years ago

dhimmel commented 5 years ago

Greetings, I'm new to using this library and would like to validate objects against the Citation Style Language schema as part of https://github.com/greenelab/manubot/issues/47.

In the code snippet below with Python v3.6.6 and jsonschema v2.6.0, I'm getting a JSONDecodeError / RefResolutionError. If I remove the issued object of my CSL data, there are no errors. issued is defined in the schema as:

            "issued": {
                "$ref": "date-variable" 
            },

I saw in https://github.com/Julian/jsonschema/issues/343 and https://github.com/Julian/jsonschema/pull/371 that there may be some complexities with internal reference resolution. Not sure if that is the issue here.

Code snippet

import json
import jsonschema
import requests

# Load instance
csl = '''\
[
  {
    "issued": {
      "date-parts": [
        [
          "2014",
          "07",
          "14"
        ]
      ]
    },
    "type": "report",
    "id": "nkzHjOdS"
  }
]
'''
csl = json.loads(csl)

# Load schema
url = 'https://github.com/citation-style-language/schema/raw/4846e02f0a775a8272819204379a4f8d7f45c16c/csl-data.json'
schema = requests.get(url).json()
jsonschema.Draft3Validator.check_schema(schema)
validator = jsonschema.Draft3Validator(schema)

# Validate
errors = list(validator.iter_errors(csl))

Stacktrace / error

```python-traceback
---------------------------------------------------------------------------
JSONDecodeError                           Traceback (most recent call last)
~/anaconda3/lib/python3.6/site-packages/jsonschema/validators.py in resolve_from_url(self, url)
    382 try:
--> 383 document = self.resolve_remote(url)
    384 except Exception as exc:

~/anaconda3/lib/python3.6/site-packages/jsonschema/validators.py in resolve_remote(self, uri)
    468 if callable(requests.Response.json):
--> 469 result = requests.get(uri).json()
    470 else:

~/anaconda3/lib/python3.6/site-packages/requests/models.py in json(self, **kwargs)
    895 pass
--> 896 return complexjson.loads(self.text, **kwargs)
    897

~/anaconda3/lib/python3.6/json/__init__.py in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    353 parse_constant is None and object_pairs_hook is None and not kw):
--> 354 return _default_decoder.decode(s)
    355 if cls is None:

~/anaconda3/lib/python3.6/json/decoder.py in decode(self, s, _w)
    338 """
--> 339 obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    340 end = _w(s, end).end()

~/anaconda3/lib/python3.6/json/decoder.py in raw_decode(self, s, idx)
    356 except StopIteration as err:
--> 357 raise JSONDecodeError("Expecting value", s, err.value) from None
    358 return obj, end

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

RefResolutionError                        Traceback (most recent call last)
in ()
     29
     30 # Validate
---> 31 errors = list(validator.iter_errors(csl))

~/anaconda3/lib/python3.6/site-packages/jsonschema/validators.py in iter_errors(self, instance, _schema)
    103
    104 errors = validator(self, v, instance, _schema) or ()
--> 105 for error in errors:
    106 # set details if not already set by the called fn
    107 error._set(

~/anaconda3/lib/python3.6/site-packages/jsonschema/_validators.py in items(validator, items, instance, schema)
     53 if validator.is_type(items, "object"):
     54 for index, item in enumerate(instance):
---> 55 for error in validator.descend(item, items, path=index):
     56 yield error
     57 else:

~/anaconda3/lib/python3.6/site-packages/jsonschema/validators.py in descend(self, instance, schema, path, schema_path)
    119
    120 def descend(self, instance, schema, path=None, schema_path=None):
--> 121 for error in self.iter_errors(instance, schema):
    122 if path is not None:
    123 error.path.appendleft(path)

~/anaconda3/lib/python3.6/site-packages/jsonschema/validators.py in iter_errors(self, instance, _schema)
    103
    104 errors = validator(self, v, instance, _schema) or ()
--> 105 for error in errors:
    106 # set details if not already set by the called fn
    107 error._set(

~/anaconda3/lib/python3.6/site-packages/jsonschema/_validators.py in properties_draft3(validator, properties, instance, schema)
    251 subschema,
    252 path=property,
--> 253 schema_path=property,
    254 ):
    255 yield error

~/anaconda3/lib/python3.6/site-packages/jsonschema/validators.py in descend(self, instance, schema, path, schema_path)
    119
    120 def descend(self, instance, schema, path=None, schema_path=None):
--> 121 for error in self.iter_errors(instance, schema):
    122 if path is not None:
    123 error.path.appendleft(path)

~/anaconda3/lib/python3.6/site-packages/jsonschema/validators.py in iter_errors(self, instance, _schema)
    103
    104 errors = validator(self, v, instance, _schema) or ()
--> 105 for error in errors:
    106 # set details if not already set by the called fn
    107 error._set(

~/anaconda3/lib/python3.6/site-packages/jsonschema/_validators.py in ref(validator, ref, instance, schema)
    210 yield error
    211 else:
--> 212 scope, resolved = validator.resolver.resolve(ref)
    213 validator.resolver.push_scope(scope)
    214

~/anaconda3/lib/python3.6/site-packages/jsonschema/validators.py in resolve(self, ref)
    373 def resolve(self, ref):
    374 url = self._urljoin_cache(self.resolution_scope, ref)
--> 375 return url, self._remote_cache(url)
    376
    377 def resolve_from_url(self, url):

~/anaconda3/lib/python3.6/site-packages/jsonschema/validators.py in resolve_from_url(self, url)
    383 document = self.resolve_remote(url)
    384 except Exception as exc:
--> 385 raise RefResolutionError(exc)
    386
    387 return self.resolve_fragment(document, fragment)

RefResolutionError: Expecting value: line 1 column 1 (char 0)
```
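
The `resolve` frame in the traceback above joins the reference onto the current resolution scope and then fetches the resulting URL. A minimal sketch of that join with the standard library may help explain why a bare `"date-variable"` ends up as a remote request. The base URL and the `#/definitions/date-variable` pointer below are hypothetical placeholders, not values taken from the CSL schema.

```python
from urllib.parse import urljoin

# Hypothetical resolution scope; the real one depends on the schema being validated.
base = "https://example.org/schemas/csl-data.json"

# A bare name is treated as a relative URL, so it replaces the last path segment.
bare = urljoin(base, "date-variable")

# A fragment-only reference (hypothetical pointer) stays inside the same document.
fragment = urljoin(base, "#/definitions/date-variable")

print(bare)
print(fragment)
```

If the joined URL does not serve JSON, a remote fetch of it would plausibly fail with an "Expecting value" decode error like the one reported here.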
dhimmel commented 5 years ago

One workaround is to dereference the schema before feeding it to jsonschema. I was able to do this with the jsonref package:

import jsonref
schema = jsonref.load_uri(url, jsonschema=True)
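
For readers who would rather not add a dependency, the same idea can be sketched by hand: inline each bare-name reference before validating. This is only an illustration and not how jsonref works internally. The `inline_refs` helper, the toy schema, and the assumption that reference targets live under a top-level `definitions` key are all hypothetical.

```python
import copy


def inline_refs(node, definitions):
    """Return a copy of node with {"$ref": name} replaced by definitions[name]."""
    if isinstance(node, dict):
        ref = node.get("$ref")
        if isinstance(ref, str) and ref in definitions:
            # Recurse so references nested inside the target are inlined too.
            # Caveat: this simple version does not terminate on cyclic references.
            return inline_refs(copy.deepcopy(definitions[ref]), definitions)
        return {key: inline_refs(value, definitions) for key, value in node.items()}
    if isinstance(node, list):
        return [inline_refs(item, definitions) for item in node]
    return node


# Toy schema loosely modeled on the excerpt in this issue (hypothetical layout).
toy_schema = {
    "type": "object",
    "properties": {"issued": {"$ref": "date-variable"}},
    "definitions": {
        "date-variable": {
            "type": "object",
            "properties": {"date-parts": {"type": "array"}},
        }
    },
}

flat = inline_refs(toy_schema, toy_schema["definitions"])
print(flat["properties"]["issued"])
```

The flattened schema contains no `$ref` keys, so a validator never needs to resolve anything remotely.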
Julian commented 5 years ago

This is almost certainly one of the other two issues there.

Closing this one, since it's tied up in all the other unrelated stuff, but feel free to follow those.